This paper presents and studies an end-to-end Artificial Neural Network (ANN)-based compression framework leveraging bi-directional prediction. Like traditional hybrid codecs in Random Access configuration, this codec processes video sequences divided into Groups Of Pictures (GOPs), in which each frame can be encoded in Intra or Inter mode. Inter frames can be bi-predicted, i.e. predicted from previously decoded frames that lie in the past and in the future in display order. The selection of the reference frames used for prediction is signaled within the bitstream, allowing for efficient hierarchical GOP temporal networks. In particular, we study the benefits of optimizing the compression of the motion information and prediction residuals using dedicated auto-encoder models whose layers are conditioned on the GOP structure. The network is trained fully end-to-end from scratch. The resulting increase in compression efficiency shows the promise of conditional convolution for bi-directional inter coding.
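To make the hierarchical GOP idea concrete, the sketch below enumerates one possible coding order for a single GOP, together with the reference frames each Inter frame would use. It is only an illustration under stated assumptions, not the scheme used in the paper: the names `CodedFrame` and `hierarchical_gop` are hypothetical, the dyadic (midpoint) split is an assumption, and so is the choice to code the last frame of the GOP as a uni-predicted Inter frame. The `level` field is the kind of GOP-structure information that could, in principle, serve as a conditioning signal.

```python
from typing import List, NamedTuple, Tuple


class CodedFrame(NamedTuple):
    """One frame of the GOP, listed in coding order (hypothetical structure)."""
    index: int              # position of the frame in display order
    mode: str               # "intra" or "inter"
    refs: Tuple[int, ...]   # display indices of the reference frames
    level: int              # temporal level in the hierarchy (0 = GOP anchors)


def hierarchical_gop(gop_size: int) -> List[CodedFrame]:
    """Return the frames of one GOP in a dyadic hierarchical coding order.

    Assumptions (not taken from the paper): frame 0 is Intra, frame
    `gop_size` is Inter and predicted from frame 0 only, and every other
    frame is bi-predicted from the two closest frames already coded.
    """
    if gop_size < 1:
        raise ValueError("gop_size must be at least 1")

    order = [
        CodedFrame(0, "intra", (), 0),
        CodedFrame(gop_size, "inter", (0,), 0),
    ]

    def split(left: int, right: int, level: int) -> None:
        # Stop when no frame remains strictly between the two references.
        if right - left < 2:
            return
        mid = (left + right) // 2
        # The midpoint is bi-predicted from one past and one future frame.
        order.append(CodedFrame(mid, "inter", (left, right), level))
        split(left, mid, level + 1)
        split(mid, right, level + 1)

    split(0, gop_size, 1)
    return order


if __name__ == "__main__":
    for frame in hierarchical_gop(8):
        print(frame)
```

In this sketch every reference is coded before the frame that depends on it, which is why the coding order differs from the display order, and why the choice of references has to be signaled to the decoder.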