With the emergence of Neural Radiance Fields (NeRF), arbitrary view synthesis has made significant progress. However, most existing methods perform well only with low-resolution inputs; as the input resolution increases, they usually produce blurred synthesized views and incur high memory footprints, especially for dynamic scenes. To address this, this paper proposes a framework that builds a super-resolution dynamic NeRF for high-resolution arbitrary view rendering. Specifically, we first use a dynamic NeRF with the HexPlane representation to learn a low-resolution neural model of a dynamic scene, which can synthesize low-resolution images from arbitrary views and times. Then, a spatiotemporally consistent super-resolution module reconstructs high-resolution synthesized views. The module is trained in stages so that the model learns to perceive local geometric context and to process fine detail. Experimental results demonstrate that our method can effectively generate high-quality super-resolution images from arbitrary viewpoints and times in dynamic scenes.
The three-dimensional extension of high-efficiency video coding (3D-HEVC) is the latest coding standard for 3D video. Coding the depth map in 3D-HEVC is very time-consuming. With the development of deep learning, it has become feasible to employ convolutional neural networks (CNNs) to predict the coding unit (CU) division of the depth map. However, CUs come in three sizes (64, 32, and 16), which makes it difficult to use a single unified model. In addition, the features of the depth map are very different from those of the texture map. To address these problems, we propose an adaptive CU size CNN for fast 3D-HEVC depth map intracoding. We first employ spatial pyramid pooling to fully extract the features of the three CU sizes. Then, we apply a nonlocal self-attention mechanism to adapt the network to depth maps. Compared with the 3D-HEVC reference algorithm, the proposed network reduces the coding time by an average of 35.7%, while the quality degradation of the synthesized virtual view is negligible.
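To illustrate why spatial pyramid pooling lets one model accept all three CU sizes, the following is a minimal sketch in plain Python, not the network described in the abstract. The pyramid levels (1, 2, 4), the use of max pooling, and the names `spatial_pyramid_pool` and `make_block` are assumptions chosen for illustration. In a real CNN the pooling would be applied to each channel of a convolutional feature map rather than to raw samples.

```python
def spatial_pyramid_pool(block, levels=(1, 2, 4)):
    """Max-pool a square 2D block into a fixed-length feature vector.

    `levels` (assumed here) gives the number of bins per side at each
    pyramid level, so the output length is sum(b * b for b in levels)
    regardless of the block size.
    """
    size = len(block)
    features = []
    for bins in levels:
        for by in range(bins):
            # Integer bin boundaries along the vertical axis.
            y0, y1 = by * size // bins, (by + 1) * size // bins
            for bx in range(bins):
                # Integer bin boundaries along the horizontal axis.
                x0, x1 = bx * size // bins, (bx + 1) * size // bins
                features.append(
                    max(block[y][x] for y in range(y0, y1) for x in range(x0, x1))
                )
    return features


def make_block(size):
    """Build a synthetic size x size block (hypothetical stand-in for a CU)."""
    return [[(x + y) % 256 for x in range(size)] for y in range(size)]


# Every CU size maps to the same output length: 1 + 4 + 16 = 21.
lengths = {size: len(spatial_pyramid_pool(make_block(size))) for size in (64, 32, 16)}
print(lengths)  # {64: 21, 32: 21, 16: 21}
```

Because the output length depends only on the pyramid levels, the layers that follow the pooling stage can be shared across 64, 32, and 16 CUs.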