Capturing the hidden relationships in 2D pose sequences is crucial for accurate 3D human pose estimation (HPE). Recent studies have shown that frequency-domain information, independent of spatio-temporal information, is highly effective for representing pose sequences. However, few works have explored appropriate ways to fuse these different kinds of information. In this paper, we propose an alternating cyclic approach that fuses spatio-temporal and frequency-domain information to achieve accurate 3D human pose estimation. The designed alternating cyclic fusion network enables a more comprehensive integration of different features, leading to improved accuracy. By leveraging feature splitting and time-frequency convolution, features are processed more appropriately while the model is kept lightweight. Experimental results demonstrate that our approach achieves accuracy comparable to state-of-the-art methods while significantly outperforming mainstream methods in model compactness. In conclusion, introducing frequency-domain information is of great significance for pose estimation tasks.
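As a minimal illustration of the frequency-domain view mentioned above (not the paper's actual architecture), a 2D pose sequence can be transformed along the temporal axis with a real FFT, yielding per-joint spectra as an alternative representation; the sequence length and joint count below are assumptions for the example:

```python
import numpy as np

# Hypothetical 2D pose sequence: T frames, J joints, (x, y) coordinates.
T, J = 81, 17
pose_seq = np.random.randn(T, J, 2).astype(np.float32)

# Frequency-domain representation: real FFT along the temporal axis gives
# per-joint spectra, a view of the sequence independent of its raw
# spatio-temporal layout.
freq_feats = np.fft.rfft(pose_seq, axis=0)  # complex, shape (T//2 + 1, J, 2)
magnitude = np.abs(freq_feats)              # magnitude spectrum as features

print(magnitude.shape)  # (41, 17, 2)
```

Such spectral features could then be fused with spatio-temporal features; how that fusion alternates and cycles is the contribution of the network described above.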
KEYWORDS: 3D modeling, Image enhancement, Education and training, 3D image reconstruction, Feature extraction, Semantics, Modeling, Image processing, Point clouds
3D reconstruction from single hand-drawn sketches can be considered a single-view reconstruction task, which faces the great challenge of lifting the dimension of the geometric representation of objects. Most reconstruction networks are based on deep learning with supervised training and thus suffer from costly dataset labeling, while self-supervised sketch-based 3D reconstruction remains challenging. In this paper, we propose a self-supervised 3D reconstruction network for hand-drawn sketches (IASSReNet), which introduces image information as an auxiliary to address the ambiguity and sparsity of sketches. To obtain image information, an image generator is first designed to provide augmented information for the reconstruction through a sketch feature enhancement module. To integrate information from the sketch and the image, we use a spatially corresponding feature transfer module to fuse their features. Finally, silhouettes are obtained from the predicted 3D mesh, and similarity constraints against the sketch contour are applied to train the network in a self-supervised manner. Experimental results on multiple datasets show that our method outperforms other unsupervised methods and is competitive with some supervised methods.
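The self-supervised constraint described above compares a silhouette rendered from the predicted mesh against the sketch's contour. A minimal sketch of one such similarity term (an IoU-based loss on binary masks, assumed here for illustration; the paper's exact loss formulation is not specified in the abstract) could look like:

```python
import numpy as np

def silhouette_iou_loss(pred_sil: np.ndarray, sketch_mask: np.ndarray) -> float:
    """Return 1 - IoU between two binary masks; lower means more similar."""
    inter = np.logical_and(pred_sil, sketch_mask).sum()
    union = np.logical_or(pred_sil, sketch_mask).sum()
    iou = inter / union if union > 0 else 1.0
    return 1.0 - float(iou)

# Toy masks: identical 32x32 squares inside a 64x64 frame give zero loss.
a = np.zeros((64, 64), dtype=bool); a[16:48, 16:48] = True
b = np.zeros((64, 64), dtype=bool); b[16:48, 16:48] = True
print(silhouette_iou_loss(a, b))  # 0.0
```

In practice a differentiable renderer would produce `pred_sil` from the mesh so gradients can flow back to the network; the hard logical ops above are only a conceptual stand-in.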