3D hand keypoint prediction is a fundamental task in human-computer interaction. In this paper, we present an approach to predicting 3D hand keypoints from a single RGB image. Single RGB images are ubiquitous in daily life, but predicting 3D hand keypoints from them is challenging due to depth ambiguity and occlusion. To address these challenges, we exploit deep neural networks. Most existing methods that predict 3D hand keypoints from a single RGB image separate the task into three stages: hand detection, 2D hand keypoint estimation, and 3D hand keypoint prediction. We follow this pipeline and focus on the latter two stages, improving an existing deep-network-based technique to obtain better results. Specifically, we combine convolution and deconvolution networks to obtain pixel-wise estimates of the 2D hand keypoints, and propose a new loss function to predict 3D hand keypoints from the 2D keypoints. We evaluate our network on several public datasets and obtain better results than several competing methods. Ablation studies further validate our design.
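The abstract only outlines the pipeline, so the following is a minimal sketch of the kind of architecture it describes: a convolution/deconvolution network that produces one pixel-wise heatmap per keypoint, followed by a small regressor that lifts 2D keypoints to 3D. The class names (ConvDeconvHeatmapNet, Lifter2Dto3D), the layer sizes, the 21-keypoint convention, and the plain MSE losses are all illustrative assumptions and do not reproduce the paper's actual network or its proposed loss function.

```python
import torch
import torch.nn as nn

NUM_KEYPOINTS = 21  # common hand-skeleton convention; assumed, not from the paper


class ConvDeconvHeatmapNet(nn.Module):
    """Minimal convolution/deconvolution network producing one heatmap per keypoint."""

    def __init__(self, num_keypoints=NUM_KEYPOINTS):
        super().__init__()
        # Encoder: strided convolutions downsample the RGB image and extract features.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder: deconvolutions upsample back to the input resolution,
        # yielding a pixel-wise score map for each keypoint.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, num_keypoints, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


class Lifter2Dto3D(nn.Module):
    """Fully connected regressor lifting 2D keypoint coordinates to 3D."""

    def __init__(self, num_keypoints=NUM_KEYPOINTS):
        super().__init__()
        self.num_keypoints = num_keypoints
        self.mlp = nn.Sequential(
            nn.Linear(num_keypoints * 2, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, num_keypoints * 3),
        )

    def forward(self, kp2d):
        # kp2d: (batch, num_keypoints, 2) -> (batch, num_keypoints, 3)
        return self.mlp(kp2d.flatten(1)).view(-1, self.num_keypoints, 3)


def keypoint_losses(pred_heatmaps, gt_heatmaps, pred_3d, gt_3d):
    """Placeholder training objective: MSE on heatmaps plus MSE on 3D coordinates.
    The paper's proposed loss is not reproduced here."""
    heatmap_loss = nn.functional.mse_loss(pred_heatmaps, gt_heatmaps)
    loss_3d = nn.functional.mse_loss(pred_3d, gt_3d)
    return heatmap_loss + loss_3d
```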