Recently, region-based convolutional neural networks(R-CNNs) have achieved significant success in the field of object detection, but their accuracy is not too high for small objects and similar objects, such as the gestures. To solve this problem, we present an online hard example testing(OHET) technology to evaluate the confidence of the R-CNNs' outputs, and regard those outputs with low confidence as hard examples. In this paper, we proposed a cascaded networks to recognize the gestures. Firstly, we use the region-based fully convolutional neural network(R-FCN), which is capable of the detection for small object, to detect the gestures, and then use the OHET to select the hard examples. To enhance the accuracy of the gesture recognition, we re-classify the hard examples through VGG-19 classification network to obtain the final output of the gesture recognition system. Through the contrast experiments with other methods, we can see that the cascaded networks combined with the OHET reached to the state-of-the-art results of 99.3% mAP on small and similar gestures in complex scenes.
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks.
You are receiving this notice because your organization may not have SPIE eBooks access.*
*Shibboleth/Open Athens users─please
sign in
to access your institution's subscriptions.
To obtain this item, you may purchase the complete book in print or electronic format on
SPIE.org.
INSTITUTIONAL Select your institution to access the SPIE Digital Library.
PERSONAL Sign in with your SPIE account to access your personal subscriptions or to use specific features such as save to my library, sign up for alerts, save searches, etc.