7 August 2018 Toward three-dimensional human action recognition using a convolutional neural network with correctness-vigilant regularizer
Jun Ren, Napoleon Reyes, Andre Barczak, Chris Scogings, Mingzhe Liu
Funded by: Chinese Scholarship Council, Youth Innovation Research Team of Sichuan Province
Abstract
Human action recognition is one of the raisons d’être of human–computer interaction research, as it is vital to meeting the demands of modern society, such as automatic video surveillance for security, patient monitoring for recovery, and content-based video retrieval. In line with this, deep learning systems are fast becoming the de facto standard for object recognition, video understanding, and pattern recognition, owing to their powerful ability to learn features from vast amounts of data. It makes sense to capitalize on their success and to improve them further for the complex task of action recognition. One contribution of this paper is a simple yet effective method for encoding the spatiotemporal information from skeleton sequences into what we call temporal kinematic images. In this input encoding scheme, we embed various geometric relational features derived from the skeleton sequence in the form of our proposed skeletal optical flows (SOFs). SOFs collectively represent the variations of kinetic energy, angles between limbs, and pair-wise displacements between joints over consecutive frames of skeleton data as color variations in the temporal kinematic images. Another contribution is our convolutional neural network with a correctness-vigilant regularizer, which is employed to exploit the discriminative features of the temporal kinematic images for human action recognition. Lastly, we investigate an adaptive label smoothing technique applied toward the end of the training iterations.
Empirical results show that the proposed method is superior to existing works in terms of the generalizability of the generated model, training convergence speed, and the resulting classification accuracy on nine popular benchmarking datasets: MHAD, MSR Activity 3D, HDM05, and MSR Daily Activity 3D, as well as the more recent and challenging UTKinect-Action, NTU RGB+D, Northwestern-UCLA, UWA3DII, and SBU Kinect Interaction datasets.
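The abstract does not give the exact construction of the temporal kinematic images, so the following is only a rough sketch of the general idea for one of the three feature types it names, the pair-wise displacements between joints over consecutive frames. The function names `pairwise_displacements` and `encode_sequence`, the layout (rows are joint pairs, columns are frame transitions), and the per-channel min-max scaling to 0..255 are all assumptions made for illustration and are not taken from the paper. Kinetic energy and limb angles are not covered.

```python
from itertools import combinations


def pairwise_displacements(frame):
    """Return the (dx, dy, dz) offset for every unordered joint pair in one frame."""
    return [
        tuple(b[k] - a[k] for k in range(3))
        for a, b in combinations(frame, 2)
    ]


def encode_sequence(frames):
    """Encode a skeleton sequence as an RGB grid (illustrative layout only).

    frames: list of frames, each a list of at least two (x, y, z) joints.
    Returns rows = joint pairs, columns = transitions between consecutive
    frames, with each pixel an (R, G, B) tuple of integers in 0..255.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames to form a temporal difference")

    disp = [pairwise_displacements(f) for f in frames]

    # Change of each pair's displacement between consecutive frames.
    deltas = [
        [tuple(c[k] - p[k] for k in range(3)) for p, c in zip(prev, cur)]
        for prev, cur in zip(disp, disp[1:])
    ]

    # Assumed normalization: min-max scaling per channel over the whole sequence.
    lo = [min(d[k] for col in deltas for d in col) for k in range(3)]
    hi = [max(d[k] for col in deltas for d in col) for k in range(3)]

    def scale(value, k):
        span = hi[k] - lo[k]
        return 0 if span == 0 else round(255 * (value - lo[k]) / span)

    n_pairs = len(deltas[0])
    return [
        [tuple(scale(deltas[t][p][k], k) for k in range(3)) for t in range(len(deltas))]
        for p in range(n_pairs)
    ]


# Toy sequence: 3 joints over 3 frames (made-up coordinates).
frames = [
    [(0, 0, 0), (1, 0, 0), (0, 1, 0)],
    [(0, 0, 0), (2, 0, 0), (0, 1, 0)],
    [(0, 0, 0), (2, 0, 0), (0, 3, 0)],
]
image = encode_sequence(frames)
```

With this layout, a sequence of T frames and J joints yields an image of J(J-1)/2 rows by T-1 columns, which could then be resized and fed to a convolutional network. Whether the authors normalize per channel, per sequence, or across the dataset is not stated in the abstract.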
© 2018 SPIE and IS&T 1017-9909/2018/$25.00
Jun Ren, Napoleon Reyes, Andre Barczak, Chris Scogings, and Mingzhe Liu "Toward three-dimensional human action recognition using a convolutional neural network with correctness-vigilant regularizer," Journal of Electronic Imaging 27(4), 043040 (7 August 2018). https://doi.org/10.1117/1.JEI.27.4.043040
Received: 24 April 2018; Accepted: 16 July 2018; Published: 7 August 2018
CITATIONS
Cited by 3 scholarly publications and 3 patents.
KEYWORDS
Data modeling
3D modeling
RGB color model
Kinematics
Databases
Motion models
Convolutional neural networks