Because micro-expressions are short in duration and low in intensity, efficient feature learning is a major challenge for robust facial micro-expression (ME) recognition. To obtain diverse features and a representation of their spatial relations, this paper proposes a simple yet effective micro-expression recognition method based on multiscale convolutional fusion and a capsule network (MCFCN). First, the apex frame of an ME clip is located by computing pixel differences between frames, and the apex frame is then processed by an optical flow operator. Second, a multiscale fusion module is introduced to capture diverse ME-related details. The resulting micro-expression features are then fed into a capsule network, which provides a good description of their spatial relations. Finally, the entire ME recognition model is trained and verified on three popular benchmarks (SAMM, SMIC, and CASME II) using the associated standard evaluation protocols. Experimental results show that the MCFCN-based method is superior to previous capsule-network-based approaches and other state-of-the-art CNN models.
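A minimal sketch of the preprocessing pipeline described above, assuming an ME clip is given as a list of grayscale NumPy frames. The exact difference criterion (here, largest absolute pixel difference from the onset frame) and OpenCV's Farneback operator are illustrative assumptions; the abstract does not fix the difference measure or the flow algorithm.

```python
import cv2
import numpy as np

def locate_apex_frame(frames):
    """Pick the frame whose pixel difference from the onset frame is largest."""
    onset = frames[0].astype(np.float32)
    diffs = [np.abs(f.astype(np.float32) - onset).sum() for f in frames]
    return int(np.argmax(diffs))

def onset_to_apex_flow(frames):
    """Dense optical flow between the onset frame and the located apex frame."""
    apex = frames[locate_apex_frame(frames)]
    return cv2.calcOpticalFlowFarneback(
        frames[0], apex, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)  # H x W x 2 (dx, dy)
```

The two-channel flow field produced here would then serve as input to the multiscale fusion module in place of raw frames.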
Micro-expressions (MEs), which reveal a person's genuine feelings and motives, attract considerable attention in the field of automatic affective recognition. The main challenges for robust micro-expression recognition (MER) stem from the short ME duration, the low intensity of facial muscle movements, and insufficient training samples. To meet these challenges, we propose an optical flow-based deep capsule adversarial domain adaptation network (DCADAN) for MER. To alleviate the negative impact of identity-related features, optical flow preprocessing is applied to encode the subtle facial motion information that is highly correlated with MEs. A deep capsule network is then developed to determine part-whole relationships on the optical flow features. To cope with the data deficiency and enhance generalization via domain adaptation, an adversarial discriminator module that enriches the available samples with macro-expression data is integrated into the capsule network, yielding an end-to-end deep network that trains efficiently. Finally, a simple yet efficient attention module is embedded in the DCADAN to adaptively aggregate the optical flow convolution maps into the primary capsule layers. We evaluate the entire network on the composite cross-database ME benchmark (3DB) using leave-one-subject-out cross-validation, with the unweighted F1-score (UF1) and unweighted average recall (UAR) as evaluation metrics. MER based on DCADAN achieves a UF1 of 0.801 and a UAR of 0.829, compared with a UF1 of 0.788 and a UAR of 0.782 for the best prior approach. The comprehensive experimental results show that incorporating adversarial domain adaptation into a capsule network is feasible and effective for learning discriminative ME features, and that the proposed model outperforms state-of-the-art deep learning networks for MER.
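For reference, a minimal sketch of the two evaluation metrics named above: UF1 is the unweighted (macro) average of the per-class F1 scores and UAR is the unweighted average of the per-class recalls, so neither is biased by class imbalance. scikit-learn computes both directly; the toy labels below are purely illustrative.

```python
from sklearn.metrics import f1_score, recall_score

def uf1_uar(y_true, y_pred):
    uf1 = f1_score(y_true, y_pred, average="macro")      # unweighted F1
    uar = recall_score(y_true, y_pred, average="macro")  # unweighted average recall
    return uf1, uar

# Toy example with the three 3DB classes (0=negative, 1=positive, 2=surprise):
print(uf1_uar([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0]))
```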
Micro-expressions, which reveal true emotions and motives, attract extraordinary attention in automatic facial micro-expression recognition (MER). The main challenge of MER is that no large-scale datasets are available to support deep learning training. To this end, this paper proposes an end-to-end transfer model for facial MER based on difference images. Compared with micro-expression datasets, macro-expression datasets contain more samples and are easier to use for training deep neural networks. Thus, we pre-train a ResNet-18 network on relatively large macro-expression datasets to obtain a well-initialized backbone module. Difference images based on an adaptively selected key frame are then computed to provide an MER-related feature representation as the module input. Finally, the preprocessed difference images are fed into the pre-trained ResNet-18 network for fine-tuning. The proposed method achieves recognition rates of 74.39% and 76.22% on the CASME II and SMIC databases, respectively. The experimental results show that the difference image between the onset and key frames improves transfer training performance on ResNet-18, and that the proposed MER method outperforms methods based on traditional hand-crafted features and deep neural networks.
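A minimal sketch of the fine-tuning step described above, assuming PyTorch and torchvision. Generic ImageNet weights stand in here for the paper's macro-expression pre-training, and the number of ME classes is a placeholder; the difference image is simply the key frame minus the onset frame.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

NUM_CLASSES = 5  # assumption: ME categories in the target dataset

model = resnet18(weights="IMAGENET1K_V1")                # pre-trained backbone
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # new classification head

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def fine_tune_step(onset, key, labels):
    """One fine-tuning step on a batch of difference images (N x 3 x H x W)."""
    optimizer.zero_grad()
    logits = model(key - onset)        # difference image as network input
    loss = criterion(logits, labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```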