Deep Neural Networks (DNNs) have emerged as a powerful tool for human action recognition, yet their reliance on vast amounts of high-quality labeled data poses significant challenges. A promising alternative is to train the network on synthetic data. However, existing synthetic data generation pipelines require complex simulation environments. Our solution bypasses this requirement by employing Generative Adversarial Networks (GANs) to generate synthetic data from only a small existing real-world dataset.
Our training pipeline extracts the motion from each training video and augments it across the various subject appearances within the training set. This increases diversity in both motion and subject representation, significantly enhancing the model's performance. We present a rigorous evaluation under diverse scenarios, including ground and aerial views, and analyze critical factors that influence human action recognition performance, such as gesture motion diversity and subject appearance.
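The abstract describes a cross-pairing of extracted motions with other subjects' appearances. The following Python sketch illustrates only that pairing logic; `extract_motion` and `MotionTransferGAN` are hypothetical placeholders for the pose-extraction and GAN-based motion-transfer components, which the abstract does not specify.

```python
# Minimal sketch of cross-subject motion augmentation (hypothetical components).
from dataclasses import dataclass
from itertools import product
from typing import List


@dataclass
class TrainingVideo:
    frames: list       # raw RGB frames
    action_label: str  # gesture/action class
    subject_id: str    # identity of the performer


def extract_motion(video: TrainingVideo) -> list:
    """Placeholder: return a per-frame pose/keypoint sequence."""
    return [f"pose_{i}" for i, _ in enumerate(video.frames)]


class MotionTransferGAN:
    """Placeholder for a GAN that renders a target subject performing a given motion."""

    def synthesize(self, motion: list, appearance_subject: str) -> list:
        return [f"{appearance_subject}:{p}" for p in motion]


def augment_dataset(videos: List[TrainingVideo], gan: MotionTransferGAN) -> List[TrainingVideo]:
    """Pair every extracted motion with every other subject appearance in the set."""
    subjects = {v.subject_id for v in videos}
    augmented = []
    for video, subject in product(videos, subjects):
        if subject == video.subject_id:
            continue  # the original motion/appearance pairing already exists as real data
        motion = extract_motion(video)
        synthetic_frames = gan.synthesize(motion, subject)
        # The synthetic clip inherits the source video's action label.
        augmented.append(TrainingVideo(synthetic_frames, video.action_label, subject))
    return augmented
```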
The maximum classifier discrepancy method has achieved great success in unsupervised domain adaptation for image classification in recent years. Its basic structure consists of a feature generator and two classifiers: the classifiers are trained to maximize their discrepancy on target samples, while the generator is trained to minimize it. This method improves on existing adversarial training methods by employing task-specific classifiers that remove the ambiguity in classifying target samples near the class boundaries. In this paper, we propose a modified network architecture and two training objectives to further boost the performance of the maximum classifier discrepancy method. The first training objective minimizes the feature-level discrepancy and forces the generator to produce domain-invariant features; it is particularly beneficial when the source and target domain distributions differ substantially. The second training objective, which operates at the mini-batch level, encourages a uniform distribution of target class predictions by maximizing the entropy of the expectation of the target class predictions. Extensive empirical evaluations show that the proposed architecture and training objectives significantly improve the performance of the original algorithm, and that the method also outperforms state-of-the-art techniques on most unsupervised domain adaptation tasks.
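As a rough illustration, the sketch below writes out the original classifier discrepancy together with the two proposed objectives in PyTorch. The exact feature-level discrepancy measure is not given in the abstract, so an L1 distance between batch-mean source and target features is used as a stand-in; the batch-entropy term follows the stated definition of maximizing the entropy of the batch-averaged target prediction. Function names are illustrative, not the authors' API.

```python
import torch
import torch.nn.functional as F


def classifier_discrepancy(logits1: torch.Tensor, logits2: torch.Tensor) -> torch.Tensor:
    """Original MCD discrepancy: mean absolute difference between the two
    classifiers' predicted class probabilities on target samples."""
    return (F.softmax(logits1, dim=1) - F.softmax(logits2, dim=1)).abs().mean()


def feature_discrepancy_loss(src_feats: torch.Tensor, tgt_feats: torch.Tensor) -> torch.Tensor:
    """First proposed objective: feature-level discrepancy between source and
    target batches. An L1 distance between batch-mean features is used here
    as a simple stand-in for the unspecified measure."""
    return (src_feats.mean(dim=0) - tgt_feats.mean(dim=0)).abs().mean()


def batch_entropy_loss(tgt_logits: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Second proposed objective: negative entropy of the batch-averaged target
    class prediction. Minimizing this maximizes H(E[p(y|x_t)]), pushing the
    mini-batch class distribution toward uniform."""
    p = F.softmax(tgt_logits, dim=1)   # per-sample class probabilities
    p_mean = p.mean(dim=0)             # expectation over the mini-batch
    entropy = -(p_mean * torch.log(p_mean + eps)).sum()
    return -entropy                    # maximizing entropy == minimizing its negative
```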