KEYWORDS: Visualization, RGB color model, Visual process modeling, Data modeling, Digital signal processing, Classification systems, Video, Performance modeling, Iterated function systems, Network architectures
Zero-shot learning (ZSL) has recently attracted increasing attention in visual tasks such as action recognition. We propose a spatiotemporal visual-semantic embedding network (STVSEM) for zero-shot action recognition. First, because action recognition algorithms based on the two-stream architecture have achieved excellent results in recent years, we build a two-stream module into our network: it uses spatial features (e.g., RGB appearance) together with optical flow in the time domain as visual features, which improves the expressive power of the visual representation. Second, to alleviate the semantic loss that typically occurs in embedding-based ZSL methods, an autoencoder is introduced to obtain a better semantic representation and to supplement the semantic relationship information of unseen classes with that of seen classes. Finally, a joint embedding mechanism that explores and exploits the relationships between visual data and semantic information in an intermediate space is employed to narrow the gap between vision and semantics. Experimental results on the Charades and UCF101 datasets indicate that the proposed method outperforms state-of-the-art methods in accuracy, demonstrating its effectiveness.
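The general idea of embedding-based zero-shot prediction with two visual streams can be illustrated with a minimal sketch. This is not the STVSEM implementation: the function names, the projection matrix `W_visual`, the toy feature vectors, and the class embeddings are all hypothetical, the projection is hand-picked rather than learned, and the autoencoder stage is omitted. It only shows how concatenated appearance and flow features might be projected into a shared space and matched against semantic embeddings of unseen classes.

```python
import math

def concat_streams(rgb_feat, flow_feat):
    """Fuse the spatial (RGB) and temporal (optical flow) features by concatenation."""
    return list(rgb_feat) + list(flow_feat)

def project(matrix, vec):
    """Apply a linear map (list of rows) to a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_unseen(rgb_feat, flow_feat, W_visual, class_embeddings):
    """Return the unseen class whose semantic embedding is closest to the projected video."""
    z = project(W_visual, concat_streams(rgb_feat, flow_feat))
    return max(class_embeddings, key=lambda name: cosine(z, class_embeddings[name]))

# Toy values, assumed for illustration only.
W_visual = [
    [1.0, 0.0, 0.0, 0.0],  # first shared dimension reads the first RGB feature
    [0.0, 0.0, 1.0, 0.0],  # second shared dimension reads the first flow feature
]
class_embeddings = {
    "juggling": [1.0, 0.1],
    "rowing": [0.1, 1.0],
}

label = predict_unseen([0.9, 0.2], [0.1, 0.3], W_visual, class_embeddings)
print(label)
```

In a trained system the projection would be learned on seen classes only, and the class embeddings would come from a semantic source such as word vectors or attributes; the nearest-embedding rule at test time is what allows classes without training videos to be predicted.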