Paper
12 October 2022 Improved video classification method based on non-parametric attention combined with self-supervision
Xuchao Gong, Zongmin Li
Author Affiliations +
Proceedings Volume 12342, Fourteenth International Conference on Digital Image Processing (ICDIP 2022); 123421T (2022) https://doi.org/10.1117/12.2643038
Event: Fourteenth International Conference on Digital Image Processing (ICDIP 2022), 2022, Wuhan, China
Abstract
It is worth mentioning that in the video sequence modeling, the best recognition architecture is transformer. The current popular transformer based video classification methods focus on the importance of current features in time sequence. The degree of characterization of simultaneous order is insufficient, and simple data augmentation has unstable classification effect. In this paper we proposed a method of non-parametric attention combined with self-supervised feature construction to further improve video classification. In this method, the non-parametric attention mechanism is constructed in the simultaneous order feature to fit the multi-local extreme value distribution. At the same time, in the process of model learning, the input video is randomly masked in temporal domain and spatial domain, and self-supervised information is added to effectively learn the details and classification information of video content. Experiments using kinetics400, kinetics600 and something V2 datasets show that the algorithm in this paper has better improvement in accuracy than the current optimal method.
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xuchao Gong and Zongmin Li "Improved video classification method based on non-parametric attention combined with self-supervision", Proc. SPIE 12342, Fourteenth International Conference on Digital Image Processing (ICDIP 2022), 123421T (12 October 2022); https://doi.org/10.1117/12.2643038
Advertisement
Advertisement
RIGHTS & PERMISSIONS
Get copyright permission  Get copyright permission on Copyright Marketplace
KEYWORDS
Video

Transformers

Performance modeling

Back to Top