Research on perceptual fusion of audio and video based on deep learning
10 November 2020
Qing An, Yanhua Chen, Shusen Wu
Proceedings Volume 11584, 2020 International Conference on Image, Video Processing and Artificial Intelligence; 115840F (2020) https://doi.org/10.1117/12.2579682
Event: Third International Conference on Image, Video Processing and Artificial Intelligence, 2020, Shanghai, China
Abstract
To address the technical problems in perceiving and fusing unstructured data such as audio and video, deep learning is used to localize sound sources in video at the pixel level. The method exploits the synchronization of audio and video content in the physical scene and analyzes sound and image jointly. In addition, to address the low resolution and poor quality of face images in video, a recognition method based on face super-resolution is proposed to compute the identity attributes of the target person.
© (2020) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Qing An, Yanhua Chen, and Shusen Wu "Research on perceptual fusion of audio and video based on deep learning", Proc. SPIE 11584, 2020 International Conference on Image, Video Processing and Artificial Intelligence, 115840F (10 November 2020); https://doi.org/10.1117/12.2579682
KEYWORDS
Video
Image segmentation
Facial recognition systems
Detection and tracking algorithms
Image fusion
Image resolution
Information fusion
