29 July 2024 A survey of multimodal retrieval: from traditional analytics to deep learning
Gang Cheng, Yaxi Wu, Desheng Cao, Qinliang You, Ming Yang, Ziyi Wang
Proceedings Volume 13214, Fourth International Conference on Digital Signal and Computer Communications (DSCC 2024); 132141J (2024) https://doi.org/10.1117/12.3033335
Event: Fourth International Conference on Digital Signal and Computer Communications (DSCC 2024), 2024, Guangzhou, China
Abstract
With the rapid growth of multimedia data, multimodal retrieval has become an important research field. Multimodal retrieval is a retrieval task that involves multiple media types, such as text, images, and audio. As data from many fields floods onto the Internet in forms such as video and pictures, single-modality retrieval can no longer meet information needs, and the demand for multimodal data retrieval is increasing. To address this problem, this paper surveys the literature and analyzes the research methods used in multimodal retrieval, including shared representation learning, deep learning-based multimodal fusion, and hashing methods, and summarizes the basic ideas researchers have used to solve these problems. Finally, the future development directions and application prospects of multimodal retrieval are discussed. This paper aims to provide reference and inspiration for subsequent research on and application of multimodal retrieval, and to promote the development of multimodal retrieval technology.
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Gang Cheng, Yaxi Wu, Desheng Cao, Qinliang You, Ming Yang, and Ziyi Wang "A survey of multimodal retrieval: from traditional analytics to deep learning", Proc. SPIE 13214, Fourth International Conference on Digital Signal and Computer Communications (DSCC 2024), 132141J (29 July 2024); https://doi.org/10.1117/12.3033335
Keywords
Deep learning, Data modeling, Binary data, Feature extraction, Neural networks, Principal component analysis, Convolution