Learning discriminative common alignments for cross-modal retrieval
Hui Liu, Xiao-Ping Chen, Rui Hong, Yan Zhou, Tian-Cai Wan, Tai-Li Bai
Abstract

Cross-modal retrieval aims to find alignment relationships between different modalities and then compute semantic similarities for ranking. Because of the difference in data distribution and the inherent heterogeneity gap between modalities, a classic solution is to learn common representations in a common space, which preserves the discrimination among samples from different categories and alleviates the cross-modal discrepancy. To achieve this, we propose a method, termed LDCA, that learns discriminative common alignments based on the modal representations. LDCA utilizes a modality-invariance loss that pushes away the hardest negative sample to further reduce the cross-modal discrepancy at the feature level. In addition, LDCA seeks alignments in the label space to improve intra-modal discrimination through an effective cross-modal label loss. Extensive experiments on five widely used cross-modal datasets demonstrate the superiority of LDCA, and comprehensive analyses verify the effectiveness of the method.
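To make the two objectives above concrete, the following is a minimal PyTorch sketch, not the authors' implementation: the function names, the margin value, and the shared linear classifier are illustrative assumptions.

import torch
import torch.nn.functional as F

def modality_invariance_loss(img_emb, txt_emb, margin=0.2):
    # Triplet-style loss in the common space: matched image-text pairs lie
    # on the diagonal of the similarity matrix; the hardest (most similar)
    # mismatched sample is pushed at least `margin` below its positive.
    sim = F.normalize(img_emb, dim=1) @ F.normalize(txt_emb, dim=1).T
    pos = sim.diag()                                  # matched-pair similarities
    mask = torch.eye(sim.size(0), dtype=torch.bool)
    neg = sim.masked_fill(mask, float("-inf"))        # exclude positives
    hardest_i2t = neg.max(dim=1).values               # hardest text per image
    hardest_t2i = neg.max(dim=0).values               # hardest image per text
    return (F.relu(margin + hardest_i2t - pos) +
            F.relu(margin + hardest_t2i - pos)).mean()

def cross_modal_label_loss(img_logits, txt_logits, labels):
    # Alignment in the label space: a shared classifier should predict
    # the same category from either modality's representation.
    return F.cross_entropy(img_logits, labels) + F.cross_entropy(txt_logits, labels)

# Usage on random data: a batch of 8 paired 128-d embeddings, 10 classes.
img, txt = torch.randn(8, 128), torch.randn(8, 128)
labels = torch.randint(0, 10, (8,))
classifier = torch.nn.Linear(128, 10)  # shared across modalities (assumption)
loss = (modality_invariance_loss(img, txt)
        + cross_modal_label_loss(classifier(img), classifier(txt), labels))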

© 2024 SPIE and IS&T
Hui Liu, Xiao-Ping Chen, Rui Hong, Yan Zhou, Tian-Cai Wan, and Tai-Li Bai "Learning discriminative common alignments for cross-modal retrieval," Journal of Electronic Imaging 33(2), 023022 (14 March 2024). https://doi.org/10.1117/1.JEI.33.2.023022
Received: 14 November 2023; Accepted: 11 January 2024; Published: 14 March 2024
KEYWORDS
Education and training, Semantics, Feature extraction, Visualization, Ablation, Design, Multimedia