Bi-direction co-attention network on visual question answering for blind people

Tung Le; Thong Bui; Huy Tien Nguyen; Minh Le Nguyen

doi:10.1117/12.2623596

4 March 2022

Proceedings Volume 12084, Fourteenth International Conference on Machine Vision (ICMV 2021); 1208416 (2022) https://doi.org/10.1117/12.2623596
Event: Fourteenth International Conference on Machine Vision (ICMV 2021), 2021, Rome, Italy

Abstract

People with visual impairments, especially blind people, need support from advanced technologies to help them understand image content and answer questions about it. In the multi-modal area, Visual Question Answering (VQA) is a notable cutting-edge task that requires combining images and text via a co-attention mechanism. Inspired by the Deep Co-attention Layer, we propose a Bi-direction Co-Attention VT-Transformer network that learns visual and textual features jointly. In our system, the relationships and interactions between the objects of the two modalities are captured and combined in a meaningful space. In addition, using the Transformer architecture consistently in both the feature extractor and the multi-modal attention function is efficient enough to reduce the number of attention layers as well as the computation cost. In experiments and ablation studies, our model achieves promising performance against existing approaches and against a uni-direction mechanism on the VizWiz-VQA 2020 dataset for blind people.
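To make the idea of attending in both directions concrete, the following is a minimal sketch in plain Python, not the authors' VT-Transformer. It assumes that visual region features and question token features already share the same dimension, and it omits the learned query/key/value projections, multiple heads, and stacked layers that a real Transformer would use. All function names (`cross_attention`, `bidirectional_co_attention`, `fuse`) and the mean-pool-and-concatenate fusion step are hypothetical choices made for illustration only.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product attention without learned projections (an assumption).

    Each row of `queries` attends over all rows of `keys_values` and is
    replaced by a weighted average of those rows.
    """
    d = len(queries[0])
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        attended.append([sum(w * kv[j] for w, kv in zip(weights, keys_values))
                         for j in range(d)])
    return attended


def bidirectional_co_attention(visual, textual):
    """Run attention in both directions between the two modalities."""
    attended_visual = cross_attention(visual, textual)   # image regions query the question
    attended_textual = cross_attention(textual, visual)  # question tokens query the image
    return attended_visual, attended_textual


def mean_pool(features):
    """Average a list of feature vectors into one vector."""
    n = len(features)
    return [sum(row[j] for row in features) / n for j in range(len(features[0]))]


def fuse(visual, textual):
    """Hypothetical fusion: pool each attended modality, then concatenate."""
    attended_visual, attended_textual = bidirectional_co_attention(visual, textual)
    return mean_pool(attended_visual) + mean_pool(attended_textual)


if __name__ == "__main__":
    random.seed(0)
    # Toy inputs: 4 image regions and 3 question tokens, each an 8-dim vector.
    visual = [[random.uniform(-1, 1) for _ in range(8)] for _ in range(4)]
    textual = [[random.uniform(-1, 1) for _ in range(8)] for _ in range(3)]
    fused = fuse(visual, textual)
    print(len(fused))  # 16: two pooled 8-dim vectors concatenated
```

A uni-direction variant, in this simplified picture, would keep only one of the two `cross_attention` calls, so that only one modality is updated with information from the other.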

Citation

Tung Le, Thong Bui, Huy Tien Nguyen, and Minh Le Nguyen "Bi-direction co-attention network on visual question answering for blind people", Proc. SPIE 12084, Fourteenth International Conference on Machine Vision (ICMV 2021), 1208416 (4 March 2022); https://doi.org/10.1117/12.2623596
