Image captioning using relevance attention and ITEM encoding
13 December 2021
Hong Liang Zhang, Guang Ming Li
Proceedings Volume 12087, International Conference on Electronic Information Engineering and Computer Technology (EIECT 2021); 120871Z (2021) https://doi.org/10.1117/12.2624699
Event: International Conference on Electronic Information Engineering and Computer Technology (EIECT 2021), 2021, Kunming, China
Abstract
The image captioning task aims to generate a semantic description of an image. Current captioning algorithms have two problems: insufficient understanding of the available information during description generation, and insufficient study of the relationships between image features. To address these problems, we propose an image description method that uses a relevance attention mechanism and ITEM encoding. The model uses ITEM encoding to obtain text features that contain both contextual information and image information, and that therefore carry richer semantic information. The relevance attention module then captures the relevant features, that is, the correlations between image features and text semantic information. Finally, an LSTM decodes the relevant features to generate the image caption. We conduct experiments on the MS COCO and Flickr 30k data sets to verify the effectiveness of the model. The experimental results show that, compared with a baseline model using visual attention, our model improves significantly on various evaluation metrics.
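As a rough illustration of the attention step described in the abstract, the sketch below weights image region features by their relevance to a single text feature vector. It is not the authors' implementation: the scaled dot-product scoring, the function names `softmax` and `relevance_attention`, and the toy vectors are all assumptions made for illustration, and the text vector merely stands in for an ITEM-encoded feature whose construction the abstract does not specify.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def relevance_attention(image_feats, text_feat):
    """Weight image region features by their relevance to a text feature.

    image_feats: list of region vectors (each a list of floats)
    text_feat:   one vector of the same dimension; a hypothetical stand-in
                 for an ITEM-encoded text feature
    Relevance is scored with a scaled dot product, which is an assumption
    and not a detail taken from the paper.
    """
    dim = len(text_feat)
    # One relevance score per image region.
    scores = [
        sum(r * t for r, t in zip(region, text_feat)) / math.sqrt(dim)
        for region in image_feats
    ]
    # Normalize the scores into attention weights that sum to 1.
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the region features.
    context = [
        sum(w * region[d] for w, region in zip(weights, image_feats))
        for d in range(dim)
    ]
    return weights, context

# Toy inputs: three 2-dimensional "regions" and one text feature.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text = [2.0, 0.0]
weights, context = relevance_attention(regions, text)
```

In a full captioning model, a context vector of this kind would be passed to the LSTM decoder at each step. Here the regions aligned with the text feature (the first and third) receive larger weights than the second.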
© (2021) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Hong Liang Zhang and Guang Ming Li "Image captioning using relevance attention and ITEM encoding", Proc. SPIE 12087, International Conference on Electronic Information Engineering and Computer Technology (EIECT 2021), 120871Z (13 December 2021); https://doi.org/10.1117/12.2624699
KEYWORDS
Computer programming

Data modeling

Visualization

Information visualization

Image processing

Visual process modeling

Convolutional neural networks
