Reinforcement learning transformer for image captioning generation model
7 June 2023
Zhaojie Yan
Proceedings Volume 12701, Fifteenth International Conference on Machine Vision (ICMV 2022); 127010L (2023) https://doi.org/10.1117/12.2680670
Event: Fifteenth International Conference on Machine Vision (ICMV 2022), 2022, Rome, Italy
Abstract
Image captioning combines computer vision and natural language processing, and the transformer framework has become the mainstream approach to it. This paper combines reinforcement learning with the transformer to apply reward dynamics backpropagation and normalization in the testing phase. Its distinguishing characteristic is that, as the number of reinforcement learning steps increases, the agent model gains more knowledge of the full information, which reduces the computing cost of the system. Experimental results show that the reinforcement transformer structure achieves some improvement in speed.
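The abstract does not specify how rewards are normalized or backpropagated, so the following is only a generic illustration of the idea, not the paper's method. It is a minimal sketch of a REINFORCE-style surrogate loss with batch-normalized rewards, assuming each sampled caption has a summed token log-probability and a scalar sentence-level reward. The function names `normalize_rewards` and `reinforce_loss` and all numeric values are hypothetical.

```python
import math


def normalize_rewards(rewards, eps=1e-8):
    """Shift rewards to zero mean and scale them to unit standard deviation."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


def reinforce_loss(log_probs, rewards):
    """REINFORCE surrogate loss: negative mean of (normalized reward * log-probability)."""
    advantages = normalize_rewards(rewards)
    weighted = [a * lp for a, lp in zip(advantages, log_probs)]
    return -sum(weighted) / len(weighted)


# Hypothetical batch (assumed values, not from the paper):
# summed token log-probabilities of three sampled captions,
# and a made-up sentence-level score for each caption.
log_probs = [-4.0, -6.0, -5.0]
rewards = [0.9, 0.3, 0.6]

loss = reinforce_loss(log_probs, rewards)
print(f"surrogate loss: {loss:.4f}")
```

In a real captioning model the log-probabilities would come from the transformer decoder, and minimizing this loss by gradient descent would raise the probability of captions whose reward is above the batch mean and lower it for those below.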
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Zhaojie Yan "Reinforcement learning transformer for image captioning generation model", Proc. SPIE 12701, Fifteenth International Conference on Machine Vision (ICMV 2022), 127010L (7 June 2023); https://doi.org/10.1117/12.2680670
KEYWORDS
Transformers, Visualization, Semantics, Image processing, Information visualization, Visual process modeling, Image retrieval
