15 August 2023 Scene text image correction based on transformer
Ming Jin, Xiaosheng Tu, Jinfeng Han, Congyan Chen
Proceedings Volume 12719, Second International Conference on Electronic Information Technology (EIT 2023); 127191Q (2023) https://doi.org/10.1117/12.2685812
Event: Second International Conference on Electronic Information Technology (EIT 2023), 2023, Wuhan, China
Abstract
In this work, we propose a Transformer-based framework, the Scene text image transformer (StTr), to correct the distortion of scene text images caused by shooting angles or lighting conditions. A feature extractor containing six convolution modules first extracts feature maps, which are then flattened and fed into the Transformer encoder and decoder. The resulting displacement matrix is used to perform geometric correction on the input image. We also use a Transformer to remove the lighting distortion and thereby improve OCR recognition accuracy. Evaluated on different datasets, StTr achieves a 16.73% improvement in accuracy over state-of-the-art methods. The model is also lightweight and efficient.
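The geometric correction step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the displacement matrix is applied, so everything here is an assumption for illustration: the function name `warp_with_displacement` is hypothetical, the displacement is taken to be a per-pixel `(dy, dx)` offset pointing from each output pixel to its source location (backward warping), and bilinear interpolation with clamped borders is used for sampling.

```python
import numpy as np


def warp_with_displacement(image, displacement):
    """Backward-warp a grayscale image with a per-pixel displacement field.

    image:        (H, W) float array.
    displacement: (H, W, 2) array; displacement[y, x] = (dy, dx) is the offset
                  from output pixel (y, x) to the source location to sample.
                  This convention is an assumption, not taken from the paper.
    """
    h, w = image.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Source coordinates for every output pixel, clamped to the image bounds.
    src_y = np.clip(ys + displacement[..., 0], 0, h - 1)
    src_x = np.clip(xs + displacement[..., 1], 0, w - 1)

    # Integer corners surrounding each source location.
    y0 = np.floor(src_y).astype(int)
    x0 = np.floor(src_x).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)

    # Fractional distances to the top-left corner, used as bilinear weights.
    wy = src_y - y0
    wx = src_x - x0

    # Interpolate along x on the two rows, then along y between them.
    top = (1 - wx) * image[y0, x0] + wx * image[y0, x1]
    bottom = (1 - wx) * image[y1, x0] + wx * image[y1, x1]
    return (1 - wy) * top + wy * bottom
```

Under these assumptions, a zero displacement field returns the input unchanged, and a constant `dx = 1` samples each output pixel from its right-hand neighbor. In a model such as the one described, the displacement field would come from the Transformer decoder rather than being set by hand.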
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ming Jin, Xiaosheng Tu, Jinfeng Han, and Congyan Chen "Scene text image correction based on transformer", Proc. SPIE 12719, Second International Conference on Electronic Information Technology (EIT 2023), 127191Q (15 August 2023); https://doi.org/10.1117/12.2685812
KEYWORDS
Transformers
Light sources and illumination
Optical character recognition
Performance modeling
Education and training
Data modeling
3D modeling
