Deep learning-based cross-modal steel plate image detection and recognition
7 March 2024
Bingkun Yang, Hongwei Wang, Xiaofeng Yue
Proceedings Volume 13085, MIPPR 2023: Automatic Target Recognition and Navigation; 130850G (2024) https://doi.org/10.1117/12.3000181
Event: Twelfth International Symposium on Multispectral Image Processing and Pattern Recognition (MIPPR2023), 2023, Wuhan, China
Abstract
Growth in computing power has driven rapid progress in artificial intelligence, and intelligent information technology is attracting increasing attention in the manufacturing industry. However, existing research gives little consideration to problems specific to manufacturing, mainly because datasets for rare scenarios are difficult to acquire. In steel plate sorting, corner point detection plays a crucial role in production efficiency, particularly when laser cutting causes steel plates to adhere to one another. Considering the scarcity of seam-cut steel plate image data and the strong generalization ability of the cross-modal model GLIP, this study applies cross-modal large models from other fields to this task. First, we established a steel plate dataset with corner point information and fine-tuned the GLIP model using weakly supervised learning. Then, the inference results of the large teacher model were used as inputs to the lightweight student model YOLOv8, forming a framework for lightweight industrial deployment. In our experiments, we first compared the effect of different amounts of data on the GLIP model and showed that the 20-shot model performs comparably to the full-shot model. In addition, YOLOv8 can recognize corner points that were neither manually annotated nor labeled by the GLIP model, demonstrating excellent generalization performance. Comparative verification showed the advantages of GLIP in terms of time consumption, volume of manually labeled data, and deployment scale. This study makes full use of a sparsely labeled dataset and cross-modal large models, integrating them with a lightweight object detection model to reduce labeling costs and improve production efficiency. Finally, we propose directions for future work.
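The teacher-to-student step described in the abstract can be illustrated with a minimal sketch. It assumes the teacher's detections are available as pixel-coordinate boxes with confidence scores and converts them into the normalized `class cx cy w h` text labels commonly used to train YOLO-family detectors. The function name `to_yolo_labels`, the detection tuple layout, and the `min_score` threshold are hypothetical and are not taken from the paper; the authors' actual pipeline may differ.

```python
from typing import List, Tuple

# Hypothetical teacher output layout (an assumption, not the paper's format):
# (x_min, y_min, x_max, y_max, score), with box coordinates in pixels.
Detection = Tuple[float, float, float, float, float]


def to_yolo_labels(
    detections: List[Detection],
    img_w: int,
    img_h: int,
    class_id: int = 0,
    min_score: float = 0.5,  # assumed threshold; the paper does not state one
) -> List[str]:
    """Turn teacher detections into YOLO-style pseudo-label lines."""
    lines = []
    for x1, y1, x2, y2, score in detections:
        # Drop low-confidence teacher predictions so they do not become labels.
        if score < min_score:
            continue
        # Clip the box to the image bounds.
        x1, x2 = max(0.0, min(x1, img_w)), max(0.0, min(x2, img_w))
        y1, y2 = max(0.0, min(y1, img_h)), max(0.0, min(y2, img_h))
        w, h = x2 - x1, y2 - y1
        # Skip boxes that are empty after clipping.
        if w <= 0 or h <= 0:
            continue
        # Normalize the box center and size by the image dimensions.
        cx = (x1 + x2) / 2 / img_w
        cy = (y1 + y2) / 2 / img_h
        lines.append(f"{class_id} {cx:.6f} {cy:.6f} {w / img_w:.6f} {h / img_h:.6f}")
    return lines


if __name__ == "__main__":
    teacher_boxes = [(10, 20, 30, 40, 0.9), (0, 0, 5, 5, 0.2)]
    for line in to_yolo_labels(teacher_boxes, img_w=100, img_h=100):
        print(line)
```

Each returned line would be written to a per-image `.txt` label file for student training. Filtering by score is one simple way to limit the noise passed from teacher to student; other selection rules are equally plausible.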
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Bingkun Yang, Hongwei Wang, and Xiaofeng Yue "Deep learning-based cross-modal steel plate image detection and recognition", Proc. SPIE 13085, MIPPR 2023: Automatic Target Recognition and Navigation, 130850G (7 March 2024); https://doi.org/10.1117/12.3000181
KEYWORDS
Data modeling
Object detection
Visual process modeling
Education and training
Machine learning
Visualization
Laser cutting