Deep learning-based cross-modal steel plate image detection and recognition

Bingkun Yang; Hongwei Wang; Xiaofeng Yue

doi:10.1117/12.3000181

7 March 2024

Proceedings Volume 13085, MIPPR 2023: Automatic Target Recognition and Navigation; 130850G (2024) https://doi.org/10.1117/12.3000181
Event: Twelfth International Symposium on Multispectral Image Processing and Pattern Recognition (MIPPR2023), 2023, Wuhan, China

Abstract

Growth in computing power has driven rapid progress in artificial intelligence, and intelligent information technology has attracted increasing attention in the manufacturing industry. However, existing research gives little consideration to problems specific to manufacturing, mainly because datasets for rare scenarios are difficult to acquire. In steel plate sorting, corner point detection plays a crucial role in production efficiency, particularly when laser cutting leaves steel plates adhered to one another. Given the scarcity of seam-cut steel plate image data and the strong generalization ability of the cross-modal model GLIP, this study applies a cross-modal large model developed in a different field. First, we established a steel plate dataset with corner point information and fine-tuned the GLIP model using weakly supervised learning. Then, the inference results of the large teacher model are used as inputs to the lightweight student model YOLOv8, forming a framework for lightweight industrial deployment. In our experiments, we first compared the effects of different amounts of data on the GLIP model and demonstrated that the 20-shot model performs comparably to the full-shot model. In addition, YOLOv8 can recognize corner points that were neither manually annotated nor labeled by the GLIP model, demonstrating excellent generalization performance. Comparative verification showed the advantages of GLIP in terms of time consumption, volume of manually labeled data, and deployment scale. This study makes full use of a sparsely labeled dataset and cross-modal large models, integrating them with a lightweight object detection model to reduce labeling costs and improve production efficiency. Finally, we propose directions for future work.
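The teacher-to-student hand-off described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the GLIP inference results are passed to YOLOv8, so everything below is an assumption: the `Detection` structure, the single "corner point" class, the `min_score` confidence filter, and the helper name `to_yolo_lines` are all hypothetical. The sketch only shows one plausible step, converting teacher boxes in pixel coordinates into the normalized `class x_center y_center width height` text lines used by YOLO-style label files.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """One teacher prediction in pixel coordinates (hypothetical structure)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    score: float
    class_id: int = 0  # assumed single class: "corner point"


def to_yolo_lines(dets: List[Detection], img_w: int, img_h: int,
                  min_score: float = 0.5) -> List[str]:
    """Convert confident teacher boxes into YOLO-style label lines.

    Each line is 'class x_center y_center width height', with all four
    box values normalized to [0, 1] by the image size.
    """
    lines = []
    for d in dets:
        if d.score < min_score:
            continue  # assumed filter: drop low-confidence pseudo-labels
        # Clip the box to the image so normalized values stay in [0, 1].
        x0 = min(max(d.x_min, 0.0), float(img_w))
        y0 = min(max(d.y_min, 0.0), float(img_h))
        x1 = min(max(d.x_max, 0.0), float(img_w))
        y1 = min(max(d.y_max, 0.0), float(img_h))
        w, h = x1 - x0, y1 - y0
        if w <= 0 or h <= 0:
            continue  # skip degenerate boxes
        cx = (x0 + x1) / 2 / img_w
        cy = (y0 + y1) / 2 / img_h
        lines.append(
            f"{d.class_id} {cx:.6f} {cy:.6f} {w / img_w:.6f} {h / img_h:.6f}"
        )
    return lines


if __name__ == "__main__":
    teacher_output = [
        Detection(100, 50, 140, 90, score=0.9),  # kept
        Detection(0, 0, 10, 10, score=0.2),      # filtered out
    ]
    for line in to_yolo_lines(teacher_output, img_w=200, img_h=100):
        print(line)
```

In a setup like this, one such text file per image would sit next to the image set used to train the student detector. The threshold value and the class mapping would need to be chosen for the actual data.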

© (2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.

Citation

Bingkun Yang, Hongwei Wang, and Xiaofeng Yue "Deep learning-based cross-modal steel plate image detection and recognition", Proc. SPIE 13085, MIPPR 2023: Automatic Target Recognition and Navigation, 130850G (7 March 2024); https://doi.org/10.1117/12.3000181

PROCEEDINGS
8 PAGES

KEYWORDS

Data modeling

Object detection

Visual process modeling

Education and training

Machine learning

Visualization

Laser cutting
