Comparison and analysis of computer vision models based on images of catamount and canid
28 April 2023
Feng Jiang, Yueyufei Ma, Langyu Wang
Proceedings Volume 12610, Third International Conference on Artificial Intelligence and Computer Engineering (ICAICE 2022); 126101R (2023) https://doi.org/10.1117/12.2671468
Event: Third International Conference on Artificial Intelligence and Computer Engineering (ICAICE 2022), 2022, Wuhan, China
Abstract
Applications built on image recognition, such as target recognition, driverless vehicles, and medical image diagnosis, now appear throughout daily life, scientific research, and work. They rely mainly on a variety of large, high-performing models, from the original Convolutional Neural Network (CNN) to the many variants of classical models proposed since. In this paper, we use the identification of catamount and canid images as an example to compare the efficiency and accuracy of CNN, Vision Transformer (ViT), and Swin Transformer side by side. We train each model for 25 epochs and record its accuracy and time consumption separately. Comparing epoch counts against wall-clock time, we find that CNN takes the least total time, followed by Swin Transformer. ViT takes the least time to reach convergence, while Swin Transformer takes the most. ViT has the highest training accuracy, followed by Swin Transformer, and CNN has the lowest; validation accuracy follows the same ranking. In short, ViT achieves the highest accuracy but has the longest total training time, whereas CNN is the fastest to train and the least accurate. Swin Transformer, which can be seen as a combination of CNN and ViT, is the most complex of the three but performs well. ViT is a promising model that deserves further research and exploration in the computer vision field.
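The measurement protocol described in the abstract (a fixed budget of 25 epochs per model, with accuracy and time recorded separately) could be sketched roughly as follows. This is only an illustrative sketch and not the authors' code: the names `benchmark`, `run_epoch`, and `convergence_epoch` are hypothetical, and the convergence criterion (first epoch within a tolerance of the best validation accuracy) is an assumption, since the abstract does not define convergence.

```python
import time
from typing import Callable, Dict, List, Tuple

EPOCHS = 25  # epoch budget stated in the abstract


def benchmark(run_epoch: Callable[[int], Tuple[float, float]],
              epochs: int = EPOCHS) -> List[Dict[str, float]]:
    """Run `epochs` epochs and log accuracy and cumulative wall-clock time.

    `run_epoch` is a hypothetical callback that trains the model for one
    epoch and returns (training_accuracy, validation_accuracy).
    """
    history: List[Dict[str, float]] = []
    start = time.perf_counter()
    for epoch in range(1, epochs + 1):
        train_acc, val_acc = run_epoch(epoch)
        history.append({
            "epoch": epoch,
            "train_acc": train_acc,
            "val_acc": val_acc,
            # seconds elapsed since the start of training
            "elapsed_s": time.perf_counter() - start,
        })
    return history


def convergence_epoch(history: List[Dict[str, float]],
                      tol: float = 0.01) -> int:
    """Return the first epoch whose validation accuracy is within `tol`
    of the best value observed (an assumed criterion, not the paper's)."""
    best = max(row["val_acc"] for row in history)
    return next(row["epoch"] for row in history
                if row["val_acc"] >= best - tol)
```

Running `benchmark` once per model with the same data and hardware would give one history per model; total time is the last `elapsed_s` entry, and time to convergence is the `elapsed_s` value at the epoch returned by `convergence_epoch`. Keeping the two quantities separate matters here because the abstract reports different rankings for total time and for time to convergence.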
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Feng Jiang, Yueyufei Ma, and Langyu Wang "Comparison and analysis of computer vision models based on images of catamount and canid", Proc. SPIE 12610, Third International Conference on Artificial Intelligence and Computer Engineering (ICAICE 2022), 126101R (28 April 2023); https://doi.org/10.1117/12.2671468
KEYWORDS
Transformers
Education and training
Visual process modeling
Computer vision technology
Data modeling
Image processing
Image segmentation