15 March 2024 Application practice of neural network algorithms in speech recognition technology
Jun Guo
Proceedings Volume 13075, Second International Conference on Physics, Photonics, and Optical Engineering (ICPPOE 2023); 130750J (2024) https://doi.org/10.1117/12.3026298
Event: Second International Conference on Physics, Photonics, and Optical Engineering (ICPPOE 2023), 2023, Kunming, China
Abstract
Speech Recognition (SR), one of the core technologies of human-computer interaction, aims to enable computers to convert speech signals into the corresponding text or commands. With the exponential growth of information on the internet, the features of massive speech data show significant non-specific differences and noise interference, and common feature extraction and transformation methods are no longer sufficient for model training and recognition. With the rapid growth of Machine Learning (ML), many researchers use Neural Networks (NN) to solve problems in the SR field. This article designs a Deep Learning (DL) algorithm for SR based on Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). First, the speech signal undergoes sampling and filtering, pre-emphasis, framing, and endpoint detection. Second, Mel-frequency cepstral coefficients (MFCC) are extracted from the preprocessed data. Finally, an NN model is constructed and trained, and the trained model that meets the requirements is used to recognize the speech features. The experimental results show that the designed algorithm achieves a lower SR error rate and stronger generalization ability, which is significant for the study of SR.
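The preprocessing stage named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract gives no parameters, so the pre-emphasis (pre-weighting) coefficient of 0.97, the Hamming window, the 25 ms / 10 ms frame and hop sizes, and the simple energy-threshold endpoint detector are all assumptions chosen for illustration, and the function names are hypothetical. MFCC extraction and the CNN/RNN model are not shown.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is passed through.

    alpha=0.97 is a commonly used value, assumed here (not from the paper).
    """
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def frame_signal(signal, frame_len, hop_len):
    """Split the signal into overlapping frames and apply a Hamming window.

    Trailing samples that do not fill a whole frame are dropped.
    """
    window = [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
        for n in range(frame_len)
    ]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop_len):
        frames.append([signal[start + n] * window[n] for n in range(frame_len)])
    return frames


def detect_endpoints(frames, threshold_ratio=0.1):
    """Return (first, last) indices of frames whose energy exceeds
    threshold_ratio times the peak frame energy, or None if all frames are silent.

    A simple energy-threshold detector, assumed for illustration only.
    """
    energies = [sum(s * s for s in frame) for frame in frames]
    if not energies or max(energies) == 0:
        return None
    threshold = threshold_ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e > threshold]
    return active[0], active[-1]


# Usage on a synthetic signal: silence, a 440 Hz tone, then silence (8 kHz sampling).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1600)]
signal = [0.0] * 800 + tone + [0.0] * 800

emphasized = pre_emphasis(signal)
frames = frame_signal(emphasized, frame_len=200, hop_len=80)  # 25 ms frames, 10 ms hop
endpoints = detect_endpoints(frames)
print(len(frames), endpoints)
```

In a pipeline like the one described, only the frames between the detected endpoints would be passed on to MFCC extraction, so that silence does not contribute features to the model.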
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Jun Guo "Application practice of neural network algorithms in speech recognition technology", Proc. SPIE 13075, Second International Conference on Physics, Photonics, and Optical Engineering (ICPPOE 2023), 130750J (15 March 2024); https://doi.org/10.1117/12.3026298
KEYWORDS
Education and training

Detection and tracking algorithms

Speech recognition

Evolutionary algorithms

Neural networks

Acoustics

Signal processing
