19 October 2023 Speech emotion recognition based on ResNet-BiGRU network
Penglei Fu, Fuxin Xu, Haoqi Yuan
Proceedings Volume 12709, Fourth International Conference on Artificial Intelligence and Electromechanical Automation (AIEA 2023); 1270914 (2023) https://doi.org/10.1117/12.2684994
Event: Fourth International Conference on Artificial Intelligence and Electromechanical Automation (AIEA 2023), 2023, Nanjing, China
Abstract
To make intelligent speech products more human-like, speech emotion recognition has become an increasingly active research topic. Current speech emotion recognition systems mainly consist of two steps: speech feature extraction and speech feature classification. To improve recognition accuracy, the Mel-frequency cepstral coefficients (MFCC) of the speech signal, which are currently an effective feature representation in the speech field, are chosen as the input of the deep learning network, and a ResNet-BiGRU network based on the attention mechanism is used to extract information from the MFCCs. The experimental results show that introducing the attention mechanism allows the model to focus on useful information and reduces the interference of redundant information. The accuracy on the Chinese sentiment corpus CASIA reached 84.83%.
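The abstract does not specify the form of the attention mechanism. As a rough illustration only, the sketch below shows one common choice, additive attention pooling over frame-level features, written in NumPy. The function name `attention_pool`, the parameters `W`, `b`, `v`, and all shapes are hypothetical and are not taken from the paper; in the setting described above, `frames` would stand in for the sequence of features produced by the recurrent layers.

```python
import numpy as np


def attention_pool(frames, W, b, v):
    """Additive attention pooling over the time axis (illustrative sketch).

    frames: array of shape (T, D), one feature vector per time frame
            (hypothetically, the outputs of a BiGRU).
    W, b, v: hypothetical learned parameters with shapes (D, A), (A,), (A,).
    Returns (context, weights): context has shape (D,), weights has shape (T,).
    """
    # One scalar relevance score per frame.
    scores = np.tanh(frames @ W + b) @ v
    # Softmax over time; subtracting the max keeps exp() numerically stable.
    scores = scores - scores.max()
    weights = np.exp(scores) / np.exp(scores).sum()
    # Weighted sum of frames: frames with larger weights contribute more.
    context = weights @ frames
    return context, weights


# Random stand-in data; these sizes are arbitrary assumptions.
rng = np.random.default_rng(0)
T, D, A = 50, 16, 8
frames = rng.standard_normal((T, D))
W = rng.standard_normal((D, A)) * 0.1
b = np.zeros(A)
v = rng.standard_normal(A)

context, weights = attention_pool(frames, W, b, v)
```

The weights are non-negative and sum to one, so frames judged uninformative receive small weights and contribute little to the pooled vector, which is the sense in which attention can reduce the influence of redundant information.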
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Penglei Fu, Fuxin Xu, and Haoqi Yuan "Speech emotion recognition based on ResNet-BiGRU network", Proc. SPIE 12709, Fourth International Conference on Artificial Intelligence and Electromechanical Automation (AIEA 2023), 1270914 (19 October 2023); https://doi.org/10.1117/12.2684994
KEYWORDS
Emotion

Speech recognition

Feature extraction

RGB color model

Deep learning

Acoustics

Target recognition
