Paper
20 January 2021 Attentive deep CNN for speaker verification
Yong-bin Yu, Min-hui Qi, Yi-fan Tang, Quan-xin Deng, Chen-hui Peng, Feng Mai, Tashi Nyima
Proceedings Volume 11719, Twelfth International Conference on Signal Processing Systems; 117190U (2021) https://doi.org/10.1117/12.2581351
Event: Twelfth International Conference on Signal Processing Systems, 2020, Shanghai, China
Abstract
In this paper, an end-to-end speaker verification system based on an attentive deep convolutional neural network (CNN) is presented. It takes log filter bank coefficients as input and verifies a speaker by measuring the cosine similarity between a test utterance and the enrollment utterances. The approach uses the channel attention module of the convolutional block attention module (CBAM) to increase representational power by assigning different weights to feature maps. In addition, softmax is used for pre-training to initialize the network weights, and the tuple-based end-to-end (TE2E) loss function is used for fine-tuning in the evaluation stage. This strategy not only yields notable improvements over the baseline model but also allows direct optimization of the evaluation metric. Experimental results on the VoxCeleb dataset indicate that the proposed model achieves an equal error rate (EER) of 3.83%, which is slightly worse than x-vectors but better than i-vectors.
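The cosine-similarity scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the averaging of length-normalized enrollment embeddings into a single speaker model, and the decision threshold of 0.5 are all assumptions made for illustration, and the embeddings are taken to be fixed-length vectors already produced by some network.

```python
import numpy as np


def cosine_score(test_emb, enroll_embs):
    """Cosine similarity between a test embedding and a speaker model.

    Assumption: the speaker model is the mean of the L2-normalized
    enrollment embeddings (a common choice, not confirmed by the abstract).
    """
    test = np.asarray(test_emb, dtype=float)
    enroll = np.asarray(enroll_embs, dtype=float)
    # Normalize each enrollment embedding so that every utterance
    # contributes equally to the averaged speaker model.
    enroll = enroll / np.linalg.norm(enroll, axis=1, keepdims=True)
    model = enroll.mean(axis=0)
    return float(np.dot(test, model) / (np.linalg.norm(test) * np.linalg.norm(model)))


def verify(test_emb, enroll_embs, threshold=0.5):
    """Accept the claimed identity if the score reaches the threshold.

    The threshold value is hypothetical; in practice it would be tuned
    on held-out trials, for example at the equal-error-rate operating point.
    """
    return cosine_score(test_emb, enroll_embs) >= threshold
```

Raising the threshold lowers false acceptances at the cost of more false rejections; the EER reported in the abstract corresponds to the threshold at which the two error rates are equal.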
© (2021) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Yong-bin Yu, Min-hui Qi, Yi-fan Tang, Quan-xin Deng, Chen-hui Peng, Feng Mai, and Tashi Nyima "Attentive deep CNN for speaker verification", Proc. SPIE 11719, Twelfth International Conference on Signal Processing Systems, 117190U (20 January 2021); https://doi.org/10.1117/12.2581351
KEYWORDS
Speaker recognition
Data modeling
Statistical modeling
Performance modeling
Electronic filtering
Information fusion
Optimization (mathematics)