Paper
Spectrogram-based speech enhancement by spatial attention generative adversarial networks
12 October 2022
Haixin Luo, Shengyu Lu, Qian Wei, Yu Fu, Jindong Tian
Proceedings Volume 12342, Fourteenth International Conference on Digital Image Processing (ICDIP 2022); 123422K (2022) https://doi.org/10.1117/12.2644385
Event: Fourteenth International Conference on Digital Image Processing (ICDIP 2022), 2022, Wuhan, China
Abstract
The spectrogram clearly shows the frequency composition of a speech signal. In this paper, we propose a speech enhancement method based on deep-learning image processing, which optimizes the spectrogram of a laser-detected speech signal to achieve speech enhancement. The laser beam emitted by a laser Doppler vibrometer (LDV) is focused on a glass window to detect the vibration caused by sound waves. After conversion, the audio information that caused the vibration is obtained. Because of speckle noise and air disturbance, the detected speech signal has a low signal-to-noise ratio (SNR) and also contains non-stationary noise. Traditional methods struggle to extract weak signals under severe noise interference, so we use deep learning to denoise the spectrogram and enhance the speech information. By processing the spectrogram of noisy speech with a generative adversarial network (GAN) combined with a spatial attention mechanism, and by introducing the short-time objective intelligibility (STOI) measure into the loss function, the laser-detected speech signal was successfully enhanced.
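As an illustration of the representation the method operates on, the following is a minimal sketch of a magnitude spectrogram computed with a short-time Fourier transform. It is not the authors' implementation: the abstract does not state the frame length, hop size, or window, so `frame_len`, `hop`, the Hann window, and the 1 kHz test tone are all assumptions chosen for illustration, and the helper names `hann` and `spectrogram` are hypothetical.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n (assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def spectrogram(signal, frame_len=64, hop=32):
    """Return a magnitude spectrogram as a list of frames.

    Each frame holds frame_len // 2 + 1 magnitudes, one per
    non-negative frequency bin. frame_len and hop are illustrative
    defaults, not values taken from the paper.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Window the current frame to reduce spectral leakage.
        seg = [signal[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(n_bins):
            # Direct DFT of the windowed frame at bin k.
            acc = sum(
                seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz.
fs = 8000
tone = [math.sin(2.0 * math.pi * 1000.0 * t / fs) for t in range(512)]
spec = spectrogram(tone)

# The strongest bin of the first frame, converted back to hertz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * fs / 64
```

The direct DFT keeps the sketch dependency-free; a practical pipeline would more likely use an FFT routine from a numerical library. The resulting two-dimensional array of magnitudes (time frames by frequency bins) is the kind of image-like input that a spectrogram-domain denoising network would process.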
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Haixin Luo, Shengyu Lu, Qian Wei, Yu Fu, and Jindong Tian "Spectrogram-based speech enhancement by spatial attention generative adversarial networks", Proc. SPIE 12342, Fourteenth International Conference on Digital Image Processing (ICDIP 2022), 123422K (12 October 2022); https://doi.org/10.1117/12.2644385
KEYWORDS
Signal processing

Signal detection

Denoising

Image processing
