A Sparse Representation Speech Denoising Method Based on Adapted Stopping Residue Error

Authors: Qianhua He, Weili Zhou, Aiwu Chen

Abstract:

This paper presents a sparse representation speech denoising method based on an adapted stopping residue error. First, the cross-correlation between the clean speech spectrum and the noise spectrum is analyzed, and a method for estimating it is proposed. In the denoising method, an over-complete dictionary of the clean speech power spectrum is learned with the K-singular value decomposition (K-SVD) algorithm. In the sparse representation stage, the stopping residue error is set adaptively according to the estimated cross-correlation and the adjusted noise spectrum, and orthogonal matching pursuit (OMP) is applied to reconstruct the clean speech spectrum from the noisy speech. Finally, the clean speech is re-synthesized via the inverse Fourier transform from the reconstructed speech spectrum and the noisy speech phase. Experimental results show that the proposed method outperforms conventional methods on both subjective and objective measures.
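
The sparse representation and re-synthesis stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the rule that derives the stopping residue error from the estimated cross-correlation and the adjusted noise spectrum, so the threshold is passed in as a plain parameter `eps`. The function names `omp_residual_stop` and `resynthesize_frame`, the `max_atoms` safeguard, and the use of a single frame without windowing or overlap-add are all assumptions made for illustration. The dictionary `D` stands in for one learned with K-SVD.

```python
import numpy as np


def omp_residual_stop(D, y, eps, max_atoms=None):
    """Greedy OMP that adds atoms until ||y - D x||_2 <= eps.

    D   : (n_features, n_atoms) dictionary, columns assumed unit-norm.
    y   : (n_features,) signal to represent (e.g. a noisy power spectrum).
    eps : stopping residue error (hypothetical parameter; the paper sets
          it adaptively, which is not reproduced here).
    """
    n_atoms = D.shape[1]
    if max_atoms is None:
        max_atoms = min(D.shape)  # safeguard against endless selection
    y = np.asarray(y, dtype=float)
    x = np.zeros(n_atoms)
    support = []
    coef = np.zeros(0)
    residual = y.copy()
    while np.linalg.norm(residual) > eps and len(support) < max_atoms:
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = 0.0  # never reselect an atom
        k = int(np.argmax(corr))
        if corr[k] <= 1e-12:
            break  # no remaining atom explains the residual
        support.append(k)
        # Orthogonal step: least-squares refit on all selected atoms.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    if support:
        x[support] = coef
    return x


def resynthesize_frame(power_spectrum, noisy_frame):
    """Combine a reconstructed power spectrum with the noisy frame's phase."""
    noisy_phase = np.angle(np.fft.rfft(noisy_frame))
    magnitude = np.sqrt(np.maximum(power_spectrum, 0.0))
    spectrum = magnitude * np.exp(1j * noisy_phase)
    return np.fft.irfft(spectrum, n=len(noisy_frame))


if __name__ == "__main__":
    D = np.eye(4)                       # toy dictionary, not a K-SVD result
    y = np.array([0.0, 3.0, 0.0, 0.1])  # one strong component, one weak
    print(omp_residual_stop(D, y, eps=0.5))
    print(omp_residual_stop(D, y, eps=0.01))
```

In the toy run, the looser threshold keeps only the dominant atom and leaves the weak component in the residual, while the tighter threshold also selects the second atom. This is the behavior a noise-dependent stopping threshold relies on: components below the expected noise level are left unrepresented.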

Keywords: Speech denoising, sparse representation, K-singular value decomposition, orthogonal matching pursuit.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1132661

