Speech Enhancement by Marginal Statistical Characterization in the Log Gabor Wavelet Domain

Suman Senapati; Goutam Saha

Abstract:

This work combines the Log Gabor Wavelet (LGW) with a Maximum a Posteriori (MAP) estimator to form a speech enhancement tool for reducing acoustic background noise. The probability density function (pdf) of the speech spectral amplitude is approximated by a Generalized Laplacian Distribution (GLD). Compared to earlier estimators, the proposed method models the underlying statistics more accurately through an appropriate choice of the GLD model parameters. Experimental results show that, in two different noisy environments, the proposed estimator yields a larger improvement in Segmental Signal-to-Noise Ratio (S-SNR) and a lower Log-Spectral Distortion (LSD) than the other estimators.
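
For readers unfamiliar with the S-SNR figure of merit named above, the following is a minimal sketch of one common way to compute it. The abstract does not give the authors' exact definition, so the function name `segmental_snr`, the frame length of 256 samples, and the per-frame clamping range of -10 dB to 35 dB are all assumptions chosen for illustration.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Average of per-frame SNRs in dB, each clamped to [lo_db, hi_db].

    frame_len, lo_db and hi_db are illustrative defaults (assumptions),
    not values taken from the paper.
    """
    if len(clean) != len(enhanced):
        raise ValueError("signals must have equal length")
    n_frames = len(clean) // frame_len
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")

    eps = 1e-12  # avoids division by zero and log of zero
    total = 0.0
    for m in range(n_frames):
        start = m * frame_len
        ref = clean[start:start + frame_len]
        est = enhanced[start:start + frame_len]
        signal_energy = sum(s * s for s in ref)
        error_energy = sum((s - e) ** 2 for s, e in zip(ref, est))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))
        # Clamp so silent or near-perfect frames do not dominate the mean.
        total += min(max(snr_db, lo_db), hi_db)
    return total / n_frames
```

Under this definition, an improvement in S-SNR would be the value computed for the enhanced signal minus the value computed for the noisy input, both measured against the same clean reference.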

Keywords: Speech Enhancement, Generalized Laplacian Distribution, Log Gabor Wavelet, Bayesian MAP Marginal Estimator.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1329136

