Environmentally Adaptive Acoustic Echo Suppression for Barge-in Speech Recognition
Authors: Jong Han Joo, Jeong Hun Lee, Young Sun Kim, Jae Young Kang, Seung Ho Choi
Abstract:
In this study, we propose a novel technique for acoustic echo suppression (AES) for speech recognition under barge-in conditions. Conventional AES methods based on spectral subtraction apply fixed weights to the echo path transfer function (EPTF) estimated from the current signal segment and to the EPTF estimated up to the previous time interval. However, changes in the echo path must be taken into account to eliminate the undesired echo, and fixed weights do not respond to them. We describe a new approach that adaptively updates the weight parameters in response to abrupt changes in the acoustic environment caused by background noise or double-talk. In addition, we devise a voice activity detector and an initial time-delay estimator for barge-in speech recognition in communication networks. The initial time delay is estimated using a log-spectral distance measure together with cross-correlation coefficients. Experimental results show that the developed techniques can be applied successfully in barge-in speech recognition systems.
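The weighting scheme described in the abstract can be illustrated with a minimal sketch. It assumes the EPTF is represented as one magnitude value per frequency bin and that the running estimate is a convex combination of the previous estimate and the current-segment estimate. The function names, the `reliability` score, and the bounds `w_min` and `w_max` are hypothetical and are not taken from the paper; the paper's actual update rule is not given in the abstract.

```python
def smooth_eptf(prev_eptf, current_eptf, weight):
    """Blend the running EPTF estimate with the current-segment estimate.

    `weight` is applied to the previous estimate and (1 - weight) to the
    current one, independently for each frequency bin.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * p + (1.0 - weight) * c
            for p, c in zip(prev_eptf, current_eptf)]


def adaptive_weight(reliability, w_min=0.5, w_max=0.99):
    """Hypothetical adaptation rule (an assumption, not the authors' method).

    `reliability` in [0, 1] scores how trustworthy the current segment is,
    for example as reported by a double-talk or noise detector. A low score
    pushes the weight toward w_max, so the history dominates; a high score
    pushes it toward w_min, so the current segment has more influence.
    """
    r = min(1.0, max(0.0, reliability))
    return w_max - (w_max - w_min) * r


# Fixed weighting, as in the conventional methods: equal trust in both terms.
fixed = smooth_eptf([1.0, 2.0], [3.0, 4.0], 0.5)  # [2.0, 3.0]

# Adaptive weighting: the current segment is flagged as unreliable,
# so the update stays close to the previous estimate.
w = adaptive_weight(reliability=0.0)
adapted = smooth_eptf([1.0, 2.0], [3.0, 4.0], w)
```

With a fixed weight, every segment moves the estimate by the same fraction. In the adaptive sketch, a segment flagged as unreliable moves it far less, which is one plausible way to read "adaptively updates the weight parameters".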
Keywords: Acoustic echo suppression, barge-in, speech recognition, echo path transfer function, initial delay estimator, voice activity detector.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1098920
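The initial time-delay estimation mentioned in the abstract can be sketched as follows, using only the cross-correlation part. This is an illustration under stated assumptions, not the authors' estimator. It searches integer sample lags for the one that maximizes the normalized cross-correlation between the far-end reference and the microphone signal, and it omits the log-spectral distance measure that the paper combines with it. The function names and the synthetic test signal are hypothetical.

```python
import math
import random


def normalized_xcorr(reference, microphone, lag):
    """Normalized cross-correlation of reference[n] with microphone[n + lag]."""
    n = min(len(reference), len(microphone) - lag)
    if n <= 0:
        return 0.0
    ref = reference[:n]
    mic = microphone[lag:lag + n]
    num = sum(r * m for r, m in zip(ref, mic))
    den = math.sqrt(sum(r * r for r in ref) * sum(m * m for m in mic))
    return num / den if den > 0.0 else 0.0


def estimate_initial_delay(reference, microphone, max_lag):
    """Return (lag, score) for the lag in 0..max_lag with the highest score."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        score = normalized_xcorr(reference, microphone, lag)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


# Synthetic example: the "echo" is the reference delayed by 7 samples
# and attenuated. A real echo path would also filter the signal.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(200)]
true_delay = 7
microphone = [0.0] * true_delay + [0.6 * x for x in reference]

lag, score = estimate_initial_delay(reference, microphone, max_lag=20)
```

The exhaustive lag search is quadratic in the search range. It is written this way for readability, not speed.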