On Preprocessing of Speech Signals

Authors: Ayaz Keerio, Bhargav Kumar Mitra, Philip Birch, Rupert Young, Chris Chatwin

Abstract:

Preprocessing of speech signals is considered a crucial step in the development of a robust and efficient speech or speaker recognition system. In this paper, we present some popular statistical outlier-detection-based strategies to separate the silence/unvoiced part of the speech signal from the voiced portion. The proposed methods are based on the 3σ edit rule and the Hampel Identifier, and are compared with conventional techniques: (i) short-time energy (STE) based methods, and (ii) distribution-based methods. The results obtained by applying the proposed strategies to some test voice signals are encouraging.
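As a rough illustration of the two outlier rules named in the abstract, the following Python sketch flags samples that deviate strongly from a reference segment. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the threshold `t = 3`, the choice to work on raw sample amplitudes, and the use of the leading samples as a silence-only reference are all hypothetical choices made here for illustration.

```python
from statistics import mean, median, pstdev

# Consistency factor that makes the MAD comparable to the standard
# deviation for Gaussian data (standard value, about 1 / 0.6745).
MAD_SCALE = 1.4826


def three_sigma_outliers(values, reference, t=3.0):
    """3-sigma edit rule: flag v when |v - mean| > t * std.

    Mean and standard deviation are estimated from `reference`
    (assumed here to contain silence/background noise only).
    """
    mu = mean(reference)
    sigma = pstdev(reference)
    return [abs(v - mu) > t * sigma for v in values]


def hampel_outliers(values, reference, t=3.0):
    """Hampel identifier: flag v when |v - median| > t * 1.4826 * MAD.

    Median and MAD (median absolute deviation) replace the mean and
    standard deviation, which makes the estimates less sensitive to
    stray large samples inside `reference`.
    """
    med = median(reference)
    mad = median([abs(r - med) for r in reference])
    return [abs(v - med) > t * MAD_SCALE * mad for v in values]


# Toy signal (made-up numbers): low-level noise with a short loud burst.
signal = [0.01, -0.02, 0.015, -0.01, 0.02, -0.015, 0.01, -0.02,
          0.9, -0.85, 0.8, 0.0, 0.01, -0.01]
silence = signal[:8]  # assumption: the first samples are silence

sigma_flags = three_sigma_outliers(signal, silence)
hampel_flags = hampel_outliers(signal, silence)

# Indices flagged as "not silence" by each rule.
sigma_idx = [i for i, f in enumerate(sigma_flags) if f]
hampel_idx = [i for i, f in enumerate(hampel_flags) if f]
print(sigma_idx, hampel_idx)
```

On this toy input both rules flag only the three burst samples. The two differ mainly when the reference segment itself is contaminated: the mean and standard deviation are inflated by large samples, whereas the median and MAD are much less affected.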

Keywords: STE based methods, Mahalanobis distance, 3σ edit rule, Hampel Identifier.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1332328

