Automatic Detection of Syllable Repetition in Read Speech for Objective Assessment of Stuttered Disfluencies

K. M. Ravikumar; Balakrishna Reddy; R. Rajagopal; H. C. Nagaraj

Abstract:

Automatic detection of syllable repetition is one of the important parameters in assessing stuttered speech objectively. The existing method, which uses an artificial neural network (ANN), requires high levels of agreement as a prerequisite before attempting to train and test ANNs to separate fluent and nonfluent speech. We propose an automatic detection method for syllable repetition in read speech for objective assessment of stuttered disfluencies. It uses a novel approach with four stages: segmentation, feature extraction, score matching, and decision logic. Feature extraction is implemented using the well-known Mel-frequency cepstral coefficients (MFCC). Score matching is done using Dynamic Time Warping (DTW) between the syllables. The decision logic is implemented by a perceptron based on the score given by score matching. Although many methods are available for segmentation, in this paper it is done manually. The assessments by human judges of the read speech of 10 adults who stutter are described using the corresponding method, and the result was 83%.
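The score-matching and decision stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the frame distance, the DTW step pattern, or the perceptron's training setup, so the Euclidean frame distance, the three-way step pattern, the single-input perceptron, and all function names and toy values below are assumptions made for illustration. MFCC extraction is omitted; each syllable is assumed to be already available as a list of feature vectors.

```python
import math

def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumption: Euclidean distance between frames and the basic
    (insert, delete, match) step pattern.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # frame of a repeated
                                 cost[i][j - 1],      # frame of b repeated
                                 cost[i - 1][j - 1])  # one-to-one match
    return cost[n][m]

def train_perceptron(scores, labels, epochs=100, lr=0.1):
    """Train a single-input perceptron on DTW scores.

    Label 1 means "repetition", 0 means "no repetition" (hypothetical coding).
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(scores, labels):
            pred = 1 if w * x + b > 0 else 0
            # Classic perceptron rule: update only on misclassification.
            w += lr * (y - pred) * x
            b += lr * (y - pred)
    return w, b

def is_repetition(score, w, b):
    """Decision logic: the perceptron fires when w * score + b > 0."""
    return w * score + b > 0

# Toy, made-up data: one-dimensional "features" standing in for MFCC frames.
syllable = [[0.0], [1.0], [2.0]]
stretched = [[0.0], [1.0], [1.0], [2.0]]   # same syllable spoken more slowly
other = [[5.0], [6.0], [7.0]]

# Made-up training scores: low DTW cost for repetitions, high otherwise.
w, b = train_perceptron([0.5, 1.0, 5.0, 6.0], [1, 1, 0, 0])

print(is_repetition(dtw_distance(syllable, stretched), w, b))
print(is_repetition(dtw_distance(syllable, other), w, b))
```

In this sketch a low DTW cost between adjacent syllables suggests a repetition, and the perceptron learns where to place the cut-off from labelled scores. With real MFCC frames the costs would typically need normalising by path length before thresholding, which is again an assumption rather than something the abstract states.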

Keywords: Assessment, DTW, MFCC, Objective, Perceptron, Stuttering.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1073655
