An Approach for Vocal Register Recognition Based on Spectral Analysis of Singing

Authors: Aleksandra Zysk, Pawel Badura


Recognizing and controlling vocal registers during singing is a difficult task for beginner vocalists. It requires, among other things, identifying which of the natural resonators is being used as the sound propagates through the body. To support this task, an application has been designed that combines sound recording, automatic vocal register recognition (VRR), and a graphical user interface providing real-time visualization of the signal and the recognition results. Six spectral features are determined for each time frame and passed to a support vector machine classifier, which yields a binary decision assigning the segment to the head or chest register. The training and testing data were recorded by ten professional female singers (soprano, aged 19-29) performing sounds in both the chest and head registers. The classification accuracy exceeded 93% in each of the validation schemes applied. Apart from the hard two-class decision, the support vector classifier also returns the distance between a particular feature vector and the discrimination hyperplane in the feature space. Such information reflects the level of certainty of the vocal register classification in a fuzzy way. Thus, the designed recognition and training application is able to assess and visualize the continuous trend in singing in a user-friendly graphical mode, providing an easy way to control the vocal emission.
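The step from a hard decision to a fuzzy certainty level can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear decision function, the weights `w`, the bias `b`, and the six-element feature vector `frame` are hypothetical values, the convention that positive distances mean the head register is arbitrary, and the logistic squashing in `register_certainty` is only one possible way to turn a distance into a degree of membership.

```python
import math


def signed_distance(x, w, b):
    """Signed Euclidean distance from feature vector x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def register_certainty(x, w, b, scale=1.0):
    """Return a hard label and a fuzzy membership in the 'head' class.

    The membership is 0.5 on the hyperplane and approaches 0 or 1 as the
    feature vector moves away from it. `scale` (an assumed parameter)
    controls how quickly the membership saturates.
    """
    d = signed_distance(x, w, b)
    membership = 1.0 / (1.0 + math.exp(-d / scale))
    label = "head" if d >= 0 else "chest"  # sign convention is an assumption
    return label, membership


# Hypothetical hyperplane parameters for six spectral features.
w = [0.8, -0.3, 0.5, 0.1, -0.6, 0.2]
b = -0.1

# Hypothetical feature vector for one time frame.
frame = [0.9, 0.1, 0.7, 0.3, 0.2, 0.5]

label, membership = register_certainty(frame, w, b)
print(label, round(membership, 3))
```

A membership close to 0.5 marks a frame near the boundary between registers, so plotting the value frame by frame would give the kind of continuous trend the abstract describes.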

Keywords: Classification, singing, spectral analysis, vocal emission, vocal register.



