Musical Instrument Classification Using Embedded Hidden Markov Models

Authors: Ehsan Amid, Sina Rezaei Aghdam

Abstract:

In this paper, a novel method for recognizing musical instruments in polyphonic music is presented, based on an embedded hidden Markov model (EHMM). The EHMM is a doubly embedded HMM structure in which each state of the external HMM is itself an independent HMM. Classification is carried out with two different internal HMM structures, in which Gaussian mixture models (GMMs) serve as likelihood estimators for the internal HMMs. The results are compared with those achieved by an artificial neural network with two hidden layers. Satisfactory classification accuracies were achieved for both solo instrument performances and instrument combinations, which demonstrates that the new approach outperforms similar classification methods by exploiting the dynamics of the signal.

Keywords: hidden Markov model (HMM), embedded hidden Markov models (EHMM), MFCC, musical instrument.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1056368
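
The doubly embedded structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the audio has already been converted to feature frames (for example MFCC vectors) and cut into fixed segments, that each external state scores a whole segment with its internal HMM via the forward algorithm, and that the GMMs have diagonal covariances. All function names, model sizes, and the randomly initialized (untrained) parameters are hypothetical.

```python
import numpy as np

def logsumexp(a, axis=None):
    # Numerically stable log(sum(exp(a))); assumes no -inf entries.
    m = np.max(a, axis=axis, keepdims=True)
    out = m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)

def gmm_loglik(x, weights, means, variances):
    # Per-frame log-likelihood of x (T, D) under a diagonal-covariance GMM.
    diff = x[:, None, :] - means[None, :, :]                      # (T, K, D)
    log_comp = -0.5 * np.sum(diff ** 2 / variances
                             + np.log(2 * np.pi * variances), axis=2)
    return logsumexp(np.log(weights) + log_comp, axis=1)          # (T,)

def forward_loglik(log_b, log_pi, log_A):
    # Forward algorithm in the log domain.
    # log_b: (T, N) emission log-likelihoods; returns log p(observations).
    alpha = log_pi + log_b[0]
    for t in range(1, log_b.shape[0]):
        alpha = logsumexp(alpha[:, None] + log_A, axis=0) + log_b[t]
    return logsumexp(alpha)

def inner_loglik(segment, hmm):
    # Internal HMM: one GMM per state acts as the likelihood estimator.
    log_b = np.stack([gmm_loglik(segment, *g) for g in hmm["gmms"]], axis=1)
    return forward_loglik(log_b, hmm["log_pi"], hmm["log_A"])

def ehmm_loglik(segments, ehmm):
    # External HMM: the "emission" of external state k for a segment is the
    # likelihood of that segment under internal HMM k (a simplification).
    log_b = np.array([[inner_loglik(seg, h) for h in ehmm["inner"]]
                      for seg in segments])
    return float(forward_loglik(log_b, ehmm["log_pi"], ehmm["log_A"]))

def make_toy_ehmm(rng, n_outer=2, n_inner=3, n_mix=2, dim=4):
    # Randomly initialized, untrained model, for illustration only.
    def stochastic(shape):
        m = rng.random(shape) + 0.1
        return m / m.sum(axis=-1, keepdims=True)

    def inner():
        gmms = [(stochastic(n_mix), rng.normal(size=(n_mix, dim)),
                 np.ones((n_mix, dim))) for _ in range(n_inner)]
        return {"log_pi": np.log(stochastic(n_inner)),
                "log_A": np.log(stochastic((n_inner, n_inner))),
                "gmms": gmms}

    return {"log_pi": np.log(stochastic(n_outer)),
            "log_A": np.log(stochastic((n_outer, n_outer))),
            "inner": [inner() for _ in range(n_outer)]}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # One toy EHMM per (hypothetical) instrument class.
    models = {name: make_toy_ehmm(rng) for name in ("flute", "violin")}
    # Three segments of ten 4-dimensional feature frames each.
    segments = [rng.normal(size=(10, 4)) for _ in range(3)]
    scores = {name: ehmm_loglik(segments, m) for name, m in models.items()}
    print(max(scores, key=scores.get))
```

In this sketch a recording would be assigned to the instrument whose EHMM gives the highest log-likelihood, which is the usual maximum-likelihood decision rule and is assumed here rather than taken from the paper. In practice the parameters would be trained (for example with Baum-Welch) rather than drawn at random.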

