Effective Digital Music Retrieval System through Content-based Features

Bokyung Sung; Kwanghyo Koo; Jungsoo Kim; Myung-Bum Jung; Jinman Kwon; Ilju Ko

Abstract:

In this paper, we propose an effective system for digital music retrieval. The proposed system is divided into a client and a server. The client consists of a pre-processing stage and a content-based feature extraction stage. In the pre-processing stage, we minimize the time code gap that occurs among copies of the same music content. As the content-based feature, we use first-order differentiated mel-frequency cepstral coefficients (MFCC), which approximate the envelope of the music feature sequences. The server consists of a music server and a music matching stage. Features extracted from 1,000 digital music files are stored in the music server. In the music matching stage, the retrieval result is found through a similarity measure based on dynamic time warping (DTW). In the experiment, we used 450 queries, made from 50 digital music files by mixing different compression standards and sound qualities. Retrieval accuracy was 97%, and retrieval time averaged 15 ms per query. Our experiment showed that the proposed system is effective at retrieving digital music and robust across the various user environments of the web.
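The matching idea in the abstract (first-order differentiated feature sequences compared by DTW) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the MFCC frames have already been extracted by some other tool, takes "first-order differentiated" to mean a plain frame-to-frame difference, and uses a Euclidean local cost with the classic three-step DTW recursion, none of which the abstract specifies. The names `first_order_difference`, `dtw_distance`, and `retrieve` are hypothetical.

```python
import math


def first_order_difference(frames):
    """Frame-to-frame difference of a feature sequence.

    `frames` is a list of equal-length feature vectors (for example MFCC
    frames). Assumption: a plain difference stands in for the paper's
    "first-order differentiated MFCC".
    """
    return [[b - a for a, b in zip(prev, cur)]
            for prev, cur in zip(frames, frames[1:])]


def euclidean(u, v):
    """Euclidean distance between two feature vectors (assumed local cost)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dtw_distance(x, y):
    """Accumulated DTW cost between two vector sequences, O(len(x) * len(y))."""
    inf = float("inf")
    m = len(y)
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # cost of aligning two empty prefixes
    for i in range(1, len(x) + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = euclidean(x[i - 1], y[j - 1])
            # Extend the cheapest of: insertion, deletion, match.
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]


def retrieve(query_frames, database):
    """Return the key of the stored entry closest to the query under DTW.

    `database` maps a track name to its raw feature frames; differencing is
    applied to both sides here for simplicity.
    """
    q = first_order_difference(query_frames)
    return min(
        database,
        key=lambda name: dtw_distance(q, first_order_difference(database[name])),
    )


# Toy one-dimensional "features"; real MFCC frames would have more coefficients.
toy_database = {
    "song_a": [[0.0], [1.0], [2.0], [1.0], [0.0]],
    "song_b": [[0.0], [0.0], [3.0], [3.0], [0.0]],
}
# song_a shifted by a constant offset and with one frame held longer.
toy_query = [[0.5], [1.5], [2.5], [2.5], [1.5], [0.5]]
best_match = retrieve(toy_query, toy_database)
```

In this toy case the constant offset disappears after differencing and DTW absorbs the held frame, so the query lands on `song_a`. A real server would store the differenced features once instead of recomputing them per query.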

Keywords: Music Retrieval, Content-based, Music Feature, Digital Music.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1080967

