A Robust Audio Fingerprinting Algorithm in MP3 Compressed Domain

Ruili Zhou; Yuesheng Zhu

Abstract:

This paper proposes a new audio fingerprinting algorithm for the MP3 compressed domain that is highly robust to time scale modification (TSM). Instead of relying only on short-term information in the MP3 stream, the algorithm extracts long-term features in the MP3 compressed domain using modulation frequency analysis. Experiments demonstrate that the proposed method achieves a hit rate above 95% in audio retrieval and resists TSM attacks of 20%. It also achieves a lower bit error rate (BER) than the other algorithms compared. The proposed algorithm can also be applied in other compressed domains, such as AAC.
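
As a rough illustration of the two ideas named in the abstract, the following Python sketch computes a modulation spectrum (the magnitude DFT of one sub-band's energy trajectory over many frames) and a bit error rate between two binary fingerprints. It is not the authors' implementation: the synthetic energy sequences, the bit-derivation rule in `fingerprint_bits`, and all function names are assumptions made for illustration only. Only the BER definition (the fraction of differing bits) is standard.

```python
import cmath
import math


def modulation_spectrum(energies):
    """Magnitude DFT of one sub-band's energy trajectory, mean removed.

    Long-term structure (how the band energy fluctuates across frames)
    shows up as peaks at the corresponding modulation-frequency bins.
    """
    n = len(energies)
    mean = sum(energies) / n
    centered = [e - mean for e in energies]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(centered))
        spectrum.append(abs(acc))
    return spectrum


def fingerprint_bits(spectrum):
    """Hypothetical bit rule (an assumption, not taken from the paper):
    emit 1 where a modulation bin exceeds its successor, else 0."""
    return [1 if a > b else 0 for a, b in zip(spectrum, spectrum[1:])]


def bit_error_rate(bits_a, bits_b):
    """Fraction of positions at which two equal-length fingerprints differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("fingerprints must have equal length")
    return sum(a != b for a, b in zip(bits_a, bits_b)) / len(bits_a)


# Synthetic stand-in for per-frame sub-band energies; a real system would
# derive these from data in the compressed stream.
frames = 64
original = [1.0 + 0.5 * math.sin(2 * math.pi * 4 * t / frames)
            for t in range(frames)]
# A mildly perturbed copy, standing in for a distorted query.
distorted = [0.9 * e + 0.01 * ((t % 3) - 1) for t, e in enumerate(original)]

spec_original = modulation_spectrum(original)
fp_original = fingerprint_bits(spec_original)
fp_distorted = fingerprint_bits(modulation_spectrum(distorted))
ber = bit_error_rate(fp_original, fp_distorted)
```

In a retrieval setting, a query would typically be matched to the database entry whose fingerprint gives the lowest BER, with a threshold deciding whether the match counts as a hit. That threshold is a design choice and is not specified here.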

Keywords: Audio Fingerprinting, MP3, Modulation Frequency, TSM

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1055321

