High Quality Speech Coding using Combined Parametric and Perceptual Modules

M. Kulesza; G. Szwoch; A. Czyżewski

Abstract:

A novel approach to speech coding based on a hybrid architecture is presented. It combines the advantages of parametric and perceptual coding methods to create a speech coding algorithm that provides better signal quality than a traditional CELP parametric codec. Two approaches are discussed. The first separates the signal into voiced components, which are encoded with a parametric algorithm, unvoiced components, which are encoded perceptually, and transients, which remain unencoded. The second applies perceptual encoding to the residual signal of a CELP codec. The algorithm used for precise transient selection is described. The signal quality achieved with the proposed hybrid codec is compared with that of several standard speech codecs.

Keywords: CELP residual coding, hybrid codec architecture, perceptual speech coding, speech codecs comparison.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1329392
