Extracting Tongue Shape Dynamics from Magnetic Resonance Image Sequences

María S. Avila-García; John N. Carter; Robert I. Damper

Abstract:

An important problem in speech research is the automatic extraction of information about the shape and dimensions of the vocal tract during real-time speech production. We have previously developed Southampton dynamic magnetic resonance imaging (SDMRI) as an approach to this problem. However, SDMRI images are very noisy, so shape extraction is a major challenge. In this paper, we address the problem of tongue shape extraction, which is difficult because the tongue is a highly deformable, non-parametric shape. We show that combining active shape models with the dynamic Hough transform allows the tongue shape to be tracked reliably through the image sequence.
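To make the role of the active shape model more concrete, the following is a minimal sketch of the statistical shape constraint that such models typically apply: a mean shape plus a few principal modes of variation, with mode weights clipped to a plausible range. It is not the authors' implementation and does not include the dynamic Hough transform. The synthetic parabolic arcs stand in for tongue contours, and the function names `fit_shape_model` and `constrain_shape`, the landmark count, and the noise level are all assumptions made for illustration.

```python
import numpy as np

def fit_shape_model(shapes, n_modes):
    """Fit a linear shape model to aligned landmark vectors.

    shapes: (n_samples, 2 * k) array, each row holding k landmark
    x-coordinates followed by k y-coordinates (assumed pre-aligned).
    """
    mean = shapes.mean(axis=0)
    centered = shapes - mean
    # SVD of the centered data yields the principal modes of variation.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    eigvals = (s ** 2) / (shapes.shape[0] - 1)
    return mean, vt[:n_modes].T, eigvals[:n_modes]

def constrain_shape(shape, mean, modes, eigvals, limit=3.0):
    """Project a candidate shape onto the model and clip each mode
    weight to +/- limit standard deviations."""
    b = modes.T @ (shape - mean)
    bound = limit * np.sqrt(eigvals)
    b = np.clip(b, -bound, bound)
    return mean + modes @ b

# Synthetic stand-in for tongue contours: arcs of varying height.
k = 20
x = np.linspace(-1.0, 1.0, k)

def arc(height):
    return np.concatenate([x, height * (1.0 - x ** 2)])

training = np.stack([arc(h) for h in np.linspace(0.5, 1.5, 30)])
mean, modes, eigvals = fit_shape_model(training, n_modes=1)

# A noisy candidate contour, as might come from a noisy image frame.
rng = np.random.default_rng(0)
clean = arc(1.0)
noisy = clean + rng.normal(scale=0.2, size=clean.shape)
constrained = constrain_shape(noisy, mean, modes, eigvals)

print("error before:", np.linalg.norm(noisy - clean))
print("error after: ", np.linalg.norm(constrained - clean))
```

Because the constrained shape can only vary along the learned modes, most of the landmark noise is discarded. In a tracking setting, a constraint of this kind would be applied to candidate contours in each frame, with temporal evidence gathered separately.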

Keywords: Vocal tract imaging, speech production, active shape models, dynamic Hough transform, object tracking.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1334748

