Using Different Aspects of the Signings for Appearance-based Sign Language Recognition

Authors: Morteza Zahedi, Philippe Dreuw, Thomas Deselaers, Hermann Ney

Abstract:

Sign language is used by deaf and hard of hearing people for communication. Automatic sign language recognition is a challenging research area, and sign language is often the only means of communication for deaf people. Sign language comprises different components of visual actions that the signer makes with the hands, the face, and the torso to convey meaning. To use different aspects of signs, we combine different groups of features extracted from image frames recorded directly by a stationary camera. We combine the features at two levels, using three techniques. At the feature level, an early feature combination can be performed either by concatenating and weighting different feature groups, or by concatenating feature groups over time and using LDA to choose the most discriminant elements. At the model level, a late fusion of differently trained models can be carried out by a log-linear model combination. In this paper, we investigate these three combination techniques in an automatic sign language recognition system and show that the recognition rate can be significantly improved.
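The three combination techniques named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: all function names are hypothetical, the features are plain Python lists, and the LDA projection matrix is assumed to have been estimated elsewhere (here it is just an arbitrary matrix passed in).

```python
import math


def weighted_concat(groups, weights):
    """Early fusion (a): scale each feature group by its weight, then concatenate."""
    if len(groups) != len(weights):
        raise ValueError("need exactly one weight per feature group")
    fused = []
    for group, weight in zip(groups, weights):
        fused.extend(weight * x for x in group)
    return fused


def stack_over_time(frames, t, context):
    """Early fusion (b), step 1: concatenate frames t-context .. t+context.

    Frames beyond the sequence borders are replaced by the nearest edge frame.
    """
    last = len(frames) - 1
    window = []
    for k in range(t - context, t + context + 1):
        window.extend(frames[min(max(k, 0), last)])
    return window


def project(matrix, vector):
    """Early fusion (b), step 2: apply a linear projection to the stacked vector.

    In an LDA setting `matrix` would hold the most discriminant directions;
    estimating it is outside the scope of this sketch.
    """
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def log_linear_combine(log_scores_per_model, model_weights):
    """Late fusion: weighted sum of per-model log-scores, renormalised over classes."""
    classes = list(log_scores_per_model[0])
    combined = {
        c: sum(w * scores[c] for w, scores in zip(model_weights, log_scores_per_model))
        for c in classes
    }
    # Log-sum-exp for a numerically stable normalisation constant.
    peak = max(combined.values())
    log_z = peak + math.log(sum(math.exp(v - peak) for v in combined.values()))
    return {c: math.exp(v - log_z) for c, v in combined.items()}
```

In this sketch the group weights and model weights are free parameters; how they are tuned is not stated in the abstract and would typically be decided on held-out data.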

Keywords: American Sign Language, appearance-based features, feature combination, sign language recognition

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1055813

