Real-Time Vision-based Korean Finger Spelling Recognition System

Authors: Anjin Park, Sungju Yun, Jungwhan Kim, Seungk Min, Keechul Jung

Abstract:

Finger spelling is the art of communicating by signs made with the fingers, and it has been introduced into sign language to serve as a bridge between sign language and verbal language. Previous approaches to finger spelling recognition fall into two categories: glove-based and vision-based. The glove-based approach recognizes hand posture more simply and accurately than the vision-based approach, yet its interfaces require the user to wear a cumbersome device and carry a load of cables that connect the device to a computer. In contrast, vision-based approaches provide an attractive alternative to this cumbersome interface and promise more natural and unobtrusive human-computer interaction. Vision-based approaches generally consist of two steps, hand extraction and recognition, and the two steps are processed independently. This paper proposes a real-time vision-based Korean finger spelling recognition system that integrates hand extraction into recognition. First, we tentatively detect a hand region using the CAMShift algorithm. Then the fill factor and aspect ratio, estimated from the width and height given by CAMShift, are used to choose candidates from the database, which reduces the number of matching operations in the recognition step. To recognize the finger spelling, we use DTW (dynamic time warping) based on modified chain codes, so as to be robust to scale and orientation variations. Because this procedure requires accurate hand regions, without holes or noise, to improve precision, we use a graph cuts algorithm that globally minimizes an energy function elegantly expressed by Markov random fields (MRFs). In the experiments, the computational time is less than 130 ms and does not depend on the number of finger spelling templates in the database, because candidate templates are selected in the extraction step.
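
The candidate selection and matching steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions: the abstract does not define the modified chain codes, so plain 8-direction Freeman codes with a circular distance stand in for them; the fill factor is taken to be hand pixels divided by bounding-box area; and the `Template` class, the function names, and the tolerance `tol` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Template:
    label: str
    fill_factor: float   # assumed: hand pixels / bounding-box area
    aspect_ratio: float  # assumed: bounding-box width / height
    chain_code: list     # 8-direction Freeman codes (0..7), a stand-in for the paper's modified codes


def select_candidates(templates, fill_factor, aspect_ratio, tol=0.15):
    """Keep templates whose coarse features lie within tol (hypothetical threshold)."""
    return [
        t for t in templates
        if abs(t.fill_factor - fill_factor) <= tol
        and abs(t.aspect_ratio - aspect_ratio) <= tol
    ]


def code_distance(a, b):
    """Circular distance between two 8-direction codes, in the range 0..4."""
    d = abs(a - b) % 8
    return min(d, 8 - d)


def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping cost between two chain-code sequences."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = code_distance(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(templates, fill_factor, aspect_ratio, chain_code):
    """Return the label of the closest candidate, or None if no template passes the filter."""
    candidates = select_candidates(templates, fill_factor, aspect_ratio)
    if not candidates:
        return None
    best = min(candidates, key=lambda t: dtw_distance(chain_code, t.chain_code))
    return best.label
```

In this sketch the DTW comparison runs only on templates that pass the coarse filter, which mirrors the abstract's point that candidate selection during extraction reduces the number of matching operations in recognition.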

Keywords: CAMShift, DTW, Graph Cuts, MRF.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1332356

