Key Frames Extraction for Sign Language Video Analysis and Recognition

Jaroslav Polec; Petra Heribanová; Tomáš Hirner

Abstract:

In this paper we propose a method for finding the video frames that represent a single sign of the finger alphabet. The method is based on locating the hands, segmentation, and the use of standard video quality evaluation metrics. The metrics are calculated only in regions of interest. A sliding mechanism for finding local extrema and an adaptive threshold based on local averaging are used for key frame selection. The success rate is evaluated by recall, precision, and the F1 measure. The effectiveness of the method is compared with that of metrics applied to all frames. The proposed method is fast, effective, and relatively easy to implement: it requires only simple preprocessing of the input video, followed by the use of tools designed for video quality measurement.
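
The selection and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the metric is the MSE between consecutive grayscale frames inside a rectangular region of interest, key frames are taken as local minima of that score (a held hand shape changes little between frames), and the adaptive threshold is a multiple of the mean score inside the sliding window. The names `roi_mse`, `select_key_frames`, `precision_recall_f1`, `half_window` and `factor` are hypothetical.

```python
from statistics import mean


def roi_mse(frame_a, frame_b, roi):
    """MSE between two grayscale frames (lists of rows) inside
    roi = (top, left, bottom, right), with bottom and right exclusive."""
    top, left, bottom, right = roi
    total, count = 0.0, 0
    for y in range(top, bottom):
        for x in range(left, right):
            diff = frame_a[y][x] - frame_b[y][x]
            total += diff * diff
            count += 1
    return total / count


def select_key_frames(scores, half_window=2, factor=1.0):
    """Return indices i where scores[i] is the first minimum of the sliding
    window around i and lies below factor * (mean of that window).
    Assumption: a low inter-frame difference marks a held sign."""
    keys = []
    for i, score in enumerate(scores):
        lo = max(0, i - half_window)
        hi = min(len(scores), i + half_window + 1)
        window = scores[lo:hi]
        # Taking the first occurrence of the minimum avoids reporting
        # the same plateau more than once.
        is_local_min = lo + window.index(min(window)) == i
        if is_local_min and score < factor * mean(window):
            keys.append(i)
    return keys


def precision_recall_f1(detected, reference):
    """Precision, recall and F1 of detected key frame indices against
    a manually annotated reference set."""
    true_pos = len(set(detected) & set(reference))
    precision = true_pos / len(detected) if detected else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up per-frame difference scores, for illustration only.
    scores = [9, 8, 1, 7, 9, 10, 2, 8, 9]
    keys = select_key_frames(scores, half_window=2)
    print(keys)
    print(precision_recall_f1(keys, [2, 6]))
```

In this sketch `half_window` controls how far apart two key frames must be, and `factor` scales the local average; both would have to be tuned to the frame rate and signing speed of the actual video.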

Keywords: Key frame, video, quality, metric, MSE, MSAD, SSIM, VQM, sign language, finger alphabet.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1057681
