Key Frames Extraction for Sign Language Video Analysis and Recognition

Authors: Jaroslav Polec, Petra Heribanová, Tomáš Hirner


In this paper we propose a method for finding the video frames that represent a single sign of the finger alphabet. The method is based on hand localization, segmentation, and the use of standard video quality evaluation metrics. The metrics are calculated only in the regions of interest. A sliding mechanism for finding local extrema and an adaptive threshold based on local averaging are used for key frame selection. The success rate is evaluated by recall, precision, and the F1 measure. The effectiveness of the method is compared with that of the metrics applied to all frames. The proposed method is fast, effective, and relatively easy to implement: it requires only simple preprocessing of the input video followed by the use of tools designed for video quality measurement.

Keywords: video, quality, sign language, MSE, metric, key frame, MSAD, SSIM, VQM, finger alphabet
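
The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes grayscale frames stored as nested lists, a rectangular region of interest given as (top, left, bottom, right), MSE as the metric, and key frames taken as local minima of the inter-frame metric that fall below the local average. The function names, the `half_window` parameter, and the choice of minima rather than maxima are all hypothetical.

```python
def mse_roi(frame_a, frame_b, roi):
    """Mean squared error between two grayscale frames, inside roi only.

    roi = (top, left, bottom, right), with bottom/right exclusive (assumed layout).
    """
    top, left, bottom, right = roi
    total = 0.0
    count = 0
    for y in range(top, bottom):
        for x in range(left, right):
            diff = frame_a[y][x] - frame_b[y][x]
            total += diff * diff
            count += 1
    return total / count


def select_key_frames(scores, half_window=2):
    """Return indices that are local minima of scores and below the local mean.

    The local mean over the sliding window plays the role of the adaptive
    threshold; using minima is an assumption of this sketch.
    """
    keys = []
    for i in range(len(scores)):
        lo = max(0, i - half_window)
        hi = min(len(scores), i + half_window + 1)
        window = scores[lo:hi]
        threshold = sum(window) / len(window)
        if scores[i] == min(window) and scores[i] < threshold:
            keys.append(i)
    return keys


def precision_recall_f1(detected, reference):
    """Precision, recall and F1 for detected vs. reference key frame indices."""
    detected, reference = set(detected), set(reference)
    true_pos = len(detected & reference)
    precision = true_pos / len(detected) if detected else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy inter-frame scores: low values mark frames where the hand is nearly still.
scores = [9.0, 7.0, 1.0, 6.0, 8.0, 9.0, 2.0, 7.0, 9.0]
keys = select_key_frames(scores)
print(keys, precision_recall_f1(keys, [2, 5]))
```

In practice the scores would come from calling `mse_roi` on consecutive frames, and any of the other metrics named in the keywords could be substituted for MSE without changing the selection step.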

