Key Frame Based Video Summarization via Dependency Optimization

Authors: Janya Sainui


With the rapid growth of digital video and data communications, video summarization, which provides a shorter version of a video for fast browsing and retrieval, has become necessary. Key frame extraction is one mechanism for generating a video summary. In general, the extracted key frames should both represent the entire video content and contain minimal redundancy. However, most existing approaches select key frames heuristically; hence, the selected key frames may not be the most distinct frames and may not cover the entire content of the video. In this paper, we propose a video summarization method that provides reasonable objective functions for selecting key frames. In particular, we apply a statistical dependency measure called quadratic mutual information as our objective functions for maximizing coverage of the entire video content while minimizing redundancy among the selected key frames. The proposed algorithm formulates key frame extraction as an optimization problem. Through experiments, we demonstrate that the proposed approach produces video summaries with better coverage of the entire video content and less redundancy among key frames than state-of-the-art approaches.
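To make the dependency measure named above more concrete, the following is a minimal sketch of a generic kernel plug-in estimate of quadratic mutual information, QMI(X, Y) = ∬ (p(x, y) − p(x)p(y))² dx dy, for paired scalar samples. It is not the estimator or the objective used in the paper: the function names `gauss` and `qmi`, the Gaussian kernel density model, the bandwidth `sigma`, and the restriction to one-dimensional features are all assumptions made for illustration.

```python
import math
import random


def gauss(d, sigma):
    """Gaussian density with variance 2*sigma**2, evaluated at difference d.

    This is the closed form of the integral of the product of two
    Gaussian kernels of bandwidth sigma (an assumed density model).
    """
    var2 = 4.0 * sigma * sigma
    return math.exp(-d * d / var2) / math.sqrt(math.pi * var2)


def qmi(xs, ys, sigma=0.5):
    """Plug-in QMI estimate for paired 1-D samples (illustrative only)."""
    n = len(xs)
    K = [[gauss(xs[i] - xs[j], sigma) for j in range(n)] for i in range(n)]
    L = [[gauss(ys[i] - ys[j], sigma) for j in range(n)] for i in range(n)]
    k_row = [sum(row) / n for row in K]  # per-sample mean of K
    l_row = [sum(row) / n for row in L]  # per-sample mean of L
    # Integral of the squared joint density estimate.
    v_joint = sum(K[i][j] * L[i][j] for i in range(n) for j in range(n)) / n**2
    # Integral of the squared product of the marginal estimates.
    v_marg = (sum(k_row) / n) * (sum(l_row) / n)
    # Cross term between the joint and the product of marginals.
    v_cross = sum(k_row[i] * l_row[i] for i in range(n)) / n
    return v_joint + v_marg - 2.0 * v_cross


if __name__ == "__main__":
    random.seed(0)
    xs = [random.gauss(0.0, 1.0) for _ in range(150)]
    dependent = [x + 0.1 * random.gauss(0.0, 1.0) for x in xs]
    independent = [random.gauss(0.0, 1.0) for _ in xs]
    print("dependent pair:  ", qmi(xs, dependent))
    print("independent pair:", qmi(xs, independent))
```

The estimate is larger for strongly dependent samples and close to zero for independent ones. A dependency-based selection objective would presumably reward high dependency between the key frames and the full video (coverage) and low dependency among the key frames themselves (redundancy), but how the paper builds those two objectives from frame features is not specified in this abstract.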

Keywords: Video summarization, key frame extraction, dependency measure, quadratic mutual information, optimization.



