Combining Color and Layout Features for the Identification of Low-resolution Documents

Authors: Ardhendu Behera, Denis Lalanne, Rolf Ingold


This paper proposes a method that combines color and layout features to identify documents captured with low-resolution handheld devices. On the one hand, the color density surface of the document image is estimated and represented by an equivalent ellipse; on the other hand, the shallow layout structure of the document is computed and represented hierarchically. The combined color and layout features are stored in a symbolic file, unique to each document, called the document's visual signature. The identification method first uses the color information in the signatures to narrow the search space to documents with a similar color distribution, and then selects the document with the most similar layout structure from the remaining candidates. The experiments consider slide documents, which are often captured using handheld devices.
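The two-stage search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the signature format (a `color` vector standing in for the ellipse parameters and a flat list of `layout` bounding boxes), the Euclidean color distance, the IoU-based layout score, and the `keep` parameter are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def color_distance(a, b):
    """Euclidean distance between two color descriptors (assumed metric)."""
    return float(np.linalg.norm(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)))

def box_iou(p, q):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ix = max(0.0, min(p[2], q[2]) - max(p[0], q[0]))
    iy = max(0.0, min(p[3], q[3]) - max(p[1], q[1]))
    inter = ix * iy
    union = (p[2] - p[0]) * (p[3] - p[1]) + (q[2] - q[0]) * (q[3] - q[1]) - inter
    return inter / union if union > 0 else 0.0

def layout_similarity(boxes_a, boxes_b):
    """Mean best-match IoU of each box in boxes_a against boxes_b (assumed score)."""
    if not boxes_a or not boxes_b:
        return 0.0
    return sum(max(box_iou(p, q) for q in boxes_b) for p in boxes_a) / len(boxes_a)

def identify(query, database, keep=2):
    """Stage 1: shortlist the `keep` signatures closest in color.
    Stage 2: return the shortlisted name with the most similar layout."""
    ranked = sorted(database, key=lambda name: color_distance(query["color"], database[name]["color"]))
    shortlist = ranked[:keep]
    return max(shortlist, key=lambda name: layout_similarity(query["layout"], database[name]["layout"]))

# Hypothetical signatures, invented purely for this example.
database = {
    "slide_a": {"color": [0.30, 0.40, 0.10], "layout": [(0, 0, 10, 2), (0, 3, 10, 10)]},
    "slide_b": {"color": [0.31, 0.41, 0.10], "layout": [(0, 0, 5, 5), (5, 5, 10, 10)]},
    "slide_c": {"color": [0.90, 0.10, 0.50], "layout": [(0, 0, 10, 2), (0, 3, 10, 10)]},
}
query = {"color": [0.30, 0.40, 0.11], "layout": [(0, 0, 10, 2), (0, 3, 10, 9)]}
best = identify(query, database)
```

In this toy data, `slide_c` has the same layout as `slide_a` but a very different color descriptor, so the color stage drops it before any layout comparison takes place; the layout stage then chooses between `slide_a` and `slide_b`.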

Keywords: Document color modeling, document visual signature, kernel density estimation, document identification.



