Skew Detection Technique for Binary Document Images based on Hough Transform

Authors: Manjunath Aradhya V N, Hemantha Kumar G, Shivakumara P

Abstract:

Document image processing has become an increasingly important technology in the automation of office documentation tasks. During document scanning, skew is inevitably introduced into the incoming document image. Because algorithms for layout analysis and character recognition are generally very sensitive to page skew, skew detection and correction are critical steps before layout analysis. In this paper, a novel skew detection method is presented for binary document images. The method takes some selected characters of the text and subjects them to thinning and the Hough transform to estimate the skew angle accurately. Several experiments have been conducted on various types of documents, such as English documents, journals, textbooks, documents in different languages, documents with different fonts, and documents with different resolutions, to demonstrate the robustness of the proposed method. The experimental results show that the proposed method is accurate compared with the results of well-known existing methods.
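
The Hough-transform step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the foreground pixels have already been selected and thinned, and it only shows how a Hough-style accumulator can vote for a skew angle. The function names (estimate_skew, synthetic_text_lines), the angle range of plus or minus 15 degrees, the 0.5 degree step, and the 1-pixel offset bins are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def estimate_skew(points, max_angle=15.0, step=0.5):
    """Return the candidate angle (in degrees) with the strongest Hough peak.

    points: iterable of (x, y) foreground pixel coordinates, assumed to be
    already thinned. Angles follow the coordinate frame of the points, so
    with image coordinates (y pointing down) a positive angle is clockwise.
    """
    points = list(points)
    if not points:
        raise ValueError("at least one foreground point is required")

    best_angle, best_votes = 0.0, -1
    n_steps = int(round(2 * max_angle / step))
    for i in range(n_steps + 1):
        angle = -max_angle + i * step
        theta = math.radians(angle)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        # Offset of each point from a line of inclination theta through the
        # origin, quantised to 1-pixel bins. Points on the same straight text
        # line share a bin only when theta matches the inclination of that line.
        votes = Counter(round(y * cos_t - x * sin_t) for x, y in points)
        peak = max(votes.values())
        if peak > best_votes:
            best_angle, best_votes = angle, peak
    return best_angle


def synthetic_text_lines(angle_deg, baselines=(20, 60, 100), width=300):
    """Hypothetical test data: pixel rows lying on parallel skewed baselines."""
    slope = math.tan(math.radians(angle_deg))
    return [(x, round(y0 + x * slope)) for y0 in baselines for x in range(width)]


if __name__ == "__main__":
    print(estimate_skew(synthetic_text_lines(5.0)))
```

The sketch scores each candidate angle by its single largest accumulator cell, which is the simplest peak criterion. A practical system would typically restrict the input to a small set of thinned character pixels, as the abstract describes, to keep the number of votes low.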

Keywords: Optical Character Recognition, Skew angle, Thinning, Hough transform, Document processing

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1077860

