A Study of Touching Characters in Degraded Gurmukhi Text

Authors: M. K. Jindal, G. S. Lehal, R. K. Sharma

Abstract:

Character segmentation is an important preprocessing step for text recognition. In degraded documents, touching characters drastically reduce the recognition rate of any optical character recognition (OCR) system. This paper presents a study of touching Gurmukhi characters, which, after careful analysis, are divided into categories defined by the structural properties of Gurmukhi characters. New algorithms are proposed to segment touching characters in the middle zone. These algorithms show a reasonable improvement in segmenting touching characters in degraded Gurmukhi script. The proposed algorithms are applicable only to machine-printed text.

Keywords: Character Segmentation, Middle Zone, Touching Characters.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1081287

