Binarization and Recognition of Characters from Historical Degraded Documents
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 87748
Binarization and Recognition of Characters from Historical Degraded Documents

Authors: Bency Jacob, S.B. Waykar

Abstract:

Degradations in historical document images appear due to aging of the documents. It is very difficult to understand and retrieve text from badly degraded documents as there is variation between the document foreground and background. Thresholding of such document images either result in broken characters or detection of false texts. Numerous algorithms exist that can separate text and background efficiently in the textual regions of the document; but portions of background are mistaken as text in areas that hardly contain any text. This paper presents a way to overcome these problems by a robust binarization technique that recovers the text from a severely degraded document images and thereby increases the accuracy of optical character recognition systems. The proposed document recovery algorithm efficiently removes degradations from document images. Here we are using the ostus method ,local thresholding and global thresholding and after the binarization training and recognizing the characters in the degraded documents.

Keywords: binarization, denoising, global thresholding, local thresholding, thresholding

Procedia PDF Downloads 345