Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5

Optical Character Recognition Related Abstracts

5 An Improved OCR Algorithm on Appearance Recognition of Electronic Components Based on Self-adaptation of Multifont Template

Authors: Tao Lin, Zhu-Qing Jia, Tong Zhou


The recognition method of Optical Character Recognition has been expensively utilized, while it is rare to be employed specifically in recognition of electronic components. This paper suggests a high-effective algorithm on appearance identification of integrated circuit components based on the existing methods of character recognition, and analyze the pros and cons.

Keywords: Optical Character Recognition, fuzzy page identification, mutual correlation matrix, confidence self-adaptation

Procedia PDF Downloads 351
4 Kannada HandWritten Character Recognition by Edge Hinge and Edge Distribution Techniques Using Manhatan and Minimum Distance Classifiers

Authors: C. V. Aravinda, H. N. Prakash


In this paper, we tried to convey fusion and state of art pertaining to SIL character recognition systems. In the first step, the text is preprocessed and normalized to perform the text identification correctly. The second step involves extracting relevant and informative features. The third step implements the classification decision. The three stages which involved are Data acquisition and preprocessing, Feature extraction, and Classification. Here we concentrated on two techniques to obtain features, Feature Extraction & Feature Selection. Edge-hinge distribution is a feature that characterizes the changes in direction of a script stroke in handwritten text. The edge-hinge distribution is extracted by means of a windowpane that is slid over an edge-detected binary handwriting image. Whenever the mid pixel of the window is on, the two edge fragments (i.e. connected sequences of pixels) emerging from this mid pixel are measured. Their directions are measured and stored as pairs. A joint probability distribution is obtained from a large sample of such pairs. Despite continuous effort, handwriting identification remains a challenging issue, due to different approaches use different varieties of features, having different. Therefore, our study will focus on handwriting recognition based on feature selection to simplify features extracting task, optimize classification system complexity, reduce running time and improve the classification accuracy.

Keywords: Optical Character Recognition, Character Recognition, word segmentation and recognition, hand written character recognition, South Indian languages

Procedia PDF Downloads 320
3 Distorted Document Images Dataset for Text Detection and Recognition

Authors: Ilia Zharikov, Philipp Nikitin, Ilia Vasiliev, Vladimir Dokholyan


With the increasing popularity of document analysis and recognition systems, text detection (TD) and optical character recognition (OCR) in document images become challenging tasks. However, according to our best knowledge, no publicly available datasets for these particular problems exist. In this paper, we introduce a Distorted Document Images dataset (DDI-100) and provide a detailed analysis of the DDI-100 in its current state. To create the dataset we collected 7000 unique document pages, and extend it by applying different types of distortions and geometric transformations. In total, DDI-100 contains more than 100,000 document images together with binary text masks, text and character locations in terms of bounding boxes. We also present an analysis of several state-of-the-art TD and OCR approaches on the presented dataset. Lastly, we demonstrate the usefulness of DDI-100 to improve accuracy and stability of the considered TD and OCR models.

Keywords: Optical Character Recognition, Document Analysis, text detection, open dataset

Procedia PDF Downloads 15
2 Non-Invasive Data Extraction from Machine Display Units Using Video Analytics

Authors: Ravneet Kaur, Joydeep Acharya, Sudhanshu Gaur


Artificial Intelligence (AI) has the potential to transform manufacturing by improving shop floor processes such as production, maintenance and quality. However, industrial datasets are notoriously difficult to extract in a real-time, streaming fashion thus, negating potential AI benefits. The main example is some specialized industrial controllers that are operated by custom software which complicates the process of connecting them to an Information Technology (IT) based data acquisition network. Security concerns may also limit direct physical access to these controllers for data acquisition. To connect the Operational Technology (OT) data stored in these controllers to an AI application in a secure, reliable and available way, we propose a novel Industrial IoT (IIoT) solution in this paper. In this solution, we demonstrate how video cameras can be installed in a factory shop floor to continuously obtain images of the controller HMIs. We propose image pre-processing to segment the HMI into regions of streaming data and regions of fixed meta-data. We then evaluate the performance of multiple Optical Character Recognition (OCR) technologies such as Tesseract and Google vision to recognize the streaming data and test it for typical factory HMIs and realistic lighting conditions. Finally, we use the meta-data to match the OCR output with the temporal, domain-dependent context of the data to improve the accuracy of the output. Our IIoT solution enables reliable and efficient data extraction which will improve the performance of subsequent AI applications.

Keywords: Internet of Things, Human Machine Interface, Optical Character Recognition, Industrial Internet of Things, video analytics

Procedia PDF Downloads 12
1 Using Optical Character Recognition to Manage the Unstructured Disaster Data into Smart Disaster Management System

Authors: Dong Seop Lee, Byung Sik Kim


In the 4th Industrial Revolution, various intelligent technologies have been developed in many fields. These artificial intelligence technologies are applied in various services, including disaster management. Disaster information management does not just support disaster work, but it is also the foundation of smart disaster management. Furthermore, it gets historical disaster information using artificial intelligence technology. Disaster information is one of important elements of entire disaster cycle. Disaster information management refers to the act of managing and processing electronic data about disaster cycle from its’ occurrence to progress, response, and plan. However, information about status control, response, recovery from natural and social disaster events, etc. is mainly managed in the structured and unstructured form of reports. Those exist as handouts or hard-copies of reports. Such unstructured form of data is often lost or destroyed due to inefficient management. It is necessary to manage unstructured data for disaster information. In this paper, the Optical Character Recognition approach is used to convert handout, hard-copies, images or reports, which is printed or generated by scanners, etc. into electronic documents. Following that, the converted disaster data is organized into the disaster code system as disaster information. Those data are stored in the disaster database system. Gathering and creating disaster information based on Optical Character Recognition for unstructured data is important element as realm of the smart disaster management. In this paper, Korean characters were improved to over 90% character recognition rate by using upgraded OCR. In the case of character recognition, the recognition rate depends on the fonts, size, and special symbols of character. We improved it through the machine learning algorithm. These converted structured data is managed in a standardized disaster information form connected with the disaster code system. The disaster code system is covered that the structured information is stored and retrieve on entire disaster cycle such as historical disaster progress, damages, response, and recovery. The expected effect of this research will be able to apply it to smart disaster management and decision making by combining artificial intelligence technologies and historical big data.

Keywords: Machine Learning, Optical Character Recognition, Unstructured data, disaster information management

Procedia PDF Downloads 1