Search results for: automated document processing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4921

4891 Video Processing of a Football Game: Detecting Features of a Football Match for Automated Calculation of Statistics

Authors: Rishabh Beri, Sahil Shah

Abstract:

We applied a range of filters and processing steps to extract features of a football game, such as the lines of the football field. Another important aspect was detecting the players on the field and tagging them by team according to their jersey colours. Combining the extracted information about the players and the field allowed us to create a virtual field that consists of the playing field and the players mapped to their locations on it.

Keywords: Detect, Football, Players, Virtual

Procedia PDF Downloads 304
4890 Application of Signature Verification Models for Document Recognition

Authors: Boris M. Fedorov, Liudmila P. Goncharenko, Sergey A. Sybachin, Natalia A. Mamedova, Ekaterina V. Makarenkova, Saule Rakhimova

Abstract:

In modern economic conditions, the question of whether a signature on digital documents can be correctly recognized, in order to verify an expression of will or confirm a certain operation, is relevant. Processing is further complicated by the dynamic variability of each individual's signature and by the way the information must be handled, because a signature is biometric data. The article discusses the use of artificial intelligence models to improve the quality of signature confirmation in document recognition. Several possible options for using the model are analyzed. The results of the study are given, showing that the authenticity of a signature can be correctly determined on small samples.

Keywords: signature recognition, biometric data, artificial intelligence, neural networks

Procedia PDF Downloads 119
4889 DesignChain: Automated Design of Products Featuring a Large Number of Variants

Authors: Lars Rödel, Jonas Krebs, Gregor Müller

Abstract:

The growing price pressure due to the increasing number of global suppliers, the growing individualization of products and ever-shorter delivery times are upcoming challenges in the industry. In this context, Mass Personalization stands for the individualized production of customer products in batch size 1 at the price of standardized products. The possibilities of digitalization and automation of technical order processing open up the opportunity for companies to significantly reduce their cost of complexity and lead times and thus enhance their competitiveness. Many companies already use a range of CAx tools and configuration solutions today. Often, the expert knowledge of employees is hidden in "knowledge silos" and is rarely networked across processes. DesignChain describes the automated digital process from the recording of individual customer requirements, through design and technical preparation, to production. Configurators offer the possibility of mapping variant-rich products within the Design Chain. This transformation of customer requirements into product features makes it possible to generate even complex CAD models, such as those for large-scale plants, on a rule-based basis. With the aid of an automated CAx chain, production-relevant documents are thus transferred digitally to production. This process, which can be fully automated, allows variants to always be generated on the basis of current version statuses.

Keywords: automation, design, CAD, CAx

Procedia PDF Downloads 51
4888 A Simplified Model of the Control System with PFM

Authors: Bekmurza H. Aitchanov, Sholpan K. Aitchanova, Olimzhon A. Baimuratov, Aitkul N. Aldibekova

Abstract:

This work considers the automated control system (ACS) of milk quality during its magnetic field processing. To achieve a high level of quality control, methods were applied that transform complex nonlinear systems into a linearized system with a less complex structure. The presented ACS is adjustable by seven parameters: mass fraction of fat, mass fraction of dry skim milk residues (DSMR), density, mass fraction of added water, temperature, mass fraction of protein, and acidity.

Keywords: fluids magnetization, nuclear magnetic resonance, automated control system, dynamic pulse-frequency modulator, PFM, nonlinear systems, structural model

Procedia PDF Downloads 350
4887 Analysis of Operation System Reorganization for Load Balancing of Parcel Sorting

Authors: J. H. Lee

Abstract:

As internet and smartphone use increases, e-commerce is constantly growing, and parcel volume therefore increases every year. If the volume received exceeds the processing capacity of the current facilities, parcels go unprocessed and delivery quality falls. In this paper, therefore, we compare, from a cost perspective, the case of building a new facility for the increasing parcel volumes with the case of reorganizing the current operating system. We propose the optimal discount policy per parcel by calculating the construction cost of a new automated facility, the cost of manual facilities until the new automated facility is built, and the discount price.

Keywords: system reorganization, load balancing, parcel sorting, discount policy

Procedia PDF Downloads 239
4886 Knowledge Based Automated Software Engineering Platform Used for the Development of Bulgarian E-Customs

Authors: Ivan Stanev, Maria Koleva

Abstract:

Described are challenges to the Bulgarian e-Customs (BeC) related to low level of interoperability and standardization, inefficient use of available infrastructure, lack of centralized identification and authorization, extremely low level of software process automation, and insufficient quality of data stored in official registers. The technical requirements for BeC are prepared with a focus on domain independent common platform, specialized customs and excise components, high scalability, flexibility, and reusability. The Knowledge Based Automated Software Engineering (KBASE) Common Platform for Automated Programming (CPAP) is selected as an instrument covering BeC requirements for standardization, programming automation, knowledge interpretation and cloud computing. BeC stage 3 results are presented and analyzed. BeC.S3 development trends are identified.

Keywords: service oriented architecture, cloud computing, knowledge based automated software engineering, common platform for automated programming, e-customs

Procedia PDF Downloads 338
4885 Design and Field Programmable Gate Array Implementation of Radio Frequency Identification for Boosting up Tag Data Processing

Authors: G. Rajeshwari, V. D. M. Jabez Daniel

Abstract:

Radio Frequency Identification (RFID) systems are used for automated identification in various applications such as automobiles, health care and security. RFID is also called an automated data collection technology. RFID readers are placed in an area to scan a large number of tags over a wide distance. The placement of the RFID elements may result in several types of collisions. A major challenge in an RFID system is collision avoidance. In previous works, collisions were avoided by using algorithms such as ALOHA and the tree algorithm. This work proposes collision reduction and increased throughput through a reading enhancement method with the tree algorithm. The reading enhancement is achieved by improving the interrogation procedure and increasing the data handling capacity of the RFID reader with parallel processing. The work is simulated in the Verilog language using Xilinx ISE 14.5. By implementing this in the RFID system, we are able to achieve high throughput and avoid collisions in the reader at the same instant of time, which increases the overall system efficiency.

Keywords: antenna, anti-collision protocols, data management system, reader, reading enhancement, tag

Procedia PDF Downloads 268
4884 Automated Java Testing: JUnit versus AspectJ

Authors: Manish Jain, Dinesh Gopalani

Abstract:

The growing dependency of mankind on software technology increases the need for thorough testing of software applications and for automated testing techniques that support testing activities. We have outlined our testing strategy for performing various types of automated testing of Java applications using AspectJ, which has become the de facto standard for Aspect Oriented Programming (AOP). Likewise, JUnit, a unit testing framework, is the most popular Java testing tool. In this paper, we have evaluated our proposed AOP approach for automated testing and JUnit on various parameters. First, we describe the similarities between the two approaches, and then we compare the two testing techniques in detail on factors such as lines of testing code, learning curve, and testing of private members. We established that our AOP testing approach using AspectJ has several advantages and is thus more effective than JUnit.

Keywords: aspect oriented programming, AspectJ, aspects, JUnit, software testing

Procedia PDF Downloads 299
4883 Airport Pavement Crack Measurement Systems and Crack Density for Pavement Evaluation

Authors: Ali Ashtiani, Hamid Shirazi

Abstract:

This paper reviews the status of existing practice and research related to measuring pavement cracking and using crack density as a pavement surface evaluation protocol. Crack density for pavement evaluation is currently not widely used within the airport community and its use by the highway community is limited. However, surface cracking is a distress that is closely monitored by airport staff and significantly influences the development of maintenance, rehabilitation and reconstruction plans for airport pavements. Therefore crack density has the potential to become an important indicator of pavement condition if the type, severity and extent of surface cracking can be accurately measured. A pavement distress survey is an essential component of any pavement assessment. Manual crack surveying has been widely used for decades to measure pavement performance. However, the accuracy and precision of manual surveys can vary depending upon the surveyor and performing surveys may disrupt normal operations. Given the variability of manual surveys, this method has shown inconsistencies in distress classification and measurement. This can potentially impact the planning for pavement maintenance, rehabilitation and reconstruction and the associated funding strategies. A substantial effort has been devoted for the past 20 years to reduce the human intervention and the error associated with it by moving toward automated distress collection methods. The automated methods refer to the systems that identify, classify and quantify pavement distresses through processes that require no or very minimal human intervention. This principally involves the use of a digital recognition software to analyze and characterize pavement distresses. The lack of established protocols for measurement and classification of pavement cracks captured using digital images is a challenge to developing a reliable automated system for distress assessment. 
Variations in types and severity of distresses, different pavement surface textures and colors and presence of pavement joints and edges all complicate automated image processing and crack measurement and classification. This paper summarizes the commercially available systems and technologies for automated pavement distress evaluation. A comprehensive automated pavement distress survey involves collection, interpretation, and processing of the surface images to identify the type, quantity and severity of the surface distresses. The outputs can be used to quantitatively calculate the crack density. The systems for automated distress survey using digital images reviewed in this paper can assist the airport industry in the development of a pavement evaluation protocol based on crack density. Analysis of automated distress survey data can lead to a crack density index. This index can be used as a means of assessing pavement condition and to predict pavement performance. This can be used by airport owners to determine the type of pavement maintenance and rehabilitation in a more consistent way.

Keywords: airport pavement management, crack density, pavement evaluation, pavement management

Procedia PDF Downloads 167
4882 A Proposed Approach for Emotion Lexicon Enrichment

Authors: Amr Mansour Mohsen, Hesham Ahmed Hassan, Amira M. Idrees

Abstract:

Document analysis is an important research field that aims to gather information by analyzing the data in documents. Because an important goal in many fields is to understand what people actually want, sentiment analysis has become one of the vital fields tightly related to document analysis. This research focuses on analyzing text documents to classify each document according to its opinion. The aim of this research is to detect emotions in text documents by enriching the lexicon and adapting its content based on the extraction of semantic patterns. The proposed approach is presented, and experiments from different perspectives are applied to reveal the positive impact of the proposed approach on the classification results.

Keywords: document analysis, sentimental analysis, emotion detection, WEKA tool, NRC lexicon

Procedia PDF Downloads 400
4881 Business and Psychological Principles Integrated into Automated Capital Investment Systems through Mathematical Algorithms

Authors: Cristian Pauna

Abstract:

With 2020 only a few steps away, investing in financial markets is a common activity. In the electronic trading environment, automated investment software has become a major part of the business intelligence system of any modern financial company. Investment decisions are today assisted and/or made automatically by computers using mathematical algorithms. The complexity of these algorithms requires computer assistance in the investment process. This paper presents several investment strategies that can be automated with algorithmic trading for the Deutscher Aktienindex DAX30. It was found that, based on several price action mathematical models used for high-frequency trading, some investment strategies can be optimized and improved for automated investments with good results. This paper presents the way to automate these investment decisions. Automated signals will be built using all of these strategies. Three major types of investment strategies were found in this study. The types are separated by the target length and by the exit strategy used. The exit decisions will also be automated, and the paper presents the specifics of each investment type. A comparative study is also included in order to reveal the differences between strategies. Based on these results, the profit and the capital exposure will be compared and analyzed in order to qualify the investment methodologies presented and to compare them with any other investment system. In conclusion, some major investment strategies will be revealed and compared in order to be considered for inclusion in any automated investment system.

Keywords: Algorithmic trading, automated investment systems, limit conditions, trading principles, trading strategies

Procedia PDF Downloads 164
4880 Resume Ranking Using Custom Word2vec and Rule-Based Natural Language Processing Techniques

Authors: Subodh Chandra Shakya, Rajendra Sapkota, Aakash Tamang, Shushant Pudasaini, Sujan Adhikari, Sajjan Adhikari

Abstract:

Considerable effort has gone into measuring the semantic similarity between text corpora in documents, and techniques have evolved to measure the similarity of two documents. One such state-of-the-art technique in the field of Natural Language Processing (NLP) is the word-to-vector model, which converts words into their word embeddings and measures the similarity between the vectors. We found this to be quite useful for the task of resume ranking. This research paper therefore implements the word2vec model along with other Natural Language Processing techniques to rank resumes for a particular job description so as to automate the hiring process. The paper proposes the system and reports the findings made while building it.
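
The ranking idea in this abstract, comparing embedding vectors of a job description and each resume, can be illustrated with a minimal sketch. It is not the authors' system: the two-dimensional toy embeddings, the averaging of word vectors into a document vector, and the names `average_vector`, `cosine_similarity` and `rank_resumes` are all assumptions made for illustration. A real setup would load vectors from a trained word2vec model.

```python
import math

def average_vector(tokens, embeddings):
    """Average the vectors of all known tokens into one document vector."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no token has an embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_resumes(job_tokens, resumes, embeddings):
    """Return (resume_name, score) pairs, most similar to the job first."""
    job_vec = average_vector(job_tokens, embeddings)
    scored = [
        (name, cosine_similarity(job_vec, average_vector(tokens, embeddings)))
        for name, tokens in resumes.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy embeddings (a trained model would supply these).
toy_embeddings = {
    "python": [1.0, 0.0],
    "ml": [0.9, 0.1],
    "cooking": [0.0, 1.0],
    "baking": [0.1, 0.9],
}
toy_resumes = {
    "resume_a": ["python", "ml"],
    "resume_b": ["cooking", "baking"],
}
ranking = rank_resumes(["python", "ml"], toy_resumes, toy_embeddings)
print(ranking[0][0])  # resume_a
```

Averaging word vectors is only one simple way to get a document vector; the abstract also mentions rule-based NLP steps (chunking, information extraction) that this sketch leaves out.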

Keywords: chunking, document similarity, information extraction, natural language processing, word2vec, word embedding

Procedia PDF Downloads 131
4879 Enhancing the Recruitment Process through Machine Learning: An Automated CV Screening System

Authors: Kaoutar Ben Azzou, Hanaa Talei

Abstract:

Human resources is an important department in each organization as it manages the life cycle of employees from recruitment training to retirement or termination of contracts. The recruitment process starts with a job opening, followed by a selection of the best-fit candidates from all applicants. Matching the best profile for a job position requires a manual way of looking at many CVs, which requires hours of work that can sometimes lead to choosing not the best profile. The work presented in this paper aims at reducing the workload of HR personnel by automating the preliminary stages of the candidate screening process, thereby fostering a more streamlined recruitment workflow. This tool introduces an automated system designed to help with the recruitment process by scanning candidates' CVs, extracting pertinent features, and employing machine learning algorithms to decide the most fitting job profile for each candidate. Our work employs natural language processing (NLP) techniques to identify and extract key features from unstructured text extracted from a CV, such as education, work experience, and skills. Subsequently, the system utilizes these features to match candidates with job profiles, leveraging the power of classification algorithms.

Keywords: automated recruitment, candidate screening, machine learning, human resources management

Procedia PDF Downloads 29
4878 Clinical Validation of an Automated Natural Language Processing Algorithm for Finding COVID-19 Symptoms and Complications in Patient Notes

Authors: Karolina Wieczorek, Sophie Wiliams

Abstract:

Introduction: Patient data is often collected in Electronic Health Record (EHR) systems for purposes such as providing care as well as reporting data. This information can be re-used to validate data models in clinical trials or in epidemiological studies. Manual validation of automated tools is vital to pick up errors in processing and to provide confidence in the output. Mentioning a disease in a discharge letter does not necessarily mean that a patient suffers from this disease. Many letters discuss a diagnostic process or different tests, or discuss whether a patient has a certain disease. The COVID-19 dataset in this study used natural language processing (NLP), an automated algorithm which extracts information related to COVID-19 symptoms, complications, and medications prescribed within the hospital. Free-text clinical patient notes are rich sources of information which contain patient data not captured in a structured form, hence the use of named entity recognition (NER) to capture additional information. Methods: Patient data (discharge summary letters) were exported and screened by an algorithm to pick up relevant terms related to COVID-19. A list of 124 Systematized Nomenclature of Medicine (SNOMED) Clinical Terms was provided in Excel with corresponding IDs. Two independent medical student researchers were provided with a dictionary of the SNOMED terms to refer to when screening the notes. They worked on two separate datasets called "A" and "B", respectively. Notes were screened to check that the correct term had been picked up by the algorithm and to ensure that negated terms were not picked up. Results: Its implementation in the hospital began on March 31, 2020, and the first EHR-derived extract was generated for use in an audit study on June 04, 2020.
The dataset has contributed to large, priority clinical trials (including the International Severe Acute Respiratory and Emerging Infection Consortium (ISARIC), by bulk upload to REDCap research databases) and to local research and audit studies. Successful sharing of EHR-extracted datasets requires communicating the provenance and quality of the data, including its completeness and accuracy. The results of the validation of the algorithm were the following: precision (0.907), recall (0.416), and F-score (0.570). Percentage enhancement with NLP-extracted terms compared to regular data extraction alone was low (0.3%) for relatively well-documented data such as previous medical history but higher (16.6%, 29.53%, 30.3%, 45.1%) for complications, presenting illness, chronic procedures, and acute procedures, respectively. Conclusions: This automated NLP algorithm is shown to be useful in facilitating patient data analysis and has the potential to be used in more large-scale clinical trials to assess potential study exclusion criteria for participants in the development of vaccines.
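
The three validation figures in this abstract are consistent with each other if the reported F-score is the balanced F1 measure, the harmonic mean of precision and recall. That is an assumption here, since the abstract does not name the variant. A short check, with `f1_score` as an illustrative helper name:

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract.
reported_precision = 0.907
reported_recall = 0.416

f1 = f1_score(reported_precision, reported_recall)
print(f"{f1:.3f}")  # 0.570
```

The high precision and low recall mean that the terms the algorithm picks up are usually correct, but it misses more than half of the terms the manual reviewers found.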

Keywords: automated, algorithm, NLP, COVID-19

Procedia PDF Downloads 71
4877 Improving the Performance of Requisition Document Online System for Royal Thai Army by Using Time Series Model

Authors: D. Prangchumpol

Abstract:

This research presents a method for forecasting requisition document demand for military units, using exponential smoothing methods to analyze the data. The data used in the forecast are actual requisition document data from The Adjutant General Department. The forecasting results show that Holt–Winters’ trend and seasonality method with α=0.1, β=0, γ=0 is appropriate for the requisition of documents. In addition, the researcher developed an online requisition system to improve the performance of document requisition at The Adjutant General Department and to ensure that the operation can be checked.
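
To show what the parameters α, β and γ control, here is a minimal sketch of additive Holt-Winters smoothing. It is one common textbook formulation with a deliberately simple initialisation, not the author's implementation. The function name, the season length and the demand figures are hypothetical. With β=0 and γ=0, as reported in the abstract, the trend and seasonal components stay at their initial estimates and only the level is updated, with weight α=0.1.

```python
def holt_winters_additive(series, season_length, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing.

    Simplified initialisation (an assumption of this sketch): the level is
    the mean of the first season, the trend is the average per-step change
    between the first two seasons, and the seasonal indices are deviations
    of the first season from its mean.
    """
    m = season_length
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    level = sum(series[:m]) / m
    trend = (sum(series[m:2 * m]) - sum(series[:m])) / (m * m)
    seasonal = [series[i] - level for i in range(m)]

    for t, y in enumerate(series):
        s = seasonal[t % m]
        prev_level = level
        # alpha: weight of the newest deseasonalised observation in the level
        level = alpha * (y - s) + (1 - alpha) * (prev_level + trend)
        # beta: weight of the newest level change in the trend
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # gamma: weight of the newest seasonal deviation
        seasonal[t % m] = gamma * (y - level) + (1 - gamma) * s

    n = len(series)
    return [
        level + h * trend + seasonal[(n - 1 + h) % m]
        for h in range(1, horizon + 1)
    ]

# Hypothetical quarterly demand with a repeating seasonal pattern.
demand = [1.0, 2.0, 3.0, 4.0] * 3
forecast = holt_winters_additive(
    demand, season_length=4, alpha=0.1, beta=0.0, gamma=0.0, horizon=4
)
print([round(value, 6) for value in forecast])
```

In practice a library implementation such as the exponential smoothing classes in statsmodels would normally replace a hand-written loop like this one.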

Keywords: requisition, holt–winters, time series, royal thai army

Procedia PDF Downloads 283
4876 Automatic Music Score Recognition System Using Digital Image Processing

Authors: Yuan-Hsiang Chang, Zhong-Xian Peng, Li-Der Jeng

Abstract:

Music has always been an integral part of people’s daily lives. For most people, however, reading a musical score and turning it into melody is not easy. This study aims to develop an automatic music score recognition system using digital image processing, which can be used to read and analyze musical score images automatically. The technical approaches included: (1) staff region segmentation; (2) image preprocessing; (3) note recognition; and (4) accidental and rest recognition. Digital image processing techniques (e.g., horizontal/vertical projections, connected component labeling, morphological processing, template matching, etc.) were applied according to musical notes, accidentals, and rests in staff notation. Preliminary results showed that our system could achieve detection and recognition rates of 96.3% and 91.7%, respectively. In conclusion, we presented an effective automated musical score recognition system that could be integrated with a media player to play music/songs given input images of a musical score. Ultimately, this system could also be incorporated in applications for mobile devices as a learning tool, such that a music player could learn to play music/songs.
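
Horizontal projection, one of the techniques listed in this abstract, can be sketched as follows: count the ink pixels in each row of a binarised image and treat rows that are almost entirely ink as staff-line candidates. The function names, the `min_fill` threshold and the tiny test image are assumptions for illustration. The abstract does not say how the authors set their thresholds.

```python
def horizontal_projection(image):
    """Return the number of ink pixels (value 1) in each row."""
    return [sum(row) for row in image]

def staff_line_rows(image, min_fill=0.8):
    """Return indices of rows whose ink fraction is at least `min_fill`.

    `min_fill` is a hypothetical threshold: staff lines span almost the
    full width, while notes and symbols cover only part of a row.
    """
    width = len(image[0])
    return [
        row_index
        for row_index, count in enumerate(horizontal_projection(image))
        if count >= min_fill * width
    ]

# Tiny binarised image: 1 = ink, 0 = background.
toy_image = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],  # staff line
    [0, 0, 1, 1, 0, 0],  # part of a note head
    [1, 1, 1, 1, 1, 1],  # staff line
    [0, 0, 0, 0, 0, 0],
]
print(staff_line_rows(toy_image))  # [1, 3]
```

A vertical projection works the same way on columns and is often used to separate individual symbols once the staff lines have been located.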

Keywords: connected component labeling, image processing, morphological processing, optical musical recognition

Procedia PDF Downloads 392
4875 Hindi Speech Synthesis by Concatenation of Recognized Hand Written Devnagri Script Using Support Vector Machines Classifier

Authors: Saurabh Farkya, Govinda Surampudi

Abstract:

Optical Character Recognition is one of the current major research areas. This paper focuses on the recognition of Devanagari script and its sound generation. The paper consists of two parts: first, Optical Character Recognition of handwritten Devanagari script; second, speech synthesis of the recognized text. This paper shows an implementation of support vector machines for the purpose of Devanagari script recognition. The support vector machine was trained with multi-domain features: transform domain and spatial (structural) domain features. The transform domain includes the wavelet feature of the character. The structural domain consists of the distance profile feature and the gradient feature. Segmentation of the text document was done at three levels: line segmentation, word segmentation, and character segmentation. Pre-processing of the characters was done with the help of various morphological operations: Otsu's algorithm, erosion, dilation, filtration and thinning techniques. The algorithm was tested on a self-prepared database, a collection of various handwriting samples. Further, Unicode was used to convert the recognized Devanagari text into a computer-readable document. The document so obtained is an array of codes, which was used to generate digitized text and to synthesize Hindi speech. Phonemes from the self-prepared database were used to generate the speech of the scanned document using a concatenation technique.

Keywords: Character Recognition (OCR), Text to Speech (TTS), Support Vector Machines (SVM), Library of Support Vector Machines (LIBSVM)

Procedia PDF Downloads 470
4874 A Newspapers Expectations Indicator from Web Scraping

Authors: Pilar Rey del Castillo

Abstract:

This document describes the construction of an average indicator of the general sentiment about the future expressed in newspapers in Spain. The raw data are collected by scraping the Digital Periodical and Newspaper Library website. Basic natural language processing tools are then applied to the collected information to evaluate the sentiment strength of each word in the texts using a polarized dictionary. The last step consists of summarizing these sentiments to produce daily indices. The results are a first insight into the applicability of these techniques to produce periodic sentiment indicators.
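
The last two steps described in this abstract, scoring words against a polarized dictionary and summarizing the scores per day, can be sketched as below. The dictionary entries, the sample texts and the choice of a plain average over matched words are assumptions for illustration. The abstract does not specify the summarizing function or the dictionary used.

```python
from collections import defaultdict

def daily_sentiment_index(articles, polarity):
    """Average the polarity of dictionary words found in each day's texts.

    `articles` is an iterable of (day, text) pairs. Days on which no
    dictionary word occurs are omitted from the result.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for day, text in articles:
        for word in text.lower().split():
            if word in polarity:
                totals[day] += polarity[word]
                counts[day] += 1
    return {day: totals[day] / counts[day] for day in sorted(counts)}

# Hypothetical polarized dictionary and scraped snippets.
toy_polarity = {"growth": 1.0, "recovery": 0.5, "crisis": -1.0}
toy_articles = [
    ("2020-01-01", "Growth and recovery expected"),
    ("2020-01-01", "Crisis fears ease"),
    ("2020-01-02", "Crisis deepens"),
]
index = daily_sentiment_index(toy_articles, toy_polarity)
for day, value in index.items():
    print(day, round(value, 3))
```

Splitting on whitespace is the simplest possible tokenizer; Spanish newspaper text would also need punctuation handling and probably lemmatization before the dictionary lookup.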

Keywords: natural language processing, periodic indicator, sentiment analysis, web scraping

Procedia PDF Downloads 107
4873 Towards Automated Remanufacturing of Marine and Offshore Engineering Components

Authors: Aprilia, Wei Liang Keith Nguyen, Shu Beng Tor, Gerald Gim Lee Seet, Chee Kai Chua

Abstract:

Automated remanufacturing process is of great interest in today’s marine and offshore industry. Most of the current remanufacturing processes are carried out manually and hence they are error prone, labour-intensive and costly. In this paper, a conceptual framework for automated remanufacturing is presented. This framework involves the integration of 3D non-contact digitization, adaptive surface reconstruction, additive manufacturing and machining operation. Each operation is operated and interconnected automatically as one system. The feasibility of adaptive surface reconstruction on marine and offshore engineering components is also discussed. Several engineering components were evaluated and the results showed that this proposed system is feasible. Conclusions are drawn and further research work is discussed.

Keywords: adaptive surface reconstruction, automated remanufacturing, automatic repair, reverse engineering

Procedia PDF Downloads 302
4872 Off-Topic Text Detection System Using a Hybrid Model

Authors: Usama Shahid

Abstract:

Be it written documents, news columns, or students' essays, verifying the content can be a time-consuming task. Apart from the spelling and grammar mistakes, the proofreader is also supposed to verify whether the content included in the essay or document is relevant or not. The irrelevant content in any document or essay is referred to as off-topic text and in this paper, we will address the problem of off-topic text detection from a document using machine learning techniques. Our study aims to identify the off-topic content from a document using Echo state network model and we will also compare data with other models. The previous study uses Convolutional Neural Networks and TFIDF to detect off-topic text. We will rearrange the existing datasets and take new classifiers along with new word embeddings and implement them on existing and new datasets in order to compare the results with the previously existing CNN model.

Keywords: off topic, text detection, echo state network, machine learning

Procedia PDF Downloads 57
4871 A Methodology for Automatic Diversification of Document Categories

Authors: Dasom Kim, Chen Liu, Myungsu Lim, Su-Hyeon Jeon, ByeoungKug Jeon, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, numerous documents including unstructured data and text have been created due to the rapid increase in the usage of social media and the Internet. Each document is usually provided with a specific category for the convenience of the users. In the past, the categorization was performed manually. However, manual categorization cannot guarantee accuracy, and it also requires a large amount of time and huge costs. Many studies have been conducted towards the automatic creation of categories to solve the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to categorizing complex documents with multiple topics because the methods work by assuming that one document can be categorized into one category only. In order to overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, they are also limited in that their learning process involves training using a multi-categorized document set. These methods therefore cannot be applied to multi-categorization of most documents unless multi-categorized training sets are provided. To overcome the requirement of a multi-categorized training set imposed by traditional multi-categorization algorithms, we previously proposed a new methodology that can extend the category of a single-categorized document to multiple categories by analyzing relationships among categories, topics, and documents. In this paper, we design a survey-based verification scenario for estimating the accuracy of our automatic categorization methodology.

Keywords: big data analysis, document classification, multi-category, text mining, topic analysis

Procedia PDF Downloads 246
4870 Document-level Sentiment Analysis: An Exploratory Case Study of Low-resource Language Urdu

Authors: Ammarah Irum, Muhammad Ali Tahir

Abstract:

Document-level sentiment analysis in Urdu is a challenging Natural Language Processing (NLP) task due to the difficulty of working with lengthy texts in a language with constrained resources. Deep learning models, which are complex neural network architectures, are well-suited to text-based applications in addition to data formats like audio, image, and video. To investigate the potential of deep learning for Urdu sentiment analysis, we implemented five different deep learning models, including Bidirectional Long Short Term Memory (BiLSTM), Convolutional Neural Network (CNN), Convolutional Neural Network with Bidirectional Long Short Term Memory (CNN-BiLSTM), and Bidirectional Encoder Representation from Transformer (BERT). In this study, we developed a hybrid deep learning model called BiLSTM-Single Layer Multi Filter Convolutional Neural Network (BiLSTM-SLMFCNN) by fusing BiLSTM and CNN architecture. The proposed and baseline techniques are applied on Urdu Customer Support data set and IMDB Urdu movie review data set by using pre-trained Urdu word embedding that are suitable for sentiment analysis at the document level. Results of these techniques are evaluated and our proposed model outperforms all other deep learning techniques for Urdu sentiment analysis. BiLSTM-SLMFCNN outperformed the baseline deep learning models and achieved 83%, 79%, 83% and 94% accuracy on small, medium and large sized IMDB Urdu movie review data set and Urdu Customer Support data set respectively.

Keywords: Urdu sentiment analysis, deep learning, natural language processing, opinion mining, low-resource language

Procedia PDF Downloads 41
4869 Neural Graph Matching for Modification Similarity Applied to Electronic Document Comparison

Authors: Po-Fang Hsu, Chiching Wei

Abstract:

In this paper, we present a novel neural graph matching approach applied to document comparison. Document comparison is a common task in the legal and financial industries. In some cases, the most important differences may be the addition or omission of words, sentences, clauses, or paragraphs. However, identifying them is challenging without a record or trace of the whole editing process. Given these temporal uncertainties, we explore the potential of our approach to approximate an accurate comparison and determine which element blocks are related to one another by an edit. First, we apply a document layout analysis that combines traditional and modern techniques to segment layouts into blocks of various types. Then we transform the task into a layout graph matching problem with textual awareness. Graph matching is a long-studied problem with a broad range of applications. However, unlike previous works that focus on visual images or structural layout, we also bring textual features into our model to adapt it to this domain. Specifically, based on the electronic document, we introduce an encoder to handle the visual presentation decoded from the PDF. Additionally, because modifications can make the document layout analysis inconsistent between modified documents, and blocks can be merged and split, Sinkhorn divergence is adopted in our neural graph approach, which tries to overcome both issues with many-to-many block matching. We demonstrate this on two categories of layouts, legal agreements and scientific articles, collected from our real-case datasets.
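The many-to-many block matching idea can be illustrated with plain Sinkhorn normalization, which is related to, but not the same as, the Sinkhorn divergence named in the abstract. The following is a minimal sketch under stated assumptions: the similarity scores, the `temperature` value, and the function name `sinkhorn_soft_assignment` are hypothetical and are not taken from the paper.

```python
import math

def sinkhorn_soft_assignment(scores, n_iters=50, temperature=0.5):
    """Turn a square block-similarity matrix into a (near) doubly stochastic
    soft assignment by alternating row and column normalization."""
    # Exponentiate so every entry is strictly positive.
    k = [[math.exp(s / temperature) for s in row] for row in scores]
    for _ in range(n_iters):
        # Row normalization: each original block distributes a total mass of 1.
        k = [[v / sum(row) for v in row] for row in k]
        # Column normalization: each modified block receives a total mass of 1.
        col_sums = [sum(col) for col in zip(*k)]
        k = [[v / c for v, c in zip(row, col_sums)] for row in k]
    return k

# Hypothetical similarities between 3 blocks of the original document (rows)
# and 3 blocks of the modified document (columns).
scores = [
    [0.9, 0.1, 0.0],
    [0.2, 0.8, 0.1],
    [0.0, 0.3, 0.7],
]
assignment = sinkhorn_soft_assignment(scores)
for row in assignment:
    print([round(v, 3) for v in row])
```

Because the result is a soft assignment rather than a hard permutation, one block can share its mass among several counterparts, which is one way to represent merged or split blocks.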

Keywords: document comparison, graph matching, graph neural network, modification similarity, multi-modal

Procedia PDF Downloads 153
4868 Improvement of Microscopic Detection of Acid-Fast Bacilli for Tuberculosis by Artificial Intelligence-Assisted Microscopic Platform and Medical Image Recognition System

Authors: Hsiao-Chuan Huang, King-Lung Kuo, Mei-Hsin Lo, Hsiao-Yun Chou, Yusen Lin

Abstract:

The most robust and economical method for laboratory diagnosis of TB is to identify mycobacterial bacilli (AFB) by acid-fast staining, despite its disadvantages of low sensitivity and high labor intensity. Though digital pathology has become popular in medicine, an automated microscopic system for microbiology is still not available. A new AI-assisted automated microscopic system, consisting of a microscopic scanner and a recognition program powered by big data and deep learning, may significantly increase the sensitivity of TB smear microscopy. Thus, the objective is to evaluate such an automatic system for the identification of AFB. A total of 5,930 smears were enrolled in this study. An intelligent microscope system (TB-Scan, Wellgen Medical, Taiwan) was used for microscopic image scanning and AFB detection. 272 AFB smears were used for transfer learning to increase the accuracy. Referee medical technicians served as the gold standard for resolving discrepant results. Results showed that, over a total of 1,726 AFB smears, the automated system's accuracy, sensitivity and specificity were 95.6% (1,650/1,726), 87.7% (57/65), and 95.9% (1,593/1,661), respectively. Compared to culture, the sensitivity for human technicians was only 33.8% (38/142); however, the automated system can achieve 74.6% (106/142), which is significantly higher than human technicians. This is the first such automated microscope system for TB smear testing evaluated in a controlled trial. This automated system could achieve higher TB smear sensitivity and laboratory efficiency and may complement molecular methods (e.g., GeneXpert) to reduce the total cost of TB control. Furthermore, such an automated system can be accessed remotely over the internet and can be deployed in areas with limited medical resources.
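The accuracy, sensitivity, and specificity reported above follow directly from the confusion counts implied by the quoted fractions. The sketch below recomputes them, assuming that 57/65 counts true positives over all referee-positive smears and 1,593/1,661 counts true negatives over all referee-negative smears; the function name `smear_metrics` is hypothetical.

```python
def smear_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity and specificity from confusion counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,   # correct calls over all smears
        "sensitivity": tp / (tp + fn),   # detected positives over all positives
        "specificity": tn / (tn + fp),   # detected negatives over all negatives
    }

# Counts inferred from the fractions in the abstract (an assumption):
# 57 of 65 positive smears and 1,593 of 1,661 negative smears were correct.
metrics = smear_metrics(tp=57, fn=65 - 57, tn=1593, fp=1661 - 1593)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these counts the correct calls sum to 57 + 1,593 = 1,650 out of 65 + 1,661 = 1,726 smears, which matches the reported accuracy fraction.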

Keywords: TB smears, automated microscope, artificial intelligence, medical imaging

Procedia PDF Downloads 193
4867 Use of Interpretable Evolved Search Query Classifiers for Sinhala Documents

Authors: Prasanna Haddela

Abstract:

Document analysis is a well matured yet still active research field, partly as a result of the intricate nature of building computational tools but also due to the inherent problems arising from the variety and complexity of human languages. Breaking down language barriers is vital in enabling access to a number of recent technologies. This paper investigates the application of document classification methods to new Sinhalese datasets. This language is geographically isolated and rich with many of its own unique features. We will examine the interpretability of the classification models with a particular focus on the use of evolved Lucene search queries generated using a Genetic Algorithm (GA) as a method of document classification. We will compare the accuracy and interpretability of these search queries with other popular classifiers. The results are promising and are roughly in line with previous work on English language datasets.

Keywords: evolved search queries, Sinhala document classification, Lucene Sinhala analyzer, interpretable text classification, genetic algorithm

Procedia PDF Downloads 91
4866 Contribution of Automated Early Warning Score Usage to Patient Safety

Authors: Phang Moon Leng

Abstract:

Automated Early Warning Scores is a newly developed clinical decision tool that is used to streamline and improve the process of obtaining a patient’s vital signs so that a clinical decision can be made at an earlier stage to prevent the patient from further deterioration. This technology provides an immediate update on the score and on the clinical decision to be taken based on the outcome. This paper studies whether the use of an automated early warning score system has assisted the hospital in early detection and escalation of clinical conditions and improved patient outcomes. The hospital adopted the Modified Early Warning Scores (MEWS) Scoring System and MEWS Clinical Response into Philips IntelliVue Guardian Automated Early Warning Score equipment and studied whether the process has become leaner, whether the technology improved the usage and experience of the nurses, and whether the technology has improved patient care and outcomes. It was found that the steps required to obtain vital signs have been significantly reduced and that the equipment is used more frequently to obtain patient vital signs. The number of deaths and the length of stay have significantly decreased, as clinical decisions can be made and escalated more quickly with the Automated EWS. The automated early warning score equipment has helped improve work efficiency by removing the need for documenting into the patient’s EMR. The technology streamlines clinical decision-making, allows faster care and intervention, and improves overall patient outcomes, which translates to better care for patients.

Keywords: automated early warning score, clinical quality and safety, patient safety, medical technology

Procedia PDF Downloads 157
4865 Fully Automated Methods for the Detection and Segmentation of Mitochondria in Microscopy Images

Authors: Blessing Ojeme, Frederick Quinn, Russell Karls, Shannon Quinn

Abstract:

The detection and segmentation of mitochondria from fluorescence microscopy are crucial for understanding the complex structure of the nervous system. However, the constant fission and fusion of mitochondria and image distortion in the background make the task of detection and segmentation challenging. In the literature, a number of open-source software tools and artificial intelligence (AI) methods have been described for analyzing mitochondrial images, achieving remarkable classification and quantitation results. However, the limited availability of the combined medical and AI expertise required to utilize these tools poses a challenge to their full adoption and use in clinical settings. Motivated by the advantages of automated methods in terms of good performance, minimum detection time, ease of implementation, and cross-platform compatibility, this study proposes a fully automated framework for the detection and segmentation of mitochondria using both image shape information and descriptive statistics. Using the low-cost, open-source Python and OpenCV libraries, the algorithms are implemented in three stages: pre-processing, image binarization, and coarse-to-fine segmentation. The proposed model is validated using the mitochondrial fluorescence dataset. Ground truth labels generated using a Lab kit were also used to evaluate the performance of our detection and segmentation model. The study produces good detection and segmentation results and reports the challenges encountered during the image analysis of mitochondrial morphology from the fluorescence mitochondrial dataset. A discussion of the methods and future perspectives of fully automated frameworks concludes the paper.
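The binarization stage can be illustrated with Otsu's method, a common automatic thresholding technique. The abstract does not say which binarization algorithm the authors use, so this is an assumed stand-in, written in pure Python rather than OpenCV so that it is self-contained; the function names and the pixel values are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximizes between-class variance.
    Pixels <= t are treated as background, pixels > t as foreground."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        w_fg = total - w_bg
        if w_bg == 0:
            continue  # no background pixels yet
        if w_fg == 0:
            break     # every pixel is already in the background class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t

def binarize(pixels, threshold):
    """Map each pixel to 1 (foreground) or 0 (background)."""
    return [1 if p > threshold else 0 for p in pixels]

# Hypothetical flattened 8-bit image: dark background, bright structures.
pixels = [10, 12, 11, 13, 200, 210, 205, 198]
t = otsu_threshold(pixels)
mask = binarize(pixels, t)
print(t, mask)
```

In an OpenCV pipeline the same step is usually a single thresholding call; the explicit loop here only makes the criterion visible.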

Keywords: 2D, binarization, CLAHE, detection, fluorescence microscopy, mitochondria, segmentation

Procedia PDF Downloads 336
4864 DCDNet: Lightweight Document Corner Detection Network Based on Attention Mechanism

Authors: Kun Xu, Yuan Xu, Jia Qiao

Abstract:

Document detection plays an important role in optical character recognition and text analysis. Because traditional detection methods have weak generalization ability, and deep neural networks have complex structures and large numbers of parameters that make them hard to apply on mobile devices, this paper proposes a lightweight Document Corner Detection Network (DCDNet). DCDNet is a two-stage architecture. The first stage, with an Encoder-Decoder structure, adopts depthwise separable convolution to greatly reduce the network parameters. After introducing the Feature Attention Union (FAU) module, the second stage enhances the feature information of the spatial and channel dimensions and adaptively adjusts the size of the receptive field to enhance the feature expression ability of the model. To address the large imbalance between the numbers of corner and non-corner pixels, Weighted Binary Cross Entropy Loss (WBCE Loss) is proposed, which defines corner detection as a classification problem to make the training process more efficient. To make up for the lack of datasets for document corner detection, a dataset containing 6620 images, named Document Corner Detection Dataset (DCDD), is built. Experimental results show that the proposed method can obtain fast, stable and accurate detection results on DCDD.
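A weighted binary cross entropy of the kind described can be sketched as follows. The abstract does not give the exact weighting scheme, so the single positive-class weight `pos_weight`, the function name, and the sample values are assumptions for illustration only.

```python
import math

def weighted_bce(probs, labels, pos_weight, eps=1e-7):
    """Mean binary cross entropy in which the corner (positive) term is
    scaled by pos_weight to counter the corner/non-corner imbalance."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)

# Hypothetical predictions for four pixels, only the first being a corner.
probs = [0.1, 0.1, 0.1, 0.1]
labels = [1, 0, 0, 0]
plain = weighted_bce(probs, labels, pos_weight=1.0)     # ordinary BCE
weighted = weighted_bce(probs, labels, pos_weight=3.0)  # missed corner costs more
print(plain, weighted)
```

With `pos_weight=1.0` the function reduces to ordinary binary cross entropy, so the weight directly controls how strongly a missed corner pixel is penalized relative to a false alarm.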

Keywords: document detection, corner detection, attention mechanism, lightweight

Procedia PDF Downloads 329
4863 Automated Driving Deep Neural Networks Model Accuracy and Performance Assessment in a Simulated Environment

Authors: David Tena-Gago, Jose M. Alcaraz Calero, Qi Wang

Abstract:

The evolution and integration of automated vehicles have become more and more tangible in recent years. State-of-the-art technological advances in the field of camera-based Artificial Intelligence (AI) and computer vision greatly favor the performance and reliability of the Advanced Driver Assistance System (ADAS), leading to a greater knowledge of vehicular operation and a closer resemblance to human behavior. However, the exclusive use of this technology still seems insufficient to fully control vehicular operation. To reveal the degree of accuracy of the current camera-based automated driving AI modules, this paper studies the structure and behavior of one of the main solutions in a controlled testing environment. The results obtained clearly outline the lack of reliability when the AI model is used exclusively in the perception stage, which calls for additional complementary sensors to improve its safety and performance.

Keywords: accuracy assessment, AI-driven mobility, artificial intelligence, automated vehicles

Procedia PDF Downloads 85
4862 Defect Correlation of Computed Tomography and Serial Sectioning in Additively Manufactured Ti-6Al-4V

Authors: Bryce R. Jolley, Michael Uchic

Abstract:

This study presents initial results toward the correlative characterization of inherent defects of Ti-6Al-4V additive manufacture (AM). X-Ray Computed Tomography (CT) defect data are compared and correlated with microscopic photographs obtained via automated serial sectioning. The metal AM specimen was manufactured out of Ti-6Al-4V virgin powder to specified dimensions. A post-contour was applied during the fabrication process with a speed of 1050 mm/s, power of 260 W, and a width of 140 µm. The specimen was stress relief heat-treated at 16°F for 3 hours. Microfocus CT imaging was accomplished on the specimen within a predetermined region of the build, with parameters optimized for Ti-6Al-4V additive manufacture. After CT imaging, a modified RoboMet.3D version 2 was employed for serial sectioning and optical microscopy characterization of the same predetermined region. Montage capture of sub-micron resolution, bright-field reflection, 12-bit monochrome optical images was performed in an automated fashion. These optical images were post-processed to produce 2D and 3D data sets. This processing included thresholding and segmentation to improve visualization of defect features. The defects observed from optical imaging were compared and correlated with the defects observed from CT imaging over the same predetermined region of the specimen. Quantitative results of area fraction and equivalent pore diameters obtained via each method are presented for this correlation. It is shown that Microfocus CT imaging does not capture all inherent defects within this Ti-6Al-4V AM sample. Best practices for this correlative effort are also presented, as well as the future direction of research resulting from this study.
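The two quantities compared between methods, area fraction and equivalent pore diameter, can be computed from a segmented binary image. The sketch below assumes a mask containing a single pore and a hypothetical pixel size of 0.5 µm; neither value comes from the study. The equivalent diameter is taken as the diameter of a circle with the same area, d = 2·sqrt(A/π).

```python
import math

def area_fraction(mask):
    """Fraction of pixels flagged as defect (1) in a binary mask."""
    total = sum(len(row) for row in mask)
    defect = sum(sum(row) for row in mask)
    return defect / total

def equivalent_diameter(area):
    """Diameter of the circle whose area equals the given pore area."""
    return 2.0 * math.sqrt(area / math.pi)

# Hypothetical 4x4 segmented section containing one 2x2-pixel pore.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pixel_size_um = 0.5  # assumed pixel edge length in micrometres
pore_area_um2 = sum(sum(row) for row in mask) * pixel_size_um ** 2

print(area_fraction(mask))
print(equivalent_diameter(pore_area_um2))
```

For images with several pores, each pore would first need to be separated, for example by connected-component labeling, before the diameter formula is applied per pore.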

Keywords: additive manufacture, automated serial sectioning, computed tomography, nondestructive evaluation

Procedia PDF Downloads 115