Search results for: electronic documents

803 Overview of E-government Adoption and Implementation in Ghana

Abstract:

E-government has been adopted by many governments around the world, including that of Ghana, to provide citizens and businesses with more accurate, real-time, and high-quality services and information. The objective of this paper is to present an overview of the Government of Ghana’s (GoG) adoption and implementation of e-government and its use by the Ministries, Departments and Agencies (MDAs), as well as other public sector institutions, to deliver efficient public services to the general public, i.e. citizens, businesses, etc. The government's implementation of e-government focused on facilitating effective delivery of government services to the public and, ultimately, on providing an efficient government-wide electronic means of sharing information and knowledge through a network infrastructure developed to connect all major towns and cities, Ministries, Departments and Agencies, and other public sector organizations in Ghana. One aim of the Government of Ghana's use of ICT in public administration is to improve productivity in government administration and service by facilitating the exchange of information to enable better interaction and coordination of work among MDAs, citizens and private businesses. The study was prepared using secondary sources of data from government policy documents, national and international published reports, journal articles, and web sources. This study indicates that, through the e-government initiative, citizens and businesses can currently access and pay for services such as renewal of driving licenses, business registration, payment of taxes, acquisition of marriage and birth certificates, and passport applications through the GoG electronic service (eservice) and electronic payment (epay) portal. Further, this study shows that there is enormous commitment from GoG to adopt and implement e-government as a tool not only to transform the business of government but also to bring efficiency to public services delivered by the MDAs. To ascertain this, a further study needs to be carried out to determine whether the use of e-government has brought about the anticipated improvements and efficiency in service delivery of MDAs and other state institutions in Ghana.

Keywords: Electronic government, electronic services, electronic payment, MDAs.

Downloads: 4519

802 A New Approach to Annotate the Text's of the Websites and Documents with a Quite Comprehensive Knowledge Base

Authors: Mohammad Yasrebi, Mehran Mohsenzadeh, Mashalla Abbasi-Dezfuli

Abstract:

Machine-understandable data, when strongly interlinked, constitutes the basis for the Semantic Web. Annotating web documents is one of the major techniques for creating metadata on the Web. Annotating websites defines the contained data in a form that is suitable for interpretation by machines. In this paper, we present a new approach to annotating websites and documents by raising the abstraction level of the annotation process to a conceptual level. By this means, we hope to solve some of the problems of current annotation solutions.

Keywords: Knowledge base, ontology, semantic annotation, semantic web.

Downloads: 1310

801 A Matlab / Simulink Based Tool for Power Electronic Circuits

Authors: Abdulatif A. M. Shaban

Abstract:

Transient simulation of power electronic circuits is of considerable interest to the designer. The switching nature of the devices used permits the development of specialized algorithms that allow a considerable reduction in simulation time compared to general-purpose simulation algorithms. This paper describes a method used to simulate power electronic circuits using the SIMULINK toolbox within the MATLAB software. Theoretical results are presented that provide the basis for transient analysis of power electronic circuits.

Keywords: Modelling, Simulation.

Downloads: 5493

800 Localizing and Experiencing Electronic Questionnaires in an Educational Web Site

Authors: Theodore H. Kaskalis

Abstract:

One of the main research methods in humanistic studies is the collection and processing of data through questionnaires. This paper reports our experiences of localizing and adapting the phpESP package of electronic surveys, which led to a friendly online questionnaire environment offered through our department web site. After presenting the characteristics of this environment, we identify the expected benefits and present a questionnaire carried out in both the traditional and the electronic way. We present the respondents' feedback and then report the researchers' opinions. Finally, we propose ideas we intend to implement in order to further assist and enhance research based on this web-accessed, electronic questionnaire environment.

Keywords: Electronic questionnaires, Computer assisted web interviewing, Survey data collection, Survey data visualization.

Downloads: 1239

799 Proposition for a New Approach of Version Control System Based On ECA Active Rules

Authors: S. Benhamed, S. Hocine, D. Benhamamouch

Abstract:

We try to give a solution for version control of documents in web services, which is why we propose a new approach designed specifically for XML documents. The new approach is applied in a centralized repository; this repository coexists with other repositories in a decentralized system. To carry out the activities of this approach in a standard model, we use ECA active rules. We also show how Event-Condition-Action rules (ECA rules) have been incorporated as a mechanism for the version control of documents. The reason for integrating ECA rules is that they provide clear declarative semantics and induce an immediate operational realization in the system without the need for human intervention.

Keywords: ECA Rule, Web service, version control system, propagation.

Downloads: 1330
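
As an illustration of the Event-Condition-Action idea in entry 799, the following is a minimal sketch, not the authors' system. The names `Rule`, `RuleEngine`, `document_saved`, `content_changed` and `store_version` are hypothetical, and the in-memory `versions` dictionary stands in for a repository.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """One ECA rule: when `event` occurs and `condition` holds, run `action`."""
    event: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], None]


class RuleEngine:
    """Hypothetical dispatcher that fires matching rules without human intervention."""

    def __init__(self):
        self.rules = []

    def register(self, rule):
        self.rules.append(rule)

    def signal(self, event, payload):
        for rule in self.rules:
            if rule.event == event and rule.condition(payload):
                rule.action(payload)


# Assumed stand-in for the repository: document id -> list of stored XML versions.
versions = {}


def content_changed(payload):
    """Condition: the XML differs from the latest stored version (or none exists)."""
    history = versions.get(payload["doc_id"], [])
    return not history or history[-1] != payload["xml"]


def store_version(payload):
    """Action: append the new XML as the next version."""
    versions.setdefault(payload["doc_id"], []).append(payload["xml"])


engine = RuleEngine()
engine.register(Rule("document_saved", content_changed, store_version))

engine.signal("document_saved", {"doc_id": "d1", "xml": "<a>1</a>"})
engine.signal("document_saved", {"doc_id": "d1", "xml": "<a>1</a>"})  # unchanged, ignored
engine.signal("document_saved", {"doc_id": "d1", "xml": "<a>2</a>"})
```

Keeping the condition separate from the action is what makes the rule declarative: the versioning policy can be changed by registering a different rule rather than editing the save path.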

798 On the Interactive Search with Web Documents

Authors: Mario Kubek, Herwig Unger

Abstract:

Due to the large amount of information in the World Wide Web (WWW, web) and the lengthy and usually linearly ordered result lists of web search engines that do not indicate semantic relationships between their entries, the search for topically similar and related documents can become a tedious task. In particular, the process of formulating queries with proper terms representing specific information needs requires much effort from the user. This problem gets even bigger when the user's knowledge of a subject and its technical terms is not sufficient to do so. This article presents the new and interactive search application DocAnalyser, which addresses this problem by enabling users to find similar and related web documents based on automatic query formulation and state-of-the-art search word extraction. Additionally, this tool can be used to track topics across semantically connected web documents.

Keywords: DocAnalyser, interactive web search, search word extraction, query formulation, source topic detection, topic tracking.

Downloads: 1609

797 Using Dempster-Shafer Theory in XML Information Retrieval

Authors: F. Raja, M. Rahgozar, F. Oroumchian

Abstract:

XML is a markup language which is becoming the standard format for information representation and data exchange. A major purpose of XML is the explicit representation of the logical structure of a document. Much research has been performed to exploit the logical structure of documents in information retrieval in order to precisely extract the user's information need from large collections of XML documents. In this paper, we describe an XML information retrieval weighting scheme that tries to find the most relevant elements in XML documents in response to a user query. We present this weighting model for information retrieval systems that utilize plausible inferences to infer the relevance of elements in XML documents. We also add to this model the Dempster-Shafer theory of evidence to express the uncertainty in plausible inferences, and the Dempster-Shafer rule of combination to combine evidence derived from different inferences.

Keywords: Dempster-Shafer theory, plausible inferences, XML information retrieval.

Downloads: 1482
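
The rule of combination mentioned in entry 797 can be sketched as follows. This is the textbook form of Dempster's rule, not the paper's weighting model; the two-hypothesis frame (`R` for relevant, `N` for not relevant) and the mass values are made up for illustration.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset (a focal element) to its mass.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # mass falling on contradictory evidence
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    norm = 1.0 - conflict  # renormalise so the combined masses sum to 1
    return {focal: mass / norm for focal, mass in combined.items()}


# Hypothetical frame of discernment: an XML element is relevant (R) or not (N).
R, N = frozenset({"R"}), frozenset({"N"})
THETA = R | N  # the whole frame, i.e. "don't know"

m_inference_1 = {R: 0.6, THETA: 0.4}          # assumed evidence from one inference
m_inference_2 = {R: 0.5, N: 0.2, THETA: 0.3}  # assumed evidence from another
m = combine(m_inference_1, m_inference_2)
```

With these inputs the conflicting mass is 0.6 × 0.2 = 0.12, so every surviving product is divided by 0.88; the combined belief in `R` is 0.68 / 0.88.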

796 Meta-Classification using SVM Classifiers for Text Documents

Authors: Daniel I. Morariu, Lucian N. Vintan, Volker Tresp

Abstract:

Text categorization is the problem of classifying text documents into a set of predefined classes. In this paper, we investigated three approaches to building a meta-classifier in order to increase classification accuracy. The basic idea is to learn a meta-classifier that optimally selects the best component classifier for each data point. The experimental results show that combining classifiers can significantly improve classification accuracy and that our meta-classification strategy gives better results than each individual classifier. For 7083 Reuters text documents we obtained classification accuracies of up to 92.04%.

Keywords: Meta-classification, Learning with Kernels, Support Vector Machine, and Performance Evaluation.

Downloads: 1565

795 Cantor Interpolating Spline to Design Electronic Mail Boxes

Authors: Adil Al-Rammahi

Abstract:

Electronic mail is very important at the present time. Many researchers work on designing, improving, securing and speeding up electronic mail, among other aspects. This paper introduces a new algorithm that uses Cantor sets and a cubic spline interpolating function in electronic mail design. Cantor sets are used as the area (or domain) of the mail, while the spline function is used for the design formula. The roots of the spline function over the Cantor sets are used as the controller admin. The roots are calculated by the numerical Newton-Raphson method. The results of this algorithm were promising.

Keywords: Cantor sets, spline, electronic mail design, Newton – Raphson's method.

Downloads: 1552
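
Entry 795 computes the roots of a spline by the Newton-Raphson method. The sketch below shows only that root-finding step on a single cubic; the polynomial `x^3 - 2x - 5` is an arbitrary stand-in for one spline segment, not a spline taken from the paper.

```python
def newton_raphson(f, df, x0, tol=1e-10, max_iter=50):
    """Find a root of f starting from x0, using the derivative df."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        slope = df(x)
        if slope == 0:
            raise ZeroDivisionError("derivative vanished; choose another start")
        x -= fx / slope  # Newton step: follow the tangent down to the x-axis
    raise RuntimeError("Newton-Raphson did not converge")


# Assumed example cubic and its derivative.
def f(x):
    return x**3 - 2 * x - 5


def df(x):
    return 3 * x**2 - 2


root = newton_raphson(f, df, x0=2.0)
```

The method converges quadratically near a simple root, but it needs a reasonable starting point; for a piecewise spline one would typically start from a knot of the segment being searched.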

794 Signed Approach for Mining Web Content Outliers

Authors: G. Poonkuzhali, K. Thiagarajan, K. Sarukesi, G. V. Uma

Abstract:

The emergence of the Internet has brought about a revolution in information storage and retrieval. As most of the data on the web is unstructured and contains a mix of text, video, audio, etc., there is a need to mine information to cater to the specific needs of users without loss of important hidden information. Thus, developing user-friendly and automated tools for providing relevant information quickly has become a major challenge in web mining research. Most of the existing web mining algorithms have concentrated on finding frequent patterns while neglecting the less frequent ones that are likely to contain outlying data such as noise and irrelevant or redundant data. This paper mainly focuses on the Signed approach and full word matching on an organized domain dictionary for mining web content outliers. The Signed approach returns the relevant web documents as well as the outlying web documents. As the dictionary is organized by the number of characters in a word, searching and retrieval of documents take less time and less space.

Keywords: Outliers, Relevant document, Signed Approach, Web content mining, Web documents.

Downloads: 2305

793 One-Class Support Vector Machine for Sentiment Analysis of Movie Review Documents

Authors: Chothmal, Basant Agarwal

Abstract:

Sentiment analysis classifies a given review document as positive or negative. Sentiment analysis research has increased tremendously in recent times due to its large number of applications in industry and academia. Sentiment analysis models can be used to determine the opinion of a user towards any entity or product. E-commerce companies can use a sentiment analysis model to improve their products on the basis of users’ opinions. In this paper, we propose a new One-class Support Vector Machine (One-class SVM) based sentiment analysis model for movie review documents. In the proposed approach, we first extract features from one class of documents and then test whether a given new document lies within the one-class SVM model or is an outlier. Experimental results show the effectiveness of the proposed sentiment analysis model.

Keywords: Feature selection methods, Machine learning, NB, One-class SVM, Sentiment Analysis, Support Vector Machine.

Downloads: 3248

792 ORank: An Ontology Based System for Ranking Documents

Authors: Mehrnoush Shamsfard, Azadeh Nematzadeh, Sarah Motiee

Abstract:

The growing volume of information on the internet creates an increasing need to develop new (semi)automatic methods for retrieving documents and ranking them according to their relevance to the user query. In this paper, after a brief review of ranking models, a new ontology-based approach for ranking HTML documents is proposed and evaluated in various circumstances. Our approach is a combination of conceptual, statistical and linguistic methods. This combination preserves the precision of ranking without losing speed. Our approach exploits natural language processing techniques for extracting phrases and stemming words. Then an ontology-based conceptual method is used to annotate documents and expand the query. To expand a query, the spread activation algorithm is improved so that the expansion can be done in various aspects. The annotated documents and the expanded query are processed to compute the relevance degree using statistical methods. The outstanding features of our approach are (1) combining conceptual, statistical and linguistic features of documents, (2) expanding the query with its related concepts before comparing it to documents, (3) extracting and using both words and phrases to compute the relevance degree, (4) improving the spread activation algorithm to do the expansion based on a weighted combination of different conceptual relationships and (5) allowing variable document vector dimensions. A ranking system called ORank is developed to implement and test the proposed model. The test results are included at the end of the paper.

Keywords: Document ranking, Ontology, Spread activation algorithm, Annotation.

Downloads: 1834

791 Considerations of Public Key Infrastructure (PKI), Functioning as a Chain of Trust in Electronic Payments Systems

Authors: Theodosios Tsiakis, George Stephanides, George Pekos

Abstract:

The growth of open networks has created interest in commercialising them. The establishment of an electronic business mechanism must be accompanied by a digital electronic payment system to transfer the value of transactions. Financial organizations are required to offer a secure e-payment scheme with a level of security equivalent to that of conventional paper-based payment transactions. PKI, which functions as a chain of trust in a security architecture, can bring cryptographic security services to e-payments, in order to take advantage of the wider base of customers or trading partners and the reduction in transaction costs achieved by the use of Internet channels. The paper addresses the possibilities and implementation suggestions of PKI in relation to electronic payments by suggesting a framework that should be followed.

Keywords: Electronic Payment, Security, Trust

Downloads: 1375

790 Fuzzy C-Means Clustering for Biomedical Documents Using Ontology Based Indexing and Semantic Annotation

Authors: S. Logeswari, K. Premalatha

Abstract:

Search is the most obvious application of information retrieval. The variety of widely obtainable biomedical data is enormous and is expanding fast. This expansion makes existing techniques insufficient for extracting the most interesting patterns from the collection according to user requirements. Recent research concentrates more on semantic-based searching than on traditional term-based searches. Algorithms for semantic searches are implemented based on the relations that exist between the words of the documents. Ontologies are used as domain knowledge for identifying semantic relations as well as for structuring the data for effective information retrieval. Annotation of data with ontology concepts is one of the most widespread practices for clustering documents. In this paper, concept-based indexing and annotation are proposed for clustering biomedical documents. The fuzzy c-means (FCM) clustering algorithm is used to cluster the documents. The performance of the proposed methods is compared with traditional term-based clustering for PubMed articles in five different disease communities. The experimental results show that the proposed methods outperform term-based fuzzy clustering.

Keywords: MeSH Ontology, Concept Indexing, Annotation, semantic relations, Fuzzy c-means.

Downloads: 2251
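
For entry 790, the fuzzy c-means step can be sketched in plain Python as below. This is the standard FCM update, not the authors' pipeline: the two-dimensional "document vectors" are invented, and the ontology-based indexing that would produce real vectors is assumed to have happened already.

```python
import math


def fuzzy_c_means(points, init_centers, m=2.0, iters=50, eps=1e-9):
    """Standard fuzzy c-means with fuzzifier m; returns (centers, memberships)."""
    centers = [list(c) for c in init_centers]
    dims = len(points[0])
    u = []
    for _ in range(iters):
        # Membership update: u[i][j] is the degree to which point i belongs to cluster j.
        u = []
        for p in points:
            d = [max(math.dist(p, c), eps) for c in centers]  # eps avoids division by zero
            u.append([1.0 / sum((dj / dk) ** (2.0 / (m - 1.0)) for dk in d) for dj in d])
        # Center update: mean of all points, weighted by membership ** m.
        for j in range(len(centers)):
            w = [row[j] ** m for row in u]
            total = sum(w)
            centers[j] = [sum(wi * p[k] for wi, p in zip(w, points)) / total
                          for k in range(dims)]
    return centers, u


# Hypothetical document vectors, e.g. weights of two ontology concepts per document.
docs = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.05),
        (0.1, 0.9), (0.2, 0.8), (0.05, 0.95)]
centers, memberships = fuzzy_c_means(docs, init_centers=[(1.0, 0.0), (0.0, 1.0)])
```

Unlike hard clustering, every document keeps a membership in each cluster, and each row of `memberships` sums to 1. The returned memberships are the ones computed in the final iteration, just before the last center update.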

789 An Agent Oriented Architecture to Supply Dynamic Document Generation in ERP Systems

Authors: Hassan Haghighi, Seyedeh Zahra Hosseini, Seyedeh Elahe Jalambadani

Abstract:

One of the most important capabilities expected from an ERP system is to manage user/administrator manual documents dynamically. Since an ERP package is frequently changed during its implementation at customer sites, it is often necessary to add new documents and/or apply required changes to existing documents in order to cover new or changed capabilities. Worse, since these changes occur continuously, the corresponding documents should be updated dynamically; otherwise, implementing the ERP package in the organization encounters serious risks. In this paper, we propose a new architecture which is based on the agent-oriented vision and supplies the dynamic document generation expected from ERP systems using several independent but cooperative agents. Besides dynamic document generation, which is the main issue of this paper, the presented architecture addresses some aspects of the intelligence and learning capabilities existing in ERP.

Keywords: enterprise resource planning, dynamic document generation, software architecture, agent oriented architecture, learning, intelligence

Downloads: 1610

788 A Text Mining Technique Using Association Rules Extraction

Authors: Hany Mahgoub, Dietmar Rösner, Nabil Ismail, Fawzy Torkey

Abstract:

This paper describes a text mining technique for automatically extracting association rules from collections of textual documents. The technique is called Extracting Association Rules from Text (EART). It depends on keyword features to discover association rules amongst the keywords labeling the documents. In this work, the EART system ignores the order in which the words occur, focusing instead on the words and their statistical distributions in documents. The main contributions of the technique are that it integrates XML technology with an Information Retrieval scheme (TF-IDF) (for keyword/feature selection that automatically selects the most discriminative keywords for use in association rule generation) and uses a Data Mining technique for association rule discovery. It consists of three phases: a Text Preprocessing phase (transformation, filtration, stemming and indexing of the documents), an Association Rule Mining (ARM) phase (applying our designed algorithm for Generating Association Rules based on Weighting scheme, GARW) and a Visualization phase (visualization of results). Experiments were applied to web page news documents related to the outbreak of the bird flu disease. The extracted association rules contain important features and describe the informative news included in the document collection. The performance of the EART system is compared with that of another system that uses the Apriori algorithm, in terms of execution time and the extracted association rules.

Keywords: Text mining, data mining, association rule mining

Downloads: 4373
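
Entry 788 uses TF-IDF to pick the most discriminative keywords before rule mining. The sketch below shows only that selection step with the textbook TF-IDF formula; it is not the GARW algorithm, and the three tiny "news documents" and the helper names `tfidf` and `top_keywords` are invented for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized, non-empty document."""
    n = len(docs)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({term: (count / len(doc)) * math.log(n / doc_freq[term])
                        for term, count in counts.items()})
    return weights


def top_keywords(doc_weights, k):
    """Pick the k highest-weighted terms, breaking ties alphabetically."""
    return sorted(doc_weights, key=lambda term: (-doc_weights[term], term))[:k]


# Hypothetical, already preprocessed documents.
docs = [
    "bird flu outbreak hits poultry farm".split(),
    "bird flu vaccine trial starts".split(),
    "poultry farm exports fall after outbreak".split(),
]
weights = tfidf(docs)
keywords = top_keywords(weights[0], 1)
```

Terms shared by several documents get a smaller inverse-document-frequency factor, so a term unique to one document, such as `hits` in the first document here, ranks highest for it.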

787 Word Stemming Algorithms and Retrieval Effectiveness in Malay and Arabic Documents Retrieval Systems

Authors: Tengku Mohd T. Sembok

Abstract:

Document retrieval in Information Retrieval Systems (IRS) is generally about understanding the information in the documents concerned. The better the system is able to understand the contents of documents, the more effective the retrieval outcomes will be. But understanding the contents is a very complex task. Conventional IRS apply algorithms that can only approximate the meaning of document contents through a keyword approach using the vector space model. Keywords may be unstemmed or stemmed. When keywords are stemmed and conflated in the retrieval process, we are a step closer to applying semantic technology in IRS. Word stemming is a process in morphological analysis under natural language processing, preceding syntactic and semantic analysis. We have developed stemming algorithms for Malay and Arabic and incorporated stemming in our experimental systems in order to measure retrieval effectiveness. The results have shown that retrieval effectiveness increases when stemming is used in the systems.

Keywords: Information Retrieval, Natural Language Processing, Artificial Intelligence.

Downloads: 2217

786 A Recognition Method of Ancient Yi Script Based on Deep Learning

Authors: Shanxiong Chen, Xu Han, Xiaolong Wang, Hui Ma

Abstract:

Yi is an ethnic group mainly living in mainland China, with its own spoken and written language systems, after development of thousands of years. Ancient Yi is one of the six ancient languages in the world, which keeps a record of the history of the Yi people and offers documents valuable for research into human civilization. Recognition of the characters in ancient Yi helps to transform the documents into an electronic form, making their storage and spreading convenient. Due to historical and regional limitations, research on recognition of ancient characters is still inadequate. Thus, deep learning technology was applied to the recognition of such characters. Five models were developed on the basis of the four-layer convolutional neural network (CNN). Alpha-Beta divergence was taken as a penalty term to re-encode output neurons of the five models. Two fully connected layers fulfilled the compression of the features. Finally, at the softmax layer, the orthographic features of ancient Yi characters were re-evaluated, their probability distributions were obtained, and characters with features of the highest probability were recognized. Tests conducted show that the method has achieved higher precision compared with the traditional CNN model for handwriting recognition of the ancient Yi.

Keywords: Recognition, CNN, convolutional neural network, Yi character, divergence.

Downloads: 681

785 Combining Color and Layout Features for the Identification of Low-resolution Documents

Authors: Ardhendu Behera, Denis Lalanne, Rolf Ingold

Abstract:

This paper proposes a method, combining color and layout features, for identifying documents captured from low-resolution handheld devices. On the one hand, the document image color density surface is estimated and represented with an equivalent ellipse; on the other hand, the document's shallow layout structure is computed and hierarchically represented. The combined color and layout features are arranged in a symbolic file, which is unique for each document and is called the document's visual signature. Our identification method first uses the color information in the signatures in order to focus the search space on documents having a similar color distribution, and then selects the document having the most similar layout structure in the remaining search space. Finally, our experiment considers slide documents, which are often captured using handheld devices.

Keywords: Document color modeling, document visual signature, kernel density estimation, document identification.

Downloads: 1329

784 Providing a Secure, Reliable and Decentralized Document Management Solution Using Blockchain by a Virtual Identity Card

Authors: Meet Shah, Ankita Aditya, Dhruv Bindra, V. S. Omkar, Aashruti Seervi

Abstract:

In today's world, we need documents everywhere for a smooth workflow in identification processes and other security-related procedures. The current systems and techniques used for identification need one thing, namely ‘proof of existence’, which involves valid documents, for example, educational, financial, etc. The main issue with the current identity access management system and digital identification process is that the system is centralized in its network, which makes it inefficient. The paper presents a system which resolves these issues. It is based on ‘blockchain’ technology, which is a 'decentralized system'. It allows transactions in a decentralized and immutable manner. The primary notion of the model is to ‘have everything with nothing’. It involves inter-linking the required documents of a person with a single identity card so that a person can go anywhere without carrying the required documents. The person just needs to be physically present at the place where documents are necessary, and the rest of the verification proceeds using a fingerprint impression and an iris scan. Furthermore, some technical overheads and advancements are listed. This paper also aims to lay out a far-vision scenario of blockchain and its impact on future trends.

Keywords: Blockchain, decentralized system, fingerprint impression, identity management, iris scan.

Downloads: 1214

783 Customers’ Intention to Use Electronic Payment System for Purchasing

Authors: Wanida Suwunniponth

Abstract:

The purpose of this research was to study how the characteristics of the business, website quality and trust affect the intention to use electronic payment systems for online purchasing. This survey research used a questionnaire as a tool to collect data from 300 customers who purchased products online and used an electronic payment system. Descriptive statistics and multiple regression analysis were used to analyze the data. The results revealed that customers had a good opinion of the characteristics of the business and website quality. However, they had a moderate opinion of trust and intention to repurchase. In addition, the characteristics of the business affected the purchase intention the most, followed by website quality and trust, with statistical significance at the 0.05 level. In particular, reputation, communication, information quality, perceived risk and word of mouth affected the intention to use the electronic payment system. In contrast, size, system quality and service quality did not affect the intention to use an electronic payment system.

Keywords: Electronic payment, intention, online purchasing, trust.

Downloads: 2309

782 DocPro: A Framework for Processing Semantic and Layout Information in Business Documents

Authors: Ming-Jen Huang, Chun-Fang Huang, Chiching Wei

Abstract:

With the recent advance of the deep neural network, we observe new applications of NLP (natural language processing) and CV (computer vision) powered by deep neural networks for processing business documents. However, creating a real-world document processing system needs to integrate several NLP and CV tasks, rather than treating them separately. There is a need to have a unified approach for processing documents containing textual and graphical elements with rich formats, diverse layout arrangement, and distinct semantics. In this paper, a framework that fulfills this unified approach is presented. The framework includes a representation model definition for holding the information generated by various tasks and specifications defining the coordination between these tasks. The framework is a blueprint for building a system that can process documents with rich formats, styles, and multiple types of elements. The flexible and lightweight design of the framework can help build a system for diverse business scenarios, such as contract monitoring and reviewing.

Keywords: Document processing, framework, formal definition, machine learning.

Downloads: 575

781 Effect of Spatially Correlated Disorder on Electronic Transport Properties of Aperiodic Superlattices (GaAs/AlxGa1-xAs)

Authors: F. Bendahma, S. Bentata, S. Cherid, A. Zitouni, S. Terkhi, T. Lantri, Y. Sefir, Z. F. Meghoufel

Abstract:

We examine the electronic transport properties in Al_xGa_1-xAs/GaAs superlattices. Using the transfer-matrix technique and the exact Airy function formalism, we investigate theoretically the effect of structural parameters on the electronic energy spectra of trimer thickness barrier (TTB). Our numerical calculations showed that the localization length of the states becomes more extended when the disorder is correlated (trimer case). We have also found that the resonant tunneling time (RTT) is of the order of several femtoseconds.

Keywords: Electronic transport properties, structural parameters, superlattice, transfer-matrix technique.

Downloads: 910

780 A Methodology for Automatic Diversification of Document Categories

Authors: Dasom Kim, Chen Liu, Myungsu Lim, Soo-Hyeon Jeon, Byeoung Kug Jeon, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, numerous documents including large volumes of unstructured data and text have been created because of the rapid increase in the use of social media and the Internet. Usually, these documents are categorized for the convenience of users. Because the accuracy of manual categorization is not guaranteed, and such categorization requires a large amount of time and incurs huge costs, many studies on automatic categorization have been conducted to help mitigate the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to categorize complex documents with multiple topics because they work on the assumption that individual documents can be categorized into single categories only. Therefore, to overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, the learning process employed in these studies involves training using a multi-categorized document set. These methods therefore cannot be applied to the multi-categorization of most documents unless multi-categorized training sets using traditional multi-categorization algorithms are provided. To overcome this limitation, in this study, we review our novel methodology for extending the category of a single-categorized document to multiple categories, and then introduce a survey-based verification scenario for estimating the accuracy of our automatic categorization methodology.

Keywords: Big Data Analysis, Document Classification, Text Mining, Topic Analysis.

Downloads: 1706

779 Skew Detection Technique for Binary Document Images based on Hough Transform

Authors: Manjunath Aradhya V N, Hemantha Kumar G, Shivakumara P

Abstract:

Document image processing has become an increasingly important technology in the automation of office documentation tasks. During document scanning, skew is inevitably introduced into the incoming document image. Since the algorithms for layout analysis and character recognition are generally very sensitive to page skew, skew detection and correction in document images are critical steps before layout analysis. In this paper, a novel skew detection method is presented for binary document images. The method considers some selected characters of the text, which may be subjected to thinning and the Hough transform, to estimate the skew angle accurately. Several experiments have been conducted on various types of documents, such as English documents, journals, textbooks, documents in different languages, documents with different fonts, and documents with different resolutions, to reveal the robustness of the proposed method. The experimental results revealed that the proposed method is accurate compared to the results of well-known existing methods.
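As a rough illustration of how a Hough transform can estimate a skew angle, the following is a minimal sketch and not the authors' method. It assumes the selected characters have already been reduced to a list of (x, y) points (for example, one point per character after thinning); the function name, the angle range, and the step sizes are hypothetical choices made for this example.

```python
import math


def estimate_skew_degrees(points, max_angle=15.0, angle_step=0.1, rho_step=1.0):
    """Estimate the skew angle (degrees) of near-horizontal text lines.

    `points` is an iterable of (x, y) pairs. A positive result means y
    increases with x. All parameter defaults are illustrative assumptions.
    """
    points = list(points)
    best_angle, best_score = 0.0, -1
    steps = int(round(2 * max_angle / angle_step))
    for i in range(steps + 1):
        angle = -max_angle + i * angle_step
        theta = math.radians(angle)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        # Points on a line inclined at `theta` share the same offset
        # rho = y*cos(theta) - x*sin(theta), so they vote for one bin.
        votes = {}
        for x, y in points:
            rho_bin = round((y * cos_t - x * sin_t) / rho_step)
            votes[rho_bin] = votes.get(rho_bin, 0) + 1
        # Sum of squared votes rewards angles that concentrate points
        # into a few sharply peaked bins (one per text line).
        score = sum(v * v for v in votes.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


# Synthetic usage example: three text baselines tilted by 5 degrees.
slope = math.tan(math.radians(5.0))
sample_points = [(x, offset + x * slope)
                 for offset in (0, 40, 80)
                 for x in range(0, 200, 5)]
estimated = estimate_skew_degrees(sample_points)
```

The resolution of the estimate is limited by `angle_step` and `rho_step`; a real pipeline would also need the character selection and thinning steps described in the abstract.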

Keywords: Optical Character Recognition, Skew angle, Thinning, Hough transform, Document processing

Downloads: 2053

778 Feature Selection Methods for an Improved SVM Classifier

Authors: Daniel Morariu, Lucian N. Vintan, Volker Tresp

Abstract:

Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. In this paper, three feature selection methods are evaluated: Random Selection, Information Gain (IG) and Support Vector Machine feature selection (called SVM_FS). We show that the best results were obtained with the SVM_FS method for a relatively small dimension of the feature vector. We also present a novel method to better correlate the SVM kernel's parameters (Polynomial or Gaussian kernel).
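To make the Information Gain criterion concrete, here is a minimal sketch of IG-based term selection using the standard entropy-reduction definition. It is not the authors' implementation: the function names, the representation of each document as a set of terms, and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(has_term, labels):
    """Entropy reduction obtained by splitting documents on one term's presence."""
    with_term = [y for present, y in zip(has_term, labels) if present]
    without_term = [y for present, y in zip(has_term, labels) if not present]
    total = len(labels)
    remainder = (len(with_term) / total) * entropy(with_term) \
        + (len(without_term) / total) * entropy(without_term)
    return entropy(labels) - remainder


def top_k_terms(doc_term_sets, labels, k):
    """Return the k terms with the highest information gain (ties broken by name)."""
    vocabulary = sorted(set().union(*doc_term_sets))
    scored = [(information_gain([term in doc for doc in doc_term_sets], labels), term)
              for term in vocabulary]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:k]]


# Hypothetical toy corpus: each document is the set of terms it contains.
docs = [{"goal", "match"}, {"goal", "team"}, {"vote", "party"}, {"vote", "team"}]
labels = ["sport", "sport", "politics", "politics"]
selected = top_k_terms(docs, labels, k=2)
```

In this toy corpus, "goal" and "vote" separate the two classes perfectly, while "team" appears once in each class and so carries no information about the label.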

Keywords: Feature Selection, Learning with Kernels, Support Vector Machine, Classification.

Downloads: 1775

777 Exploring the Narrative Communication: Representing Visual Information from Digital Travel Stories

Authors: Rocío Abascal-Mena, Erick López-Ornelas

Abstract:

We present the results of a case study aiming to assess the reflection of the tourism community on the Web and its usability for proposing new ways to communicate visually. The wealth of information contained on the Web and the ease with which personal points of view can be communicated make the social web a new space for exploration. In this way, the social web allows the sharing of information between communities with similar interests. However, the tourism community remains unexplored, as does the information contained in travel stories. Across the Web, we find multiple sites allowing users to communicate their experiences and personal points of view about a particular place in the world. This cultural heritage is found in multiple documents, usually sparsely supplemented with photos, so they are difficult to explore due to the lack of visual information. This paper explores the possibility of analyzing travel stories to display them visually on maps and generate new knowledge such as patterns of travel routes. In this way, travel narratives published in electronic formats can be very important, especially to the tourism community, because of the great amount of knowledge that can be extracted. Our approach is based on the use of a Geoparsing Web Service to extract geographic coordinates from travel narratives in order to draw the geo-positions and link the documents to a map image.

Keywords: Social web, tourism community, visual communication, travel stories, geo references.

Downloads: 1599

776 A Proposed Approach for Emotion Lexicon Enrichment

Authors: Amr Mansour Mohsen, Hesham Ahmed Hassan, Amira M. Idrees

Abstract:

Document analysis is an important research field that aims to gather information by analyzing the data in documents. Since an important goal in many fields is to understand what people actually want, sentimental analysis has become a vital field that is closely related to document analysis. This research focuses on analyzing text documents to classify each document according to its opinion. The aim of this research is to detect emotions in text documents by enriching the lexicon, adapting its content based on the extraction of semantic patterns. The proposed approach is presented, and experiments are conducted from different perspectives to reveal the positive impact of the proposed approach on the classification results.

Keywords: Document analysis, sentimental analysis, emotion detection, WEKA tool, NRC Lexicon.

Downloads: 1404

775 Evaluating the Effectiveness of Electronic Response Systems in Technology-Oriented Classes

Authors: Ahmad Salman

Abstract:

Electronic Response Systems such as Kahoot, Poll Everywhere, and Google Classroom are gaining a lot of popularity for surveying audiences in events, meetings, and classrooms. This is mainly because of the ease of use and convenience these tools bring, since they provide mobile applications with a simple user interface. In this paper, we present a case study on the effectiveness of using Electronic Response Systems on student participation and learning experience in a classroom. We use a polling application for class exercises in two different technology-oriented classes. We evaluate the effectiveness of the polling application through statistical analysis of the students' performance in these two classes and compare it to the performance of students who took the same classes without using the polling application for class participation. Our results show that the performance of the students who used the Electronic Response System was higher than that of those who did not by an average of 11%.

Keywords: Interactive learning, classroom technology, electronic response systems, polling applications, learning evaluation.

Downloads: 594

774 Evaluation of Electronic Payment Systems Using Fuzzy Multi-Criteria Decision Making Approach

Authors: Gülfem Alptekin, S. Emre Alptekin

Abstract:

Global competitiveness has recently become the biggest concern of both manufacturing and service companies. Electronic commerce, as a key technology, enables firms to reach potential consumers all over the world. In this study, we present commonly used electronic payment systems and then evaluate these systems with respect to different criteria. The payment systems included in this research are the credit card, the virtual credit card, electronic money, mobile payment, the credit transfer, and debit instruments. We carry out a systematic comparison of these systems with respect to three main criteria: technical, economic, and social. We conduct a fuzzy multi-criteria decision making procedure to deal with the multi-attribute nature of the problem. The subjectiveness and imprecision of the evaluation process are modeled using triangular fuzzy numbers.
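For readers unfamiliar with triangular fuzzy numbers, the sketch below shows the standard representation (low, mode, high), fuzzy addition, scaling by a crisp weight, and centroid defuzzification. It is a generic illustration, not the procedure used in the study: the class name, the choice of centroid defuzzification, and the example ratings are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFuzzyNumber:
    """A triangular fuzzy number with support [low, high] and peak at mode."""
    low: float
    mode: float
    high: float

    def __post_init__(self):
        if not (self.low <= self.mode <= self.high):
            raise ValueError("expected low <= mode <= high")

    def __add__(self, other):
        # Fuzzy addition is component-wise for triangular numbers.
        return TriangularFuzzyNumber(self.low + other.low,
                                     self.mode + other.mode,
                                     self.high + other.high)

    def scale(self, weight):
        """Multiply by a non-negative crisp weight."""
        if weight < 0:
            raise ValueError("weight must be non-negative")
        return TriangularFuzzyNumber(self.low * weight,
                                     self.mode * weight,
                                     self.high * weight)

    def membership(self, x):
        """Degree (0..1) to which the crisp value x belongs to this number."""
        if x == self.mode:
            return 1.0
        if self.low < x < self.mode:
            return (x - self.low) / (self.mode - self.low)
        if self.mode < x < self.high:
            return (self.high - x) / (self.high - self.mode)
        return 0.0

    def centroid(self):
        """Crisp value by centroid defuzzification (one common choice)."""
        return (self.low + self.mode + self.high) / 3.0


# Hypothetical ratings of one payment system by two evaluators.
rating_a = TriangularFuzzyNumber(1, 2, 3)
rating_b = TriangularFuzzyNumber(2, 4, 6)
average = (rating_a + rating_b).scale(0.5)
crisp_score = average.centroid()
```

Averaging the fuzzy ratings before defuzzifying keeps the spread of opinions visible in the intermediate result; only the final ranking step needs a crisp score.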

Keywords: Electronic payment systems, fuzzy multi-criteria decision making, analytical hierarchy process.

Downloads: 1887