Search results for: Nostratic languages
78 Influence and Dissemination of Solecism among Moroccan High School and University Students
Authors: Rachid Ed-Dali, Khalid Elasri
Abstract:
Mass media seem to provide a rich content for language acquisition. Exposure to television, the Internet, the mobile phone and other technological gadgets and devices helps enrich the student’s lexicon positively as well as negatively. The difficulties encountered by students while learning and acquiring second languages in addition to their eagerness to comprehend the content of a particular program prompt them to diversify their methods so as to achieve their targets. The present study highlights the significance of certain media channels and their involvement in language acquisition with the employment of the Natural Approach to further grasp whether students, especially secondary and high school students, learn and acquire errors through watching subtitled television programs. The chief objective is investigating the deductive and inductive relevance of certain programs beside the involvement of peripheral learning while acquiring mistakes.
Keywords: Errors, mistakes, natural Approach, peripheral learning, solecism.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 57177 Handwritten Character Recognition Using Multiscale Neural Network Training Technique
Authors: Velappa Ganapathy, Kok Leong Liew
Abstract:
Advancement in Artificial Intelligence has lead to the developments of various “smart" devices. Character recognition device is one of such smart devices that acquire partial human intelligence with the ability to capture and recognize various characters in different languages. Firstly multiscale neural training with modifications in the input training vectors is adopted in this paper to acquire its advantage in training higher resolution character images. Secondly selective thresholding using minimum distance technique is proposed to be used to increase the level of accuracy of character recognition. A simulator program (a GUI) is designed in such a way that the characters can be located on any spot on the blank paper in which the characters are written. The results show that such methods with moderate level of training epochs can produce accuracies of at least 85% and more for handwritten upper case English characters and numerals.Keywords: Character recognition, multiscale, backpropagation, neural network, minimum distance technique.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 192876 Abai Kunanbayev's Role in Enrichment of the Kazakh Language
Authors: Y.M. Paltore, B.N. Zhubatova, A.A. Mustafayeva
Abstract:
Abai Kunanbayev is famous for being enlightener, composer, interpreter, social agent, philosopher, reformer, who wanted to enrich Kazakh literature by emergence with Russian and European culture, and also as a founder of Kazakh written literary language. Abai Kunanbayev was born in 1845 in East Kazakhstan area and passed away in 1904 in his hometown. His oeuvre absorbed and reflected all changes in the life of Kazakh society of the second half of XIX century. Because ХІХ century, especially its second half, was an important transition period for Kazakhstan, which radically changed traditional way of Kazakh society and predetermined further development in consequence of activation of Russian colonial policy and approval of commodity-money relations in Steppe Land.Abai Kunanbayev, besides Arabic and Persian common words and loanwords from Quran in his words of edification, had used a lot of words of Arabic, Persian, Latin, Russian, Nogai, Shaghatai, Polish, Greek, Turkish, which are used in the Kazakh language.Keywords: Abai Kunanbayev, the Kazakh, Russian languages, literature
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 286475 The Code-Mixing of Japanese, English and Thai in Line Chat
Authors: Premvadee Na Nakornpanom
Abstract:
Code- mixing in spontaneous speech has been widely discussed, but not in virtual situations; especially in context of the third language learning students. Thus, this study is an attempt to explore the linguistic characteristics of the mixing of Japanese, English and Thai in a mobile Line chat room by students with their background of English as L2, Japanese as L3 and Thai as mother tongue. The result found that insertion of Thai content words is a very common linguistic phenomenon embedded with the other two languages in the sentences. As chatting is to be ‘relational’ or ‘interactional’, it affected the style of lexical choices to be speech-like, more personal and emotionally-related. A personal pronoun in Japanese is often mixed into the sentences. The Japanese sentence-final question particle か “ka” was added to the end of the sentence based on Thai grammar rules. Some unique characteristics were created while chatting.
Keywords: Code-mixing, Japanese, English, Thai, Line chat.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 344874 Recognition of Noisy Words Using the Time Delay Neural Networks Approach
Authors: Khenfer-Koummich Fatima, Mesbahi Larbi, Hendel Fatiha
Abstract:
This paper presents a recognition system for isolated words like robot commands. It’s carried out by Time Delay Neural Networks; TDNN. To teleoperate a robot for specific tasks as turn, close, etc… In industrial environment and taking into account the noise coming from the machine. The choice of TDNN is based on its generalization in terms of accuracy, in more it acts as a filter that allows the passage of certain desirable frequency characteristics of speech; the goal is to determine the parameters of this filter for making an adaptable system to the variability of speech signal and to noise especially, for this the back propagation technique was used in learning phase. The approach was applied on commands pronounced in two languages separately: The French and Arabic. The results for two test bases of 300 spoken words for each one are 87%, 97.6% in neutral environment and 77.67%, 92.67% when the white Gaussian noisy was added with a SNR of 35 dB.
Keywords: Neural networks, Noise, Speech Recognition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 193673 Distillation Monitoring and Control using LabVIEW and SIMULINK Tools
Authors: J. Fernandez de Canete, P. Del Saz Orozco, S. Gonzalez-Perez
Abstract:
LabVIEW and SIMULINK are two most widely used graphical programming environments for designing digital signal processing and control systems. Unlike conventional text-based programming languages such as C, Cµ and MATLAB, graphical programming involves block-based code developments, allowing a more efficient mechanism to build and analyze control systems. In this paper a LabVIEW environment has been employed as a graphical user interface for monitoring the operation of a controlled distillation column, by visualizing both the closed loop performance and the user selected control conditions, while the column dynamics has been modeled under the SIMULINK environment. This tool has been applied to the PID based decoupled control of a binary distillation column. By means of such integrated environments the control designer is able to monitor and control the plant behavior and optimize the response when both, the quality improvement of distillation products and the operation efficiency tasks, are considered.Keywords: Distillation control, software tools, SIMULINKLabVIEWinterface.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 381672 Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques
Authors: Sannikumar Patel, Brian Nolan, Markus Hofmann, Philip Owende, Kunjan Patel
Abstract:
Sentiment analysis and opinion mining have become emerging topics of research in recent years but most of the work is focused on data in the English language. A comprehensive research and analysis are essential which considers multiple languages, machine translation techniques, and different classifiers. This paper presents, a comparative analysis of different approaches for multilingual sentiment analysis. These approaches are divided into two parts: one using classification of text without language translation and second using the translation of testing data to a target language, such as English, before classification. The presented research and results are useful for understanding whether machine translation should be used for multilingual sentiment analysis or building language specific sentiment classification systems is a better approach. The effects of language translation techniques, features, and accuracy of various classifiers for multilingual sentiment analysis is also discussed in this study.
Keywords: Cross-language analysis, machine learning, machine translation, sentiment analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 166571 A Collaborative Platform for Multilingual Ontology Development
Authors: Ahmed Tawfik, Fausto Giunchiglia, Vincenzo Maltese
Abstract:
Ontologies provide a common understanding of a specific domain of interest that can be communicated between people and used as background knowledge for automated reasoning in a wide range of applications. In this paper, we address the design of multilingual ontologies following well-defined knowledge engineering methodologies with the support of novel collaborative development approaches. In particular, we present a collaborative platform which allows ontologies to be developed incrementally in multiple languages. This is made possible via an appropriate mapping between language independent concepts and one lexicalization per language (or a lexical gap in case such lexicalization does not exist). The collaborative platform has been designed to support the development of the Universal Knowledge Core, a multilingual ontology currently in English, Italian, Chinese, Mongolian, Hindi and Bangladeshi. Its design follows a workflow-based development methodology that models resources as a set of collaborative objects and assigns customizable workflows to build and maintain each collaborative object in a community driven manner, with extensive support of modern web 2.0 social and collaborative features.
Keywords: Knowledge Diversity, Knowledge Representation, Ontology Development.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 220470 Sounds Alike Name Matching for Myanmar Language
Authors: Yuzana, Khin Marlar Tun
Abstract:
Personal name matching system is the core of essential task in national citizen database, text and web mining, information retrieval, online library system, e-commerce and record linkage system. It has necessitated to the all embracing research in the vicinity of name matching. Traditional name matching methods are suitable for English and other Latin based language. Asian languages which have no word boundary such as Myanmar language still requires sounds alike matching system in Unicode based application. Hence we proposed matching algorithm to get analogous sounds alike (phonetic) pattern that is convenient for Myanmar character spelling. According to the nature of Myanmar character, we consider for word boundary fragmentation, collation of character. Thus we use pattern conversion algorithm which fabricates words in pattern with fragmented and collated. We create the Myanmar sounds alike phonetic group to help in the phonetic matching. The experimental results show that fragmentation accuracy in 99.32% and processing time in 1.72 ms.Keywords: natural language processing, name matching, phonetic matching
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 179869 The Effect of Culture on User Interface Design of Social Media - A Case Study on Preferences of Saudi Arabians on the Arabic User Interface of Facebook
Authors: Hana Almakky, Reza Sahandi, Jacqui Taylor
Abstract:
Social media continues to grow, and user interfaces may become more appealing if cultural characteristics are incorporated into their design. Facebook was designed in the west, and the original language was English. Subsequently, the words in the user interface were translated to other languages, including Arabic. Arabic words are written from right to left, and English is written from left to right. The translated version may misrepresent the original design and users’ preferences may be influenced by their culture, which should be considered in the user interface design. Previous research indicates that users are more comfortable when interacting with a user interface, which relates to their own culture. Therefore, this paper, using a survey, investigates the preferences of Saudi Arabians on the Arabic version of the user interface of Facebook.
Keywords: Culture, Facebook, Saudi Arabia, Social media, User Interface Design.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 365568 Studies on Determination of the Optimum Distance Between the Tmotes for Optimum Data Transfer in a Network with WLL Capability
Authors: N C Santhosh Kumar, N K Kishore
Abstract:
Using mini modules of Tmotes, it is possible to automate a small personal area network. This idea can be extended to large networks too by implementing multi-hop routing. Linking the various Tmotes using Programming languages like Nesc, Java and having transmitter and receiver sections, a network can be monitored. It is foreseen that, depending on the application, a long range at a low data transfer rate or average throughput may be an acceptable trade-off. To reduce the overall costs involved, an optimum number of Tmotes to be used under various conditions (Indoor/Outdoor) is to be deduced. By analyzing the data rates or throughputs at various locations of Tmotes, it is possible to deduce an optimal number of Tmotes for a specific network. This paper deals with the determination of optimum distances to reduce the cost and increase the reliability of the entire sensor network with Wireless Local Loop (WLL) capability.
Keywords: Average throughput, data rate, multi-hop routing, optimum data transfer, throughput, Tmotes, wireless local loop.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 136667 Multi Language Text Editor for Burushaski and Urdu through Unicode
Authors: Irfan Qadir Baig, Muhammad Sharif, Aman Ullah Khan
Abstract:
This paper introduces an isolated and unique ancient language Burushaski, spoken in Hunza, Nagar, Yasin and parts of Gilgit in the Northern Areas of Pakistan. It explains the working mechanism of Multi Language Text Editor for Urdu and Burushaski. It is developed under the use of ISO/IEC 10646 Unicode standards for Urdu and Burushaski open-type fonts. It gives an ample opportunity to this regional ancient language to have a modern Information technology for its promotion and preservation. The main objective of this research paper is to help preserve the heritage of such rare languages and give smart way of automation. It also facilitates to those who are interested in undertaking research on Burushaski or keen to trace fonatic relationship between the national Urdu language and Burushaski. Since this editor covers both Burushaski and Urdu so it can play an important role to introduce Burusho linguistic culture to the world at large. Precisely, as a result of this research paper, Burushaski publication through IT means would be possible.Keywords: Burushaski, Bri Naqsh, Unicode, Burusho, Hunza, Meshaski.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 210766 Emotional Analysis for Text Search Queries on Internet
Authors: Gemma García López
Abstract:
The goal of this study is to analyze if search queries carried out in search engines such as Google, can offer emotional information about the user that performs them. Knowing the emotional state in which the Internet user is located can be a key to achieve the maximum personalization of content and the detection of worrying behaviors. For this, two studies were carried out using tools with advanced natural language processing techniques. The first study determines if a query can be classified as positive, negative or neutral, while the second study extracts emotional content from words and applies the categorical and dimensional models for the representation of emotions. In addition, we use search queries in Spanish and English to establish similarities and differences between two languages. The results revealed that text search queries performed by users on the Internet can be classified emotionally. This allows us to better understand the emotional state of the user at the time of the search, which could involve adapting the technology and personalizing the responses to different emotional states.Keywords: Emotion classification, text search queries, emotional analysis, sentiment analysis in text, natural language processing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 71365 Extending the Aspect Oriented Programming Joinpoint Model for Memory and Type Safety
Authors: Amjad Nusayr
Abstract:
Software security is a general term used to any type of software architecture or model in which security aspects are incorporated in this architecture. These aspects are not part of the main logic of the underlying program. Software security can be achieved using a combination of approaches including but not limited to secure software designs, third part component validation, and secure coding practices. Memory safety is one feature in software security where we ensure that any object in memory is have a valid pointer or a reference with a valid type. Aspect Oriented Programming (AOP) is a paradigm that is concerned with capturing the cross-cutting concerns in code development. AOP is generally used for common cross-cutting concerns like logging and Database transaction managing. In this paper we introduce the concepts that enable AOP to be used for the purpose of memory and type safety. We also present ideas for extending AOP in software security practices.
Keywords: Aspect oriented programming, programming languages, software security, memory and type safety.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 41564 Bottom Up Text Mining through Hierarchical Document Representation
Authors: Y. Djouadi., F. Souam.
Abstract:
Most of the existing text mining approaches are proposed, keeping in mind, transaction databases model. Thus, the mined dataset is structured using just one concept: the “transaction", whereas the whole dataset is modeled using the “set" abstract type. In such cases, the structure of the whole dataset and the relationships among the transactions themselves are not modeled and consequently, not considered in the mining process. We believe that taking into account structure properties of hierarchically structured information (e.g. textual document, etc ...) in the mining process, can leads to best results. For this purpose, an hierarchical associations rule mining approach for textual documents is proposed in this paper and the classical set-oriented mining approach is reconsidered profits to a Direct Acyclic Graph (DAG) oriented approach. Natural languages processing techniques are used in order to obtain the DAG structure. Based on this graph model, an hierarchical bottom up algorithm is proposed. The main idea is that each node is mined with its parent node.Keywords: Graph based association rules mining, Hierarchical document structure, Text mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 205763 Development of Innovative Islamic Web Applications
Authors: Farrukh Shahzad
Abstract:
The rich Islamic resources related to religious text, Islamic sciences, and history are widely available in print and in electronic format online. However, most of these works are only available in Arabic language. In this research, an attempt is made to utilize these resources to create interactive web applications in Arabic, English and other languages. The system utilizes the Pattern Recognition, Knowledge Management, Data Mining, Information Retrieval and Management, Indexing, storage and data-analysis techniques to parse, store, convert and manage the information from authentic Arabic resources. These interactive web Apps provide smart multi-lingual search, tree based search, on-demand information matching and linking. In this paper, we provide details of application architecture, design, implementation and technologies employed. We also presented the summary of web applications already developed. We have also included some screen shots from the corresponding web sites. These web applications provide an Innovative On-line Learning Systems (eLearning and computer based education).Keywords: Islamic resources, Muslim scholars, hadith, narrators, history, fiqh.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 130262 Enhancing Operational Effectiveness in the Norwegian Army through Simulation-Based Training
Abstract:
The Norwegian Military Academy (Army) has initiated a project with the main ambition to explore possible avenues to enhancing operational effectiveness through an increased use of simulation-based training and exercises. Within a cost/benefit framework, we discuss opportunities and limitations of vertical and horizontal integration of the existing tactical training system. Vertical integration implies expanding the existing training system to span the full range of training from tactical level (platoon, company) to command and staff level (battalion, brigade). Horizontal integration means including other domains than army tactics and staff procedures in the training, such as military ethics, foreign languages, leadership and decision making. We discuss each of the integration options with respect to purpose and content of training, "best practice" for organising and conducting simulation-based training, and suggest how to evaluate training procedures and measure learning outcomes. We conclude by giving guidelines towards further explorative work and possible implementation.Keywords: Effectiveness, integration, simulation, training.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 143361 Ultra High Speed Approach for Document Skew Detection and Correction Based On Centre of Gravity
Authors: Seyyed Yasser Hashemi
Abstract:
Skew detection and correction (SDC) has a direct effect in efficiency and exactitude of documents’ segmentation and analysis and thus is considered as a very important step in documents’ analysis field. Skew is a major problem in documents’ analysis for every language. For Arabic/Persian document scripts this problem is more severe because of special features of these languages. In this paper an efficient and fast algorithm for Document Skew Detection (DSD) based on the concept of segmentation and Center of Gravity (COG) is proposed. This algorithm is examined for 150 Arabic/Persian and English documents and SDC process are done successfully for 93 percent of documents with error rate of less than 1°. This algorithm shows better results for English documents compared to Arabic/Persian documents. The proposed method is also represents favorable results for handwritten, printed and also complicated documents such as newspapers and journals even with very low quality and resolution.
Keywords: Arabic/Persian document, Baseline, Centre of gravity, Document segmentation, Skew detection and correction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 191160 Computer Proven Correctness of the Rabin Public-Key Scheme
Authors: Johannes Buchmann, Markus Kaiser
Abstract:
We decribe a formal specification and verification of the Rabin public-key scheme in the formal proof system Is-abelle/HOL. The idea is to use the two views of cryptographic verification: the computational approach relying on the vocabulary of probability theory and complexity theory and the formal approach based on ideas and techniques from logic and programming languages. The analysis presented uses a given database to prove formal properties of our implemented functions with computer support. Thema in task in designing a practical formalization of correctness as well as security properties is to cope with the complexity of cryptographic proving. We reduce this complexity by exploring a light-weight formalization that enables both appropriate formal definitions as well as eficient formal proofs. This yields the first computer-proved implementation of the Rabin public-key scheme in Isabelle/HOL. Consequently, we get reliable proofs with a minimal error rate augmenting the used database. This provides a formal basis for more computer proof constructions in this area.Keywords: public-key encryption, Rabin public-key scheme, formalproof system, higher-order logic, formal verification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 159159 Bibliometric Analysis of the Research Progress on Graphene Inks from 2008 to 2018
Authors: Jean C. A. Sousa, Julio Cesar Maciel Santos, Andressa J. Rubio, Edneia A. S. Paccola, Natália U. Yamaguchi
Abstract:
A bibliometric analysis in the Web of Science database was used to identify overall scientific results of graphene inks to date (2008 to 2018). The objective of this study was to evaluate the evolutionary tendency of graphene inks research and to identify its aspects, aiming to provide data that can guide future work. The contributions of different researches, languages, thematic categories, periodicals, place of publication, institutes, funding agencies, articles cited and applications were analyzed. The results revealed a growing number of annual publications, of 258 papers found, 107 were included because they met the inclusion criteria. Three main applications were identified: synthesis and characterization, electronics and surfaces. The most relevant research on graphene inks has been summarized in this article, and graphene inks for electronic devices presented the most incident theme according to the research trends during the studied period. It is estimated that this theme will remain in evidence and will contribute to the direction of future research in this area.
Keywords: Bibliometric, coating, nanomaterials, scientometrics.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 100858 OCR for Script Identification of Hindi (Devnagari) Numerals using Feature Sub Selection by Means of End-Point with Neuro-Memetic Model
Authors: Banashree N. P., R. Vasanta
Abstract:
Recognition of Indian languages scripts is challenging problems. In Optical Character Recognition [OCR], a character or symbol to be recognized can be machine printed or handwritten characters/numerals. There are several approaches that deal with problem of recognition of numerals/character depending on the type of feature extracted and different way of extracting them. This paper proposes a recognition scheme for handwritten Hindi (devnagiri) numerals; most admired one in Indian subcontinent. Our work focused on a technique in feature extraction i.e. global based approach using end-points information, which is extracted from images of isolated numerals. These feature vectors are fed to neuro-memetic model [18] that has been trained to recognize a Hindi numeral. The archetype of system has been tested on varieties of image of numerals. . In proposed scheme data sets are fed to neuro-memetic algorithm, which identifies the rule with highest fitness value of nearly 100 % & template associates with this rule is nothing but identified numerals. Experimentation result shows that recognition rate is 92-97 % compared to other models.Keywords: OCR, Global Feature, End-Points, Neuro-Memetic model.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 175957 Feature-Based Summarizing and Ranking from Customer Reviews
Authors: Dim En Nyaung, Thin Lai Lai Thein
Abstract:
Due to the rapid increase of Internet, web opinion sources dynamically emerge which is useful for both potential customers and product manufacturers for prediction and decision purposes. These are the user generated contents written in natural languages and are unstructured-free-texts scheme. Therefore, opinion mining techniques become popular to automatically process customer reviews for extracting product features and user opinions expressed over them. Since customer reviews may contain both opinionated and factual sentences, a supervised machine learning technique applies for subjectivity classification to improve the mining performance. In this paper, we dedicate our work is the task of opinion summarization. Therefore, product feature and opinion extraction is critical to opinion summarization, because its effectiveness significantly affects the identification of semantic relationships. The polarity and numeric score of all the features are determined by Senti-WordNet Lexicon. The problem of opinion summarization refers how to relate the opinion words with respect to a certain feature. Probabilistic based model of supervised learning will improve the result that is more flexible and effective.
Keywords: Opinion Mining, Opinion Summarization, Sentiment Analysis, Text Mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 293356 Change Management in Business Process Modeling Based on Object Oriented Petri Net
Authors: Bassam Atieh Rajabi, Sai Peck Lee
Abstract:
Business Process Modeling (BPM) is the first and most important step in business process management lifecycle. Graph based formalism and rule based formalism are the two most predominant formalisms on which process modeling languages are developed. BPM technology continues to face challenges in coping with dynamic business environments where requirements and goals are constantly changing at the execution time. Graph based formalisms incur problems to react to dynamic changes in Business Process (BP) at the runtime instances. In this research, an adaptive and flexible framework based on the integration between Object Oriented diagramming technique and Petri Net modeling language is proposed in order to support change management techniques for BPM and increase the representation capability for Object Oriented modeling for the dynamic changes in the runtime instances. The proposed framework is applied in a higher education environment to achieve flexible, updatable and dynamic BP.Keywords: Business Process Modeling, Change Management, Graph Based Modeling, Rule Based Modeling, Object Oriented PetriNet.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 203855 Powerful Tool to Expand Business Intelligence: Text Mining
Authors: Li Gao, Elizabeth Chang, Song Han
Abstract:
With the extensive inclusion of document, especially text, in the business systems, data mining does not cover the full scope of Business Intelligence. Data mining cannot deliver its impact on extracting useful details from the large collection of unstructured and semi-structured written materials based on natural languages. The most pressing issue is to draw the potential business intelligence from text. In order to gain competitive advantages for the business, it is necessary to develop the new powerful tool, text mining, to expand the scope of business intelligence. In this paper, we will work out the strong points of text mining in extracting business intelligence from huge amount of textual information sources within business systems. We will apply text mining to each stage of Business Intelligence systems to prove that text mining is the powerful tool to expand the scope of BI. After reviewing basic definitions and some related technologies, we will discuss the relationship and the benefits of these to text mining. Some examples and applications of text mining will also be given. The motivation behind is to develop new approach to effective and efficient textual information analysis. Thus we can expand the scope of Business Intelligence using the powerful tool, text mining.Keywords: Business intelligence, document warehouse, text mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 266054 Skew Detection Technique for Binary Document Images based on Hough Transform
Authors: Manjunath Aradhya V N, Hemantha Kumar G, Shivakumara P
Abstract:
Document image processing has become an increasingly important technology in the automation of office documentation tasks. During document scanning, skew is inevitably introduced into the incoming document image. Since the algorithm for layout analysis and character recognition are generally very sensitive to the page skew. Hence, skew detection and correction in document images are the critical steps before layout analysis. In this paper, a novel skew detection method is presented for binary document images. The method considered the some selected characters of the text which may be subjected to thinning and Hough transform to estimate skew angle accurately. Several experiments have been conducted on various types of documents such as documents containing English Documents, Journals, Text-Book, Different Languages and Document with different fonts, Documents with different resolutions, to reveal the robustness of the proposed method. The experimental results revealed that the proposed method is accurate compared to the results of well-known existing methods.Keywords: Optical Character Recognition, Skew angle, Thinning, Hough transform, Document processing
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 209553 Humanoid Personalized Avatar Through Multiple Natural Language Processing
Authors: Jin Hou, Xia Wang, Fang Xu, Viet Dung Nguyen, Ling Wu
Abstract:
There has been a growing interest in implementing humanoid avatars in networked virtual environment. However, most existing avatar communication systems do not take avatars- social backgrounds into consideration. This paper proposes a novel humanoid avatar animation system to represent personalities and facial emotions of avatars based on culture, profession, mood, age, taste, and so forth. We extract semantic keywords from the input text through natural language processing, and then the animations of personalized avatars are retrieved and displayed according to the order of the keywords. Our primary work is focused on giving avatars runtime instruction from multiple natural languages. Experiments with Chinese, Japanese and English input based on the prototype show that interactive avatar animations can be displayed in real time and be made available online. This system provides a more natural and interesting means of human communication, and therefore is expected to be used for cross-cultural communication, multiuser online games, and other entertainment applications.
Keywords: personalized avatar, mutiple natural luanguage processing, social backgrounds, anmimation, human computer interaction
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 197052 BugCatcher.Net: Detecting Bugs and Proposing Corrective Solutions
Authors: Sheetal Chavan, P. J. Kulkarni, Vivek Shanbhag
Abstract:
Although achieving zero-defect software release is practically impossible, software industries should take maximum care to detect defects/bugs well ahead in time allowing only bare minimums to creep into released version. This is a clear indicator of time playing an important role in the bug detection. In addition to this, software quality is the major factor in software engineering process. Moreover, early detection can be achieved only through static code analysis as opposed to conventional testing. BugCatcher.Net is a static analysis tool, which detects bugs in .NET® languages through MSIL (Microsoft Intermediate Language) inspection. The tool utilizes a Parser based on Finite State Automata to carry out bug detection. After being detected, bugs need to be corrected immediately. BugCatcher.Net facilitates correction, by proposing a corrective solution for reported warnings/bugs to end users with minimum side effects. Moreover, the tool is also capable of analyzing the bug trend of a program under inspection.Keywords: Dependence, Early solution, Finite State Automata, Grammar, Late solution, Parser State Transition Diagram, StaticProgram Analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 151051 An Ontology Based Question Answering System on Software Test Document Domain
Authors: Meltem Serhatli, Ferda N. Alpaslan
Abstract:
Processing the data by computers and performing reasoning tasks is an important aim in Computer Science. Semantic Web is one step towards it. The use of ontologies to enhance the information by semantically is the current trend. Huge amount of domain specific, unstructured on-line data needs to be expressed in machine understandable and semantically searchable format. Currently users are often forced to search manually in the results returned by the keyword-based search services. They also want to use their native languages to express what they search. In this paper, an ontology-based automated question answering system on software test documents domain is presented. The system allows users to enter a question about the domain by means of natural language and returns exact answer of the questions. Conversion of the natural language question into the ontology based query is the challenging part of the system. To be able to achieve this, a new algorithm regarding free text to ontology based search engine query conversion is proposed. The algorithm is based on investigation of suitable question type and parsing the words of the question sentence.Keywords: Description Logics, ontology, question answering, reasoning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 214950 OCR for Script Identification of Hindi (Devnagari) Numerals using Error Diffusion Halftoning Algorithm with Neural Classifier
Authors: Banashree N. P., Andhe Dharani, R. Vasanta, P. S. Satyanarayana
Abstract:
The applications on numbers are across-the-board that there is much scope for study. The chic of writing numbers is diverse and comes in a variety of form, size and fonts. Identification of Indian languages scripts is challenging problems. In Optical Character Recognition [OCR], machine printed or handwritten characters/numerals are recognized. There are plentiful approaches that deal with problem of detection of numerals/character depending on the sort of feature extracted and different way of extracting them. This paper proposes a recognition scheme for handwritten Hindi (devnagiri) numerals; most admired one in Indian subcontinent our work focused on a technique in feature extraction i.e. Local-based approach, a method using 16-segment display concept, which is extracted from halftoned images & Binary images of isolated numerals. These feature vectors are fed to neural classifier model that has been trained to recognize a Hindi numeral. The archetype of system has been tested on varieties of image of numerals. Experimentation result shows that recognition rate of halftoned images is 98 % compared to binary images (95%).
Keywords: OCR, Halftoning, Neural classifier, 16-segmentdisplay concept.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 171649 A Recognition Method of Ancient Yi Script Based on Deep Learning
Authors: Shanxiong Chen, Xu Han, Xiaolong Wang, Hui Ma
Abstract:
Yi is an ethnic group mainly living in mainland China, with its own spoken and written language systems, after development of thousands of years. Ancient Yi is one of the six ancient languages in the world, which keeps a record of the history of the Yi people and offers documents valuable for research into human civilization. Recognition of the characters in ancient Yi helps to transform the documents into an electronic form, making their storage and spreading convenient. Due to historical and regional limitations, research on recognition of ancient characters is still inadequate. Thus, deep learning technology was applied to the recognition of such characters. Five models were developed on the basis of the four-layer convolutional neural network (CNN). Alpha-Beta divergence was taken as a penalty term to re-encode output neurons of the five models. Two fully connected layers fulfilled the compression of the features. Finally, at the softmax layer, the orthographic features of ancient Yi characters were re-evaluated, their probability distributions were obtained, and characters with features of the highest probability were recognized. Tests conducted show that the method has achieved higher precision compared with the traditional CNN model for handwriting recognition of the ancient Yi.
Keywords: Recognition, CNN, convolutional neural network, Yi character, divergence.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 747