Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 45

natural language processing Related Abstracts

45 Implementing a Database from a Requirement Specification

Authors: M. Omer, D. Wilson

Abstract:

Creating a database scheme is essentially a manual process. From a requirement specification, the information contained within has to be analyzed and reduced into a set of tables, attributes and relationships. This is a time-consuming process that has to go through several stages before an acceptable database schema is achieved. The purpose of this paper is to implement a Natural Language Processing (NLP) based tool to produce a from a requirement specification. The Stanford CoreNLP version 3.3.1 and the Java programming were used to implement the proposed model. The outcome of this study indicates that the first draft of a relational database schema can be extracted from a requirement specification by using NLP tools and techniques with minimum user intervention. Therefore, this method is a step forward in finding a solution that requires little or no user intervention.

Keywords: natural language processing, information extraction, relation extraction

Procedia PDF Downloads 97
44 Text Analysis to Support Structuring and Modelling a Public Policy Problem-Outline of an Algorithm to Extract Inferences from Textual Data

Authors: Osama Ibrahim, Claudia Ehrentraut, Hercules Dalianis

Abstract:

Policy making situations are real-world problems that exhibit complexity in that they are composed of many interrelated problems and issues. To be effective, policies must holistically address the complexity of the situation rather than propose solutions to single problems. Formulating and understanding the situation and its complex dynamics, therefore, is a key to finding holistic solutions. Analysis of text based information on the policy problem, using Natural Language Processing (NLP) and Text analysis techniques, can support modelling of public policy problem situations in a more objective way based on domain experts knowledge and scientific evidence. The objective behind this study is to support modelling of public policy problem situations, using text analysis of verbal descriptions of the problem. We propose a formal methodology for analysis of qualitative data from multiple information sources on a policy problem to construct a causal diagram of the problem. The analysis process aims at identifying key variables, linking them by cause-effect relationships and mapping that structure into a graphical representation that is adequate for designing action alternatives, i.e., policy options. This study describes the outline of an algorithm used to automate the initial step of a larger methodological approach, which is so far done manually. In this initial step, inferences about key variables and their interrelationships are extracted from textual data to support a better problem structuring. A small prototype for this step is also presented.

Keywords: Public Policy, Algorithm, natural language processing, Qualitative Analysis, Problem Structuring, inference extraction

Procedia PDF Downloads 450
43 Role of Natural Language Processing in Information Retrieval; Challenges and Opportunities

Authors: Khaled M. Alhawiti

Abstract:

This paper aims to analyze the role of natural language processing (NLP). The paper will discuss the role in the context of automated data retrieval, automated question answer, and text structuring. NLP techniques are gaining wider acceptance in real life applications and industrial concerns. There are various complexities involved in processing the text of natural language that could satisfy the need of decision makers. This paper begins with the description of the qualities of NLP practices. The paper then focuses on the challenges in natural language processing. The paper also discusses major techniques of NLP. The last section describes opportunities and challenges for future research.

Keywords: Information Retrieval, natural language processing, data retrieval, text structuring

Procedia PDF Downloads 177
42 Application of Natural Language Processing in Education

Authors: Khaled M. Alhawiti

Abstract:

Reading capability is a major segment of language competency. On the other hand, discovering topical writings at a fitting level for outside and second language learners is a test for educators. We address this issue utilizing natural language preparing innovation to survey reading level and streamline content. In the connection of outside and second-language learning, existing measures of reading level are not appropriate to this errand. Related work has demonstrated the profit of utilizing measurable language preparing procedures; we expand these thoughts and incorporate other potential peculiarities to measure intelligibility. In the first piece of this examination, we join characteristics from measurable language models, customary reading level measures and other language preparing apparatuses to deliver a finer technique for recognizing reading level. We examine the execution of human annotators and assess results for our finders concerning human appraisals. A key commitment is that our identifiers are trainable; with preparing and test information from the same space, our finders beat more general reading level instruments (Flesch-Kincaid and Lexile). Trainability will permit execution to be tuned to address the needs of specific gatherings or understudies.

Keywords: Education, natural language processing, trainability, syntactic simplification tools

Procedia PDF Downloads 364
41 Impact of Natural Language Processing in Educational Setting: An Effective Approach towards Improved Learning

Authors: Khaled M. Alhawiti

Abstract:

Natural Language Processing (NLP) is an effective approach for bringing improvement in educational setting. This involves initiating the process of learning through the natural acquisition in the educational systems. It is based on following effective approaches for providing the solution for various problems and issues in education. Natural Language Processing provides solution in a variety of different fields associated with the social and cultural context of language learning. It is based on involving various tools and techniques such as grammar, syntax, and structure of text. It is effective approach for teachers, students, authors, and educators for providing assistance for writing, analysis, and assessment procedure. Natural Language Processing is widely integrated in the large number of educational contexts such as research, science, linguistics, e-learning, evaluations system, and various other educational settings such as schools, higher education system, and universities. Natural Language Processing is based on applying scientific approach in the educational settings. In the educational settings, NLP is an effective approach to ensure that students can learn easily in the same way as they acquired language in the natural settings.

Keywords: Education, e-Learning, natural language processing, Application, educational system, scientific studies

Procedia PDF Downloads 353
40 An Overview of the SIAFIM Connected Resources

Authors: Angela Ioniţâ, Tiberiu Boros, Maria Visan

Abstract:

Wildfires are one of the frequent and uncontrollable phenomena that currently affect large areas of the world where the climate, geographic and social conditions make it impossible to prevent and control such events. In this paper we introduce the ground concepts that lie behind the SIAFIM (Satellite Image Analysis for Fire Monitoring) project in order to create a context and we introduce a set of newly created tools that are external to the project but inherently in interventions and complex decision making based on geospatial information and spatial data infrastructures.

Keywords: Communication, Mobile Applications, natural language processing, GPS, wildfire, forest fire

Procedia PDF Downloads 425
39 Tracking and Classifying Client Interactions with Personal Coaches

Authors: Kartik Thakore, Anna-Roza Tamas, Adam Cole

Abstract:

The world health organization (WHO) reports that by 2030 more than 23.7 million deaths annually will be caused by Cardiovascular Diseases (CVDs); with a 2008 economic impact of $3.76 T. Metabolic syndrome is a disorder of multiple metabolic risk factors strongly indicated in the development of cardiovascular diseases. Guided lifestyle intervention driven by live coaching has been shown to have a positive impact on metabolic risk factors. Individuals’ path to improved (decreased) metabolic risk factors are driven by personal motivation and personalized messages delivered by coaches and augmented by technology. Using interactions captured between 400 individuals and 3 coaches over a program period of 500 days, a preliminary model was designed. A novel real time event tracking system was created to track and classify clients based on their genetic profile, baseline questionnaires and usage of a mobile application with live coaching sessions. Classification of clients and coaches was done using a support vector machines application build on Apache Spark, Stanford Natural Language Processing Library (SNLPL) and decision-modeling.

Keywords: natural language processing, guided lifestyle intervention, metabolic risk factors, personal coaching, support vector machines application, Apache Spark

Procedia PDF Downloads 315
38 Natural Language Processing; the Future of Clinical Record Management

Authors: Khaled M. Alhawiti

Abstract:

This paper investigates the future of medicine and the use of Natural language processing. The importance of having correct clinical information available online is remarkable; improving patient care at affordable costs could be achieved using automated applications to use the online clinical information. The major challenge towards the retrieval of such vital information is to have it appropriately coded. Majority of the online patient reports are not found to be coded and not accessible as its recorded in natural language text. The use of Natural Language processing provides a feasible solution by retrieving and organizing clinical information, available in text and transforming clinical data that is available for use. Systems used in NLP are rather complex to construct, as they entail considerable knowledge, however significant development has been made. Newly formed NLP systems have been tested and have established performance that is promising and considered as practical clinical applications.

Keywords: Information Retrieval, natural language processing, clinical information, automated applications

Procedia PDF Downloads 232
37 Detecting Paraphrases in Arabic Text

Authors: Amal Alshahrani, Allan Ramsay

Abstract:

Paraphrasing is one of the important tasks in natural language processing; i.e. alternative ways to express the same concept by using different words or phrases. Paraphrases can be used in many natural language applications, such as Information Retrieval, Machine Translation, Question Answering, Text Summarization, or Information Extraction. To obtain pairs of sentences that are paraphrases we create a system that automatically extracts paraphrases from a corpus, which is built from different sources of news article since these are likely to contain paraphrases when they report the same event on the same day. There are existing simple standard approaches (e.g. TF-IDF vector space, cosine similarity) and alignment technique (e.g. Dynamic Time Warping (DTW)) for extracting paraphrase which have been applied to the English. However, the performance of these approaches could be affected when they are applied to another language, for instance Arabic language, due to the presence of phenomena which are not present in English, such as Free Word Order, Zero copula, and Pro-dropping. These phenomena will affect the performance of these algorithms. Thus, if we can analysis how the existing algorithms for English fail for Arabic then we can find a solution for Arabic. The results are promising.

Keywords: natural language processing, TF-IDF, cosine similarity, dynamic time warping (DTW)

Procedia PDF Downloads 220
36 Morpheme Based Parts of Speech Tagger for Kannada Language

Authors: M. C. Padma, R. J. Prathibha

Abstract:

Parts of speech tagging is the process of assigning appropriate parts of speech tags to the words in a given text. The critical or crucial information needed for tagging a word come from its internal structure rather from its neighboring words. The internal structure of a word comprises of its morphological features and grammatical information. This paper presents a morpheme based parts of speech tagger for Kannada language. This proposed work uses hierarchical tag set for assigning tags. The system is tested on some Kannada words taken from EMILLE corpus. Experimental result shows that the performance of the proposed system is above 90%.

Keywords: natural language processing, paradigms, hierarchical tag set, morphological analyzer, parts of speech

Procedia PDF Downloads 148
35 Searching Linguistic Synonyms through Parts of Speech Tagging

Authors: Usman Qamar, Faiza Hussain

Abstract:

Synonym-based searching is recognized to be a complicated problem as text mining from unstructured data of web is challenging. Finding useful information which matches user need from bulk of web pages is a cumbersome task. In this paper, a novel and practical synonym retrieval technique is proposed for addressing this problem. For replacement of semantics, user intent is taken into consideration to realize the technique. Parts-of-Speech tagging is applied for pattern generation of the query and a thesaurus for this experiment was formed and used. Comparison with Non-Context Based Searching, Context Based searching proved to be a more efficient approach while dealing with linguistic semantics. This approach is very beneficial in doing intent based searching. Finally, results and future dimensions are presented.

Keywords: Semantics, Information Retrieval, Text Mining, natural language processing, Grammar, parts-of-speech tagging

Procedia PDF Downloads 155
34 Extracting Actions with Improved Part of Speech Tagging for Social Networking Texts

Authors: Henda Ben Ghezala, Yassine Jamoussi, Ameni Youssfi

Abstract:

With the growing interest in social networking, the interaction of social actors evolved to a source of knowledge in which it becomes possible to perform context aware-reasoning. The information extraction from social networking especially Twitter and Facebook is one of the problems in this area. To extract text from social networking, we need several lexical features and large scale word clustering. We attempt to expand existing tokenizer and to develop our own tagger in order to support the incorrect words currently in existence in Facebook and Twitter. Our goal in this work is to benefit from the lexical features developed for Twitter and online conversational text in previous works, and to develop an extraction model for constructing a huge knowledge based on actions

Keywords: Social Networking, natural language processing, information extraction, part-of-speech tagging

Procedia PDF Downloads 178
33 Correlation between Funding and Publications: A Pre-Step towards Future Research Prediction

Authors: Ning Kang, Marius Doornenbal

Abstract:

Funding is a very important – if not crucial – resource for research projects. Usually, funding organizations will publish a description of the funded research to describe the scope of the funding award. Logically, we would expect research outcomes to align with this funding award. For that reason, we might be able to predict future research topics based on present funding award data. That said, it remains to be shown if and how future research topics can be predicted by using the funding information. In this paper, we extract funding project information and their generated paper abstracts from the Gateway to Research database as a group, and use the papers from the same domains and publication years in the Scopus database as a baseline comparison group. We annotate both the project awards and the papers resulting from the funded projects with linguistic features (noun phrases), and then calculate tf-idf and cosine similarity between these two set of features. We show that the cosine similarity between the project-generated papers group is bigger than the project-baseline group, and also that these two groups of similarities are significantly different. Based on this result, we conclude that the funding information actually correlates with the content of future research output for the funded project on the topical level. How funding really changes the course of science or of scientific careers remains an elusive question.

Keywords: natural language processing, TF-IDF, cosine similarity, noun phrase

Procedia PDF Downloads 117
32 Context Detection in Spreadsheets Based on Automatically Inferred Table Schema

Authors: Alexander Wachtel, Michael T. Franzen, Walter F. Tichy

Abstract:

Programming requires years of training. With natural language and end user development methods, programming could become available to everyone. It enables end users to program their own devices and extend the functionality of the existing system without any knowledge of programming languages. In this paper, we describe an Interactive Spreadsheet Processing Module (ISPM), a natural language interface to spreadsheets that allows users to address ranges within the spreadsheet based on inferred table schema. Using the ISPM, end users are able to search for values in the schema of the table and to address the data in spreadsheets implicitly. Furthermore, it enables them to select and sort the spreadsheet data by using natural language. ISPM uses a machine learning technique to automatically infer areas within a spreadsheet, including different kinds of headers and data ranges. Since ranges can be identified from natural language queries, the end users can query the data using natural language. During the evaluation 12 undergraduate students were asked to perform operations (sum, sort, group and select) using the system and also Excel without ISPM interface, and the time taken for task completion was compared across the two systems. Only for the selection task did users take less time in Excel (since they directly selected the cells using the mouse) than in ISPM, by using natural language for end user software engineering, to overcome the present bottleneck of professional developers.

Keywords: natural language processing, Human Computer Interaction, Natural language interfaces, Dialog Systems, spreadsheet, end user development, data recognition

Procedia PDF Downloads 185
31 A Greedy Alignment Algorithm Supporting Medication Reconciliation

Authors: David Tresner-Kirsch

Abstract:

Reconciling patient medication lists from multiple sources is a critical task supporting the safe delivery of patient care. Manual reconciliation is a time-consuming and error-prone process, and recently attempts have been made to develop efficiency- and safety-oriented automated support for professionals performing the task. An important capability of any such support system is automated alignment – finding which medications from a list correspond to which medications from a different source, regardless of misspellings, naming differences (e.g. brand name vs. generic), or changes in treatment (e.g. switching a patient from one antidepressant class to another). This work describes a new algorithmic solution to this alignment task, using a greedy matching approach based on string similarity, edit distances, concept extraction and normalization, and synonym search derived from the RxNorm nomenclature. The accuracy of this algorithm was evaluated against a gold-standard corpus of 681 medication records; this evaluation found that the algorithm predicted alignments with 99% precision and 91% recall. This performance is sufficient to support decision support applications for medication reconciliation.

Keywords: natural language processing, Clinical Decision Support, medication reconciliation, RxNorm

Procedia PDF Downloads 185
30 Part of Speech Tagging Using Statistical Approach for Nepali Text

Authors: Archit Yajnik

Abstract:

Part of Speech Tagging has always been a challenging task in the era of Natural Language Processing. This article presents POS tagging for Nepali text using Hidden Markov Model and Viterbi algorithm. From the Nepali text, annotated corpus training and testing data set are randomly separated. Both methods are employed on the data sets. Viterbi algorithm is found to be computationally faster and accurate as compared to HMM. The accuracy of 95.43% is achieved using Viterbi algorithm. Error analysis where the mismatches took place is elaborately discussed.

Keywords: natural language processing, hidden Markov model, POS tagging, viterbi algorithm

Procedia PDF Downloads 216
29 Information Extraction for Short-Answer Question for the University of the Cordilleras

Authors: Thelma Palaoag, Melanie Basa, Jezreel Mark Panilo

Abstract:

Checking short-answer questions and essays, whether it may be paper or electronic in form, is a tiring and tedious task for teachers. Evaluating a student’s output require wide array of domains. Scoring the work is often a critical task. Several attempts in the past few years to create an automated writing assessment software but only have received negative results from teachers and students alike due to unreliability in scoring, does not provide feedback and others. The study aims to create an application that will be able to check short-answer questions which incorporate information extraction. Information extraction is a subfield of Natural Language Processing (NLP) where a chunk of text (technically known as unstructured text) is being broken down to gather necessary bits of data and/or keywords (structured text) to be further analyzed or rather be utilized by query tools. The proposed system shall be able to extract keywords or phrases from the individual’s answers to match it into a corpora of words (as defined by the instructor), which shall be the basis of evaluation of the individual’s answer. The proposed system shall also enable the teacher to provide feedback and re-evaluate the output of the student for some writing elements in which the computer cannot fully evaluate such as creativity and logic. Teachers can formulate, design, and check short answer questions efficiently by defining keywords or phrases as parameters by assigning weights for checking answers. With the proposed system, teacher’s time in checking and evaluating students output shall be lessened, thus, making the teacher more productive and easier.

Keywords: natural language processing, Application, information extraction, short-answer question

Procedia PDF Downloads 312
28 Digi-Buddy: A Smart Cane with Artificial Intelligence and Real-Time Assistance

Authors: Amaladhithyan Krishnamoorthy, Ruvaitha Banu

Abstract:

Vision is considered as the most important sense in humans, without which leading a normal can be often difficult. There are many existing smart canes for visually impaired with obstacle detection using ultrasonic transducer to help them navigate. Though the basic smart cane increases the safety of the users, it does not help in filling the void of visual loss. This paper introduces the concept of Digi-Buddy which is an evolved smart cane for visually impaired. The cane consists for several modules, apart from the basic obstacle detection features; the Digi-Buddy assists the user by capturing video/images and streams them to the server using a wide-angled camera, which then detects the objects using Deep Convolutional Neural Network. In addition to determining what the particular image/object is, the distance of the object is assessed by the ultrasonic transducer. The sound generation application, modelled with the help of Natural Language Processing is used to convert the processed images/object into audio. The object detected is signified by its name which is transmitted to the user with the help of Bluetooth hear phones. The object detection is extended to facial recognition which maps the faces of the person the user meets in the database of face images and alerts the user about the person. One of other crucial function consists of an automatic-intimation-alarm which is triggered when the user is in an emergency. If the user recovers within a set time, a button is provisioned in the cane to stop the alarm. Else an automatic intimation is sent to friends and family about the whereabouts of the user using GPS. In addition to safety and security by the existing smart canes, the proposed concept devices to be implemented as a prototype helping visually-impaired visualize their surroundings through audio more in an amicable way.

Keywords: Artificial Intelligence, Internet of Things, natural language processing, Facial Recognition

Procedia PDF Downloads 190
27 Sarcasm Recognition System Using Hybrid Tone-Word Spotting Audio Mining Technique

Authors: Sandhya Baskaran, Hari Kumar Nagabushanam

Abstract:

Sarcasm sentiment recognition is an area of natural language processing that is being probed into in the recent times. Even with the advancements in NLP, typical translations of words, sentences in its context fail to provide the exact information on a sentiment or emotion of a user. For example, if something bad happens, the statement ‘That's just what I need, great! Terrific!’ is expressed in a sarcastic tone which could be misread as a positive sign by any text-based analyzer. In this paper, we are presenting a unique real time ‘word with its tone’ spotting technique which would provide the sentiment analysis for a tone or pitch of a voice in combination with the words being expressed. This hybrid approach increases the probability for identification of special sentiment like sarcasm much closer to the real world than by mining text or speech individually. The system uses a tone analyzer such as YIN-FFT which extracts pitch segment-wise that would be used in parallel with a speech recognition system. The clustered data is classified for sentiments and sarcasm score for each of it determined. Our Simulations demonstrates the improvement in f-measure of around 12% compared to existing detection techniques with increased precision and recall.

Keywords: natural language processing, sarcasm recognition, tone-word spotting, pitch analyzer

Procedia PDF Downloads 170
26 A Word-to-Vector Formulation for Word Representation

Authors: Sandra Rizkallah, Amir F. Atiya

Abstract:

This work presents a novel word to vector representation that is based on embedding the words into a sphere, whereby the dot product of the corresponding vectors represents the similarity between any two words. Embedding the vectors into a sphere enabled us to take into consideration the antonymity between words, not only the synonymity, because of the suitability to handle the polarity nature of words. For example, a word and its antonym can be represented as a vector and its negative. Moreover, we have managed to extract an adequate vocabulary. The obtained results show that the proposed approach can capture the essence of the language, and can be generalized to estimate a correct similarity of any new pair of words.

Keywords: Text Mining, natural language processing, text similarity, word to vector

Procedia PDF Downloads 73
25 Machine Learning-Based Workflow for the Analysis of Project Portfolio

Authors: Jean Marie Tshimula, Atsushi Togashi

Abstract:

We develop a data-science approach for providing an interactive visualization and predictive models to find insights into the projects' historical data in order for stakeholders understand some unseen opportunities in the African market that might escape them behind the online project portfolio of the African Development Bank. This machine learning-based web application identifies the market trend of the fastest growing economies across the continent as well skyrocketing sectors which have a significant impact on the future of business in Africa. Owing to this, the approach is tailored to predict where the investment needs are the most required. Moreover, we create a corpus that includes the descriptions of over more than 1,200 projects that approximately cover 14 sectors designed for some of 53 African countries. Then, we sift out this large amount of semi-structured data for extracting tiny details susceptible to contain some directions to follow. In the light of the foregoing, we have applied the combination of Latent Dirichlet Allocation and Random Forests at the level of the analysis module of our methodology to highlight the most relevant topics that investors may focus on for investing in Africa.

Keywords: Machine Learning, Big Data, natural language processing, topic modeling

Procedia PDF Downloads 61
24 Understanding the Qualitative Nature of Product Reviews by Integrating Text Processing Algorithm and Usability Feature Extraction

Authors: Cherry Yieng Siang Ling, Joong Hee Lee, Myung Hwan Yun

Abstract:

The quality of a product to be usable has become the basic requirement in consumer’s perspective while failing the requirement ends up the customer from not using the product. Identifying usability issues from analyzing quantitative and qualitative data collected from usability testing and evaluation activities aids in the process of product design, yet the lack of studies and researches regarding analysis methodologies in qualitative text data of usability field inhibits the potential of these data for more useful applications. While the possibility of analyzing qualitative text data found with the rapid development of data analysis studies such as natural language processing field in understanding human language in computer, and machine learning field in providing predictive model and clustering tool. Therefore, this research aims to study the application capability of text processing algorithm in analysis of qualitative text data collected from usability activities. This research utilized datasets collected from LG neckband headset usability experiment in which the datasets consist of headset survey text data, subject’s data and product physical data. In the analysis procedure, which integrated with the text-processing algorithm, the process includes training of comments onto vector space, labeling them with the subject and product physical feature data, and clustering to validate the result of comment vector clustering. The result shows 'volume and music control button' as the usability feature that matches best with the cluster of comment vectors where centroid comments of a cluster emphasized more on button positions, while centroid comments of the other cluster emphasized more on button interface issues. When volume and music control buttons are designed separately, the participant experienced less confusion, and thus, the comments mentioned only about the buttons' positions. While in the situation where the volume and music control buttons are designed as a single button, the participants experienced interface issues regarding the buttons such as operating methods of functions and confusion of functions' buttons. The relevance of the cluster centroid comments with the extracted feature explained the capability of text processing algorithms in analyzing qualitative text data from usability testing and evaluations.

Keywords: natural language processing, Usability, qualitative data, text-processing algorithm

Procedia PDF Downloads 156
23 Knowledge Graph Development to Connect Earth Metadata and Standard English Queries

Authors: Gabriel Montague, Max Vilgalys, Catherine H. Crawford, Jorge Ortiz, Dava Newman

Abstract:

There has never been so much publicly accessible atmospheric and environmental data. The possibilities of these data are exciting, but the sheer volume of available datasets represents a new challenge for researchers. The task of identifying and working with a new dataset has become more difficult with the amount and variety of available data. Datasets are often documented in ways that differ substantially from the common English used to describe the same topics. This presents a barrier not only for new scientists, but for researchers looking to find comparisons across multiple datasets or specialists from other disciplines hoping to collaborate. This paper proposes a method for addressing this obstacle: creating a knowledge graph to bridge the gap between everyday English language and the technical language surrounding these datasets. Knowledge graph generation is already a well-established field, although there are some unique challenges posed by working with Earth data. One is the sheer size of the databases – it would be infeasible to replicate or analyze all the data stored by an organization like The National Aeronautics and Space Administration (NASA) or the European Space Agency. Instead, this approach identifies topics from metadata available for datasets in NASA’s Earthdata database, which can then be used to directly request and access the raw data from NASA. By starting with a single metadata standard, this paper establishes an approach that can be generalized to different databases, but leaves the challenge of metadata harmonization for future work. Topics generated from the metadata are then linked to topics from a collection of English queries through a variety of standard and custom natural language processing (NLP) methods. The results from this method are then compared to a baseline of elastic search applied to the metadata. This comparison shows the benefits of the proposed knowledge graph system over existing methods, particularly in interpreting natural language queries and interpreting topics in metadata. For the research community, this work introduces an application of NLP to the ecological and environmental sciences, expanding the possibilities of how machine learning can be applied in this discipline. But perhaps more importantly, it establishes the foundation for a platform that can enable common English to access knowledge that previously required considerable effort and experience. By making this public data accessible to the full public, this work has the potential to transform environmental understanding, engagement, and action.

Keywords: natural language processing, knowledge graphs, earth metadata, question-answer systems

Procedia PDF Downloads 30
22 Adaption Model for Building Agile Pronunciation Dictionaries Using Phonemic Distance Measurements

Authors: Akella Amarendra Babu, Rama Devi Yellasiri, Natukula Sainath

Abstract:

Where human beings can easily learn and adopt pronunciation variations, machines need training before put into use. Also humans keep minimum vocabulary and their pronunciation variations are stored in front-end of their memory for ready reference, while machines keep the entire pronunciation dictionary for ready reference. Supervised methods are used for preparation of pronunciation dictionaries which take large amounts of manual effort, cost, time and are not suitable for real time use. This paper presents an unsupervised adaptation model for building agile and dynamic pronunciation dictionaries online. These methods mimic human approach in learning the new pronunciations in real time. A new algorithm for measuring sound distances called Dynamic Phone Warping is presented and tested. Performance of the system is measured using an adaptation model and the precision metrics is found to be better than 86 percent.

Keywords: Machine Learning, natural language processing, Dynamic Programming, pronunciation variations

Procedia PDF Downloads 33
21 Q-Map: Clinical Concept Mining from Clinical Documents

Authors: Sheikh Shams Azam, Manoj Raju, Venkatesh Pagidimarri, Vamsi Kasivajjala

Abstract:

Over the past decade, there has been a steep rise in the data-driven analysis in major areas of medicine, such as clinical decision support system, survival analysis, patient similarity analysis, image analytics etc. Most of the data in the field are well-structured and available in numerical or categorical formats which can be used for experiments directly. But on the opposite end of the spectrum, there exists a wide expanse of data that is intractable for direct analysis owing to its unstructured nature which can be found in the form of discharge summaries, clinical notes, procedural notes which are in human written narrative format and neither have any relational model nor any standard grammatical structure. An important step in the utilization of these texts for such studies is to transform and process the data to retrieve structured information from the haystack of irrelevant data using information retrieval and data mining techniques. To address this problem, the authors present Q-Map in this paper, which is a simple yet robust system that can sift through massive datasets with unregulated formats to retrieve structured information aggressively and efficiently. It is backed by an effective mining technique which is based on a string matching algorithm that is indexed on curated knowledge sources, that is both fast and configurable. The authors also briefly examine its comparative performance with MetaMap, one of the most reputed tools for medical concepts retrieval and present the advantages the former displays over the latter.

Keywords: Medical informatics, Information Retrieval, natural language processing, unified medical language system, syntax based analysis

Procedia PDF Downloads 22
20 An Artificially Intelligent Teaching-Agent to Enhance Learning Interactions in Virtual Settings

Authors: Abdulwakeel B. Raji

Abstract:

This paper introduces a concept of an intelligent virtual learning environment that involves communication between learners and an artificially intelligent teaching agent in an attempt to replicate classroom learning interactions. The benefits of this technology over current e-learning practices is that it creates a virtual classroom where real time adaptive learning interactions are made possible. This is a move away from the static learning practices currently being adopted by e-learning systems. Over the years, artificial intelligence has been applied to various fields, including and not limited to medicine, military applications, psychology, marketing etc. The purpose of e-learning applications is to ensure users are able to learn outside of the classroom, but a major limitation has been the inability to fully replicate classroom interactions between teacher and students. This study used comparative surveys to gain information and understanding of the current learning practices in Nigerian universities and how they compare to these practices compare to the use of a developed e-learning system. The study was conducted by attending several lectures and noting the interactions between lecturers and tutors and as an aftermath, a software has been developed that deploys the use of an artificial intelligent teaching-agent alongside an e-learning system to enhance user learning experience and attempt to create the similar learning interactions to those found in classroom and lecture hall settings. Dialogflow has been used to implement a teaching-agent, which has been developed using JSON, which serves as a virtual teacher. Course content has been created using HTML, CSS, PHP and JAVASCRIPT as a web-based application. This technology can run on handheld devices and Google based home technologies to give learners an access to the teaching agent at any time. This technology also implements the use of definite clause grammars and natural language processing to match user inputs and requests with defined rules to replicate learning interactions. This technology developed covers familiar classroom scenarios such as answering users’ questions, asking ‘do you understand’ at regular intervals and answering subsequent requests, taking advanced user queries to give feedbacks at other periods. This software technology uses deep learning techniques to learn user interactions and patterns to subsequently enhance user learning experience. A system testing has been undergone by undergraduate students in the UK and Nigeria on the course ‘Introduction to Database Development’. Test results and feedback from users shows that this study and developed software is a significant improvement on existing e-learning systems. Further experiments are to be run using the software with different students and more course contents.

Keywords: Artificial Intelligence, Virtual learning, natural language processing, Deep learning, definite clause grammars

Procedia PDF Downloads 16
19 Methodology for Developing an Intelligent Tutoring System Based on Marzano’s Taxonomy

Authors: Joaquin Navarro Perales, Ana Lidia Franzoni Velázquez, Francisco Cervantes Pérez

Abstract:

The Mexican educational system faces diverse challenges related with the quality and coverage of education. The development of Intelligent Tutoring Systems (ITS) may help to solve some of them by helping teachers to customize their classes according to the performance of the students in online courses. In this work, we propose the adaptation of a functional ITS based on Bloom’s taxonomy called Sistema de Apoyo Generalizado para la Enseñanza Individualizada (SAGE), to measure student’s metacognition and their emotional response based on Marzano’s taxonomy. The students and the system will share the control over the advance in the course, so they can improve their metacognitive skills. The system will not allow students to get access to subjects not mastered yet. The interaction between the system and the student will be implemented through Natural Language Processing techniques, thus avoiding the use of sensors to evaluate student’s response. The teacher will evaluate student’s knowledge utilization, which is equivalent to the last cognitive level in Marzano’s taxonomy.

Keywords: natural language processing, metacognition, Intelligent Tutoring Systems, Affective Computing, student modelling

Procedia PDF Downloads 78
18 Composite Kernels for Public Emotion Recognition from Twitter

Authors: Chien-Hung Chen, Yan-Chun Hsing, Yung-Chun Chang

Abstract:

The Internet has grown into a powerful medium for information dispersion and social interaction that leads to a rapid growth of social media which allows users to easily post their emotions and perspectives regarding certain topics online. Our research aims at using natural language processing and text mining techniques to explore the public emotions expressed on Twitter by analyzing the sentiment behind tweets. In this paper, we propose a composite kernel method that integrates tree kernel with the linear kernel to simultaneously exploit both the tree representation and the distributed emotion keyword representation to analyze the syntactic and content information in tweets. The experiment results demonstrate that our method can effectively detect public emotion of tweets while outperforming the other compared methods.

Keywords: Text Mining, natural language processing, Emotion recognition, sentiment analysis, composite kernel

Procedia PDF Downloads 28
17 Emotional Analysis for Text Search Queries on Internet

Authors: Gemma Garcia Lopez

Abstract:

The goal of this study is to analyze if search queries carried out in search engines such as Google, can offer emotional information about the user that performs them. Knowing the emotional state in which the Internet user is located can be a key to achieve the maximum personalization of content and the detection of worrying behaviors. For this, two studies were carried out using tools with advanced natural language processing techniques. The first study determines if a query can be classified as positive, negative or neutral, while the second study extracts emotional content from words and applies the categorical and dimensional models for the representation of emotions. In addition, we use search queries in Spanish and English to establish similarities and differences between two languages. The results revealed that text search queries performed by users on the Internet can be classified emotionally. This allows us to better understand the emotional state of the user at the time of the search, which could involve adapting the technology and personalizing the responses to different emotional states.

Keywords: natural language processing, emotion classification, text search queries, emotional analysis, sentiment analysis in text

Procedia PDF Downloads 30
16 Selecting Answers for Questions with Multiple Answer Choices in Arabic Question Answering Based on Textual Entailment Recognition

Authors: Anes Enakoa, Yawei Liang

Abstract:

Question Answering (QA) system is one of the most important and demanding tasks in the field of Natural Language Processing (NLP). In QA systems, the answer generation task generates a list of candidate answers to the user's question, in which only one answer is correct. Answer selection is one of the main components of the QA, which is concerned with selecting the best answer choice from the candidate answers suggested by the system. However, the selection process can be very challenging especially in Arabic due to its particularities. To address this challenge, an approach is proposed to answer questions with multiple answer choices for Arabic QA systems based on Textual Entailment (TE) recognition. The developed approach employs a Support Vector Machine that considers lexical, semantic and syntactic features in order to recognize the entailment between the generated hypotheses (H) and the text (T). A set of experiments has been conducted for performance evaluation and the overall performance of the proposed method reached an accuracy of 67.5% with [email protected] score of 80.46%. The obtained results are promising and demonstrate that the proposed method is effective for TE recognition task.

Keywords: Machine Learning, Information Retrieval, natural language processing, Question answering, textual entailment

Procedia PDF Downloads 11