Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9884

Search results for: natural language processing

9884 Role of Natural Language Processing in Information Retrieval; Challenges and Opportunities

Authors: Khaled M. Alhawiti


This paper aims to analyze the role of natural language processing (NLP). The paper will discuss the role in the context of automated data retrieval, automated question answer, and text structuring. NLP techniques are gaining wider acceptance in real life applications and industrial concerns. There are various complexities involved in processing the text of natural language that could satisfy the need of decision makers. This paper begins with the description of the qualities of NLP practices. The paper then focuses on the challenges in natural language processing. The paper also discusses major techniques of NLP. The last section describes opportunities and challenges for future research.

Keywords: data retrieval, information retrieval, natural language processing, text structuring

Procedia PDF Downloads 266
9883 Impact of Natural Language Processing in Educational Setting: An Effective Approach towards Improved Learning

Authors: Khaled M. Alhawiti


Natural Language Processing (NLP) is an effective approach for bringing improvement in educational setting. This involves initiating the process of learning through the natural acquisition in the educational systems. It is based on following effective approaches for providing the solution for various problems and issues in education. Natural Language Processing provides solution in a variety of different fields associated with the social and cultural context of language learning. It is based on involving various tools and techniques such as grammar, syntax, and structure of text. It is effective approach for teachers, students, authors, and educators for providing assistance for writing, analysis, and assessment procedure. Natural Language Processing is widely integrated in the large number of educational contexts such as research, science, linguistics, e-learning, evaluations system, and various other educational settings such as schools, higher education system, and universities. Natural Language Processing is based on applying scientific approach in the educational settings. In the educational settings, NLP is an effective approach to ensure that students can learn easily in the same way as they acquired language in the natural settings.

Keywords: natural language processing, education, application, e-learning, scientific studies, educational system

Procedia PDF Downloads 424
9882 Natural Language Processing; the Future of Clinical Record Management

Authors: Khaled M. Alhawiti


This paper investigates the future of medicine and the use of Natural language processing. The importance of having correct clinical information available online is remarkable; improving patient care at affordable costs could be achieved using automated applications to use the online clinical information. The major challenge towards the retrieval of such vital information is to have it appropriately coded. Majority of the online patient reports are not found to be coded and not accessible as its recorded in natural language text. The use of Natural Language processing provides a feasible solution by retrieving and organizing clinical information, available in text and transforming clinical data that is available for use. Systems used in NLP are rather complex to construct, as they entail considerable knowledge, however significant development has been made. Newly formed NLP systems have been tested and have established performance that is promising and considered as practical clinical applications.

Keywords: clinical information, information retrieval, natural language processing, automated applications

Procedia PDF Downloads 317
9881 Resource Creation Using Natural Language Processing Techniques for Malay Translated Qur'an

Authors: Nor Diana Ahmad, Eric Atwell, Brandon Bennett


Text processing techniques for English have been developed for several decades. But for the Malay language, text processing methods are still far behind. Moreover, there are limited resources, tools for computational linguistic analysis available for the Malay language. Therefore, this research presents the use of natural language processing (NLP) in processing Malay translated Qur’an text. As the result, a new language resource for Malay translated Qur’an was created. This resource will help other researchers to build the necessary processing tools for the Malay language. This research also develops a simple question-answer prototype to demonstrate the use of the Malay Qur’an resource for text processing. This prototype has been developed using Python. The prototype pre-processes the Malay Qur’an and an input query using a stemming algorithm and then searches for occurrences of the query word stem. The result produced shows improved matching likelihood between user query and its answer. A POS-tagging algorithm has also been produced. The stemming and tagging algorithms can be used as tools for research related to other Malay texts and can be used to support applications such as information retrieval, question answering systems, ontology-based search and other text analysis tasks.

Keywords: language resource, Malay translated Qur'an, natural language processing (NLP), text processing

Procedia PDF Downloads 223
9880 Resume Ranking Using Custom Word2vec and Rule-Based Natural Language Processing Techniques

Authors: Subodh Chandra Shakya, Rajendra Sapkota, Aakash Tamang, Shushant Pudasaini, Sujan Adhikari, Sajjan Adhikari


Lots of efforts have been made in order to measure the semantic similarity between the text corpora in the documents. Techniques have been evolved to measure the similarity of two documents. One such state-of-art technique in the field of Natural Language Processing (NLP) is word to vector models, which converts the words into their word-embedding and measures the similarity between the vectors. We found this to be quite useful for the task of resume ranking. So, this research paper is the implementation of the word2vec model along with other Natural Language Processing techniques in order to rank the resumes for the particular job description so as to automate the process of hiring. The research paper proposes the system and the findings that were made during the process of building the system.

Keywords: chunking, document similarity, information extraction, natural language processing, word2vec, word embedding

Procedia PDF Downloads 60
9879 An Analysis of Lexical and Grammatical Gender Bias in German Bidirectional Encoder Representations from Transformers Networks

Authors: Freya Thießen, Johannes Schrumpf


Gender bias in natural language processing neural networks based on the Transformer architecture has been the focus of recent research. So far, primarily language models trained on the English language has been investigated and found to possess biased representations with regard to gender. Linguistic analysis hints at the possibility that due to semantic and grammatical differences between the German and English languages, BERT networks trained on German-language material may possess different gender bias properties than English BERT networks. This study investigates the impact of lexical and grammatical forms of gender information on bias in German-BERT, a BERT network trained for natural language processing of the German language. Through an analysis of the principal components of German-BERT embeddings, we show that gender bias exists in German-BERT in the presence of grammatical gender information and lexical gender stereotypes.

Keywords: artificial intelligence, ethical machine learning, gender bias, German language-specific bias, natural language processing

Procedia PDF Downloads 24
9878 Gender Bias in Natural Language Processing: Machines Reflect Misogyny in Society

Authors: Irene Yi


Machine learning, natural language processing, and neural network models of language are becoming more and more prevalent in the fields of technology and linguistics today. Training data for machines are at best, large corpora of human literature and at worst, a reflection of the ugliness in society. Machines have been trained on millions of human books, only to find that in the course of human history, derogatory and sexist adjectives are used significantly more frequently when describing females in history and literature than when describing males. This is extremely problematic, both as training data, and as the outcome of natural language processing. As machines start to handle more responsibilities, it is crucial to ensure that they do not take with them historical sexist and misogynistic notions. This paper gathers data and algorithms from neural network models of language having to deal with syntax, semantics, sociolinguistics, and text classification. Results are significant in showing the existing intentional and unintentional misogynistic notions used to train machines, as well as in developing better technologies that take into account the semantics and syntax of text to be more mindful and reflect gender equality. Further, this paper deals with the idea of non-binary gender pronouns and how machines can process these pronouns correctly, given its semantic and syntactic context. This paper also delves into the implications of gendered grammar and its effect, cross-linguistically, on natural language processing. Languages such as French or Spanish not only have rigid gendered grammar rules, but also historically patriarchal societies. The progression of society comes hand in hand with not only its language, but how machines process those natural languages. These ideas are all extremely vital to the development of natural language models in technology, and they must be taken into account immediately.

Keywords: gendered grammar, misogynistic language, natural language processing, neural networks

Procedia PDF Downloads 46
9877 Context Detection in Spreadsheets Based on Automatically Inferred Table Schema

Authors: Alexander Wachtel, Michael T. Franzen, Walter F. Tichy


Programming requires years of training. With natural language and end user development methods, programming could become available to everyone. It enables end users to program their own devices and extend the functionality of the existing system without any knowledge of programming languages. In this paper, we describe an Interactive Spreadsheet Processing Module (ISPM), a natural language interface to spreadsheets that allows users to address ranges within the spreadsheet based on inferred table schema. Using the ISPM, end users are able to search for values in the schema of the table and to address the data in spreadsheets implicitly. Furthermore, it enables them to select and sort the spreadsheet data by using natural language. ISPM uses a machine learning technique to automatically infer areas within a spreadsheet, including different kinds of headers and data ranges. Since ranges can be identified from natural language queries, the end users can query the data using natural language. During the evaluation 12 undergraduate students were asked to perform operations (sum, sort, group and select) using the system and also Excel without ISPM interface, and the time taken for task completion was compared across the two systems. Only for the selection task did users take less time in Excel (since they directly selected the cells using the mouse) than in ISPM, by using natural language for end user software engineering, to overcome the present bottleneck of professional developers.

Keywords: natural language processing, natural language interfaces, human computer interaction, end user development, dialog systems, data recognition, spreadsheet

Procedia PDF Downloads 236
9876 Application of Natural Language Processing in Education

Authors: Khaled M. Alhawiti


Reading capability is a major segment of language competency. On the other hand, discovering topical writings at a fitting level for outside and second language learners is a test for educators. We address this issue utilizing natural language preparing innovation to survey reading level and streamline content. In the connection of outside and second-language learning, existing measures of reading level are not appropriate to this errand. Related work has demonstrated the profit of utilizing measurable language preparing procedures; we expand these thoughts and incorporate other potential peculiarities to measure intelligibility. In the first piece of this examination, we join characteristics from measurable language models, customary reading level measures and other language preparing apparatuses to deliver a finer technique for recognizing reading level. We examine the execution of human annotators and assess results for our finders concerning human appraisals. A key commitment is that our identifiers are trainable; with preparing and test information from the same space, our finders beat more general reading level instruments (Flesch-Kincaid and Lexile). Trainability will permit execution to be tuned to address the needs of specific gatherings or understudies.

Keywords: natural language processing, trainability, syntactic simplification tools, education

Procedia PDF Downloads 422
9875 Genomic Sequence Representation Learning: An Analysis of K-Mer Vector Embedding Dimensionality

Authors: James Jr. Mashiyane, Risuna Nkolele, Stephanie J. Müller, Gciniwe S. Dlamini, Rebone L. Meraba, Darlington S. Mapiye


When performing language tasks in natural language processing (NLP), the dimensionality of word embeddings is chosen either ad-hoc or is calculated by optimizing the Pairwise Inner Product (PIP) loss. The PIP loss is a metric that measures the dissimilarity between word embeddings, and it is obtained through matrix perturbation theory by utilizing the unitary invariance of word embeddings. Unlike in natural language, in genomics, especially in genome sequence processing, unlike in natural language processing, there is no notion of a “word,” but rather, there are sequence substrings of length k called k-mers. K-mers sizes matter, and they vary depending on the goal of the task at hand. The dimensionality of word embeddings in NLP has been studied using the matrix perturbation theory and the PIP loss. In this paper, the sufficiency and reliability of applying word-embedding algorithms to various genomic sequence datasets are investigated to understand the relationship between the k-mer size and their embedding dimension. This is completed by studying the scaling capability of three embedding algorithms, namely Latent Semantic analysis (LSA), Word2Vec, and Global Vectors (GloVe), with respect to the k-mer size. Utilising the PIP loss as a metric to train embeddings on different datasets, we also show that Word2Vec outperforms LSA and GloVe in accurate computing embeddings as both the k-mer size and vocabulary increase. Finally, the shortcomings of natural language processing embedding algorithms in performing genomic tasks are discussed.

Keywords: word embeddings, k-mer embedding, dimensionality reduction

Procedia PDF Downloads 34
9874 Detecting Paraphrases in Arabic Text

Authors: Amal Alshahrani, Allan Ramsay


Paraphrasing is one of the important tasks in natural language processing; i.e. alternative ways to express the same concept by using different words or phrases. Paraphrases can be used in many natural language applications, such as Information Retrieval, Machine Translation, Question Answering, Text Summarization, or Information Extraction. To obtain pairs of sentences that are paraphrases we create a system that automatically extracts paraphrases from a corpus, which is built from different sources of news article since these are likely to contain paraphrases when they report the same event on the same day. There are existing simple standard approaches (e.g. TF-IDF vector space, cosine similarity) and alignment technique (e.g. Dynamic Time Warping (DTW)) for extracting paraphrase which have been applied to the English. However, the performance of these approaches could be affected when they are applied to another language, for instance Arabic language, due to the presence of phenomena which are not present in English, such as Free Word Order, Zero copula, and Pro-dropping. These phenomena will affect the performance of these algorithms. Thus, if we can analysis how the existing algorithms for English fail for Arabic then we can find a solution for Arabic. The results are promising.

Keywords: natural language processing, TF-IDF, cosine similarity, dynamic time warping (DTW)

Procedia PDF Downloads 296
9873 Online Learning Versus Face to Face Learning: A Sentiment Analysis on General Education Mathematics in the Modern World of University of San Carlos School of Arts and Sciences Students Using Natural Language Processing

Authors: Derek Brandon G. Yu, Clyde Vincent O. Pilapil, Christine F. Peña


College students of Cebu province have been indoors since March 2020, and a challenge encountered is the sudden shift from face to face to online learning and with the lack of empirical data on online learning on Higher Education Institutions (HEIs) in the Philippines. Sentiments on face to face and online learning will be collected from University of San Carlos (USC), School of Arts and Sciences (SAS) students regarding Mathematics in the Modern World (MMW), a General Education (GE) course. Natural Language Processing with machine learning algorithms will be used to classify the sentiments of the students. Results of the research study are the themes identified through topic modelling and the overall sentiments of the students in USC SAS

Keywords: natural language processing, online learning, sentiment analysis, topic modelling

Procedia PDF Downloads 101
9872 A Newspapers Expectations Indicator from Web Scraping

Authors: Pilar Rey del Castillo


This document describes the building of an average indicator of the general sentiments about the future exposed in the newspapers in Spain. The raw data are collected through the scraping of the Digital Periodical and Newspaper Library website. Basic tools of natural language processing are later applied to the collected information to evaluate the sentiment strength of each word in the texts using a polarized dictionary. The last step consists of summarizing these sentiments to produce daily indices. The results are a first insight into the applicability of these techniques to produce periodic sentiment indicators.

Keywords: natural language processing, periodic indicator, sentiment analysis, web scraping

Procedia PDF Downloads 56
9871 A Controlled Natural Language Assisted Approach for the Design and Automated Processing of Service Level Agreements

Authors: Christopher Schwarz, Katrin Riegler, Erwin Zinser


The management of outsourcing relationships between IT service providers and their customers proofs to be a critical issue that has to be stipulated by means of Service Level Agreements (SLAs). Since service requirements differ from customer to customer, SLA content and language structures vary largely, standardized SLA templates may not be used and an automated processing of SLA content is not possible. Hence, SLA management is usually a time-consuming and inefficient manual process. For overcoming these challenges, this paper presents an innovative and ITIL V3-conform approach for automated SLA design and management using controlled natural language in enterprise collaboration portals. The proposed novel concept is based on a self-developed controlled natural language that follows a subject-predicate-object approach to specify well-defined SLA content structures that act as templates for customized contracts and support automated SLA processing. The derived results eventually enable IT service providers to automate several SLA request, approval and negotiation processes by means of workflows and business rules within an enterprise collaboration portal. The illustrated prototypical realization gives evidence of the practical relevance in service-oriented scenarios as well as the high flexibility and adaptability of the presented model. Thus, the prototype enables the automated creation of well defined, customized SLA documents, providing a knowledge representation that is both human understandable and machine processable.

Keywords: automated processing, controlled natural language, knowledge representation, information technology outsourcing, service level management

Procedia PDF Downloads 355
9870 Morphology of Cartographic Words: A Perspective from Chinese Characters

Authors: Xinyu Gong, Zhilin Li, Xintao Liu


Maps are a means of communication. Cartographic language involves established theories of natural language for understanding maps. “Cartographic words’, or “map symbols”, are crucial elements of cartographic language. Personalized mapping is increasingly popular, with growing demands for customized map-making by the general public. Automated symbol-making and customization play a key role in personalized mapping. However, formal representations for the automated construction of map symbols are still lacking. In natural language, the process of word and sentence construction can be formalized. Through the analogy between natural language and graphical language, formal representations of natural language construction can be used as a reference for constructing cartographic language. We selected Chinese character structures (i.e., S

Keywords: personalized mapping, Chinese character, cartographic language, map symbols

Procedia PDF Downloads 78
9869 A Survey of the Applications of Sentiment Analysis

Authors: Pingping Lin, Xudong Luo


Natural language often conveys emotions of speakers. Therefore, sentiment analysis on what people say is prevalent in the field of natural language process and has great application value in many practical problems. Thus, to help people understand its application value, in this paper, we survey various applications of sentiment analysis, including the ones in online business and offline business as well as other types of its applications. In particular, we give some application examples in intelligent customer service systems in China. Besides, we compare the applications of sentiment analysis on Twitter, Weibo, Taobao and Facebook, and discuss some challenges. Finally, we point out the challenges faced in the applications of sentiment analysis and the work that is worth being studied in the future.

Keywords: application, natural language processing, online comments, sentiment analysis

Procedia PDF Downloads 77
9868 Models and Metamodels for Computer-Assisted Natural Language Grammar Learning

Authors: Evgeny Pyshkin, Maxim Mozgovoy, Vladislav Volkov


The paper follows a discourse on computer-assisted language learning. We examine problems of foreign language teaching and learning and introduce a metamodel that can be used to define learning models of language grammar structures in order to support teacher/student interaction. Special attention is paid to the concept of a virtual language lab. Our approach to language education assumes to encourage learners to experiment with a language and to learn by discovering patterns of grammatically correct structures created and managed by a language expert.

Keywords: computer-assisted instruction, language learning, natural language grammar models, HCI

Procedia PDF Downloads 432
9867 Part of Speech Tagging Using Statistical Approach for Nepali Text

Authors: Archit Yajnik


Part of Speech Tagging has always been a challenging task in the era of Natural Language Processing. This article presents POS tagging for Nepali text using Hidden Markov Model and Viterbi algorithm. From the Nepali text, annotated corpus training and testing data set are randomly separated. Both methods are employed on the data sets. Viterbi algorithm is found to be computationally faster and accurate as compared to HMM. The accuracy of 95.43% is achieved using Viterbi algorithm. Error analysis where the mismatches took place is elaborately discussed.

Keywords: hidden markov model, natural language processing, POS tagging, viterbi algorithm

Procedia PDF Downloads 267
9866 JaCoText: A Pretrained Model for Java Code-Text Generation

Authors: Jessica Lopez Espejel, Mahaman Sanoussi Yahaya Alassan, Walid Dahhane, El Hassane Ettifouri


Pretrained transformer-based models have shown high performance in natural language generation tasks. However, a new wave of interest has emerged: automatic programming language code generation. This task consists of translating natural language instruction into a source code. Despite the fact that well-known pre-trained models on language generation have achieved good performance in learning programming languages, the effort is still needed in automatic code generation. In this paper, we introduce JaCoText, a model based on Transformer neural network. It aims to generate java source code from natural language text. JaCoText leverages the advantages of both natural language and code generation models. More specifically, we study some findings from state of the art and use them to (1) initialize our model from powerful pre-trained models, (2) explore additional pretraining on our java dataset, (3) lead experiments combining the unimodal and bimodal data in training, and (4) scale the input and output length during the fine-tuning of the model. Conducted experiments on the CONCODE dataset show that JaCoText achieves new state-of-the-art results.

Keywords: java code generation, natural language processing, sequence-to-sequence models, transformer neural networks

Procedia PDF Downloads 11
9865 Resource Framework Descriptors for Interestingness in Data

Authors: C. B. Abhilash, Kavi Mahesh


Human beings are the most advanced species on earth; it's all because of the ability to communicate and share information via human language. In today's world, a huge amount of data is available on the web in text format. This has also resulted in the generation of big data in structured and unstructured formats. In general, the data is in the textual form, which is highly unstructured. To get insights and actionable content from this data, we need to incorporate the concepts of text mining and natural language processing. In our study, we mainly focus on Interesting data through which interesting facts are generated for the knowledge base. The approach is to derive the analytics from the text via the application of natural language processing. Using semantic web Resource framework descriptors (RDF), we generate the triple from the given data and derive the interesting patterns. The methodology also illustrates data integration using the RDF for reliable, interesting patterns.

Keywords: RDF, interestingness, knowledge base, semantic data

Procedia PDF Downloads 77
9864 Implementing a Database from a Requirement Specification

Authors: M. Omer, D. Wilson


Creating a database scheme is essentially a manual process. From a requirement specification, the information contained within has to be analyzed and reduced into a set of tables, attributes and relationships. This is a time-consuming process that has to go through several stages before an acceptable database schema is achieved. The purpose of this paper is to implement a Natural Language Processing (NLP) based tool to produce a from a requirement specification. The Stanford CoreNLP version 3.3.1 and the Java programming were used to implement the proposed model. The outcome of this study indicates that the first draft of a relational database schema can be extracted from a requirement specification by using NLP tools and techniques with minimum user intervention. Therefore, this method is a step forward in finding a solution that requires little or no user intervention.

Keywords: information extraction, natural language processing, relation extraction

Procedia PDF Downloads 170
9863 From User's Requirements to UML Class Diagram

Authors: Zeineb Ben Azzouz, Wahiba Ben Abdessalem Karaa


The automated extraction of UML class diagram from natural language requirements is a highly challenging task. Many approaches, frameworks and tools have been presented in this field. Nonetheless, the experiments of these tools have shown that there is no approach that can work best all the time. In this context, we propose a new accurate approach to facilitate the automatic mapping from textual requirements to UML class diagram. Our new approach integrates the best properties of statistical Natural Language Processing (NLP) techniques to reduce ambiguity when analysing natural language requirements text. In addition, our approach follows the best practices defined by conceptual modelling experts to determine some patterns indispensable for the extraction of basic elements and concepts of the class diagram. Once the relevant information of class diagram is captured, a XMI document is generated and imported with a CASE tool to build the corresponding UML class diagram.

Keywords: class diagram, user’s requirements, XMI, software engineering

Procedia PDF Downloads 398
9862 Computational Linguistic Implications of Gender Bias: Machines Reflect Misogyny in Society

Authors: Irene Yi


Machine learning, natural language processing, and neural network models of language are becoming more and more prevalent in the fields of technology and linguistics today. Training data for machines are at best, large corpora of human literature and at worst, a reflection of the ugliness in society. Computational linguistics is a growing field dealing with such issues of data collection for technological development. Machines have been trained on millions of human books, only to find that in the course of human history, derogatory and sexist adjectives are used significantly more frequently when describing females in history and literature than when describing males. This is extremely problematic, both as training data, and as the outcome of natural language processing. As machines start to handle more responsibilities, it is crucial to ensure that they do not take with them historical sexist and misogynistic notions. This paper gathers data and algorithms from neural network models of language having to deal with syntax, semantics, sociolinguistics, and text classification. Computational analysis on such linguistic data is used to find patterns of misogyny. Results are significant in showing the existing intentional and unintentional misogynistic notions used to train machines, as well as in developing better technologies that take into account the semantics and syntax of text to be more mindful and reflect gender equality. Further, this paper deals with the idea of non-binary gender pronouns and how machines can process these pronouns correctly, given its semantic and syntactic context. This paper also delves into the implications of gendered grammar and its effect, cross-linguistically, on natural language processing. Languages such as French or Spanish not only have rigid gendered grammar rules, but also historically patriarchal societies. The progression of society comes hand in hand with not only its language, but how machines process those natural languages. These ideas are all extremely vital to the development of natural language models in technology, and they must be taken into account immediately.

Keywords: computational analysis, gendered grammar, misogynistic language, neural networks

Procedia PDF Downloads 39
9861 A Supervised Approach for Word Sense Disambiguation Based on Arabic Diacritics

Authors: Alaa Alrakaf, Sk. Md. Mizanur Rahman


Since the last two decades’ Arabic natural language processing (ANLP) has become increasingly much more important. One of the key issues related to ANLP is ambiguity. In Arabic language different pronunciation of one word may have a different meaning. Furthermore, ambiguity also has an impact on the effectiveness and efficiency of Machine Translation (MT). The issue of ambiguity has limited the usefulness and accuracy of the translation from Arabic to English. The lack of Arabic resources makes ambiguity problem more complicated. Additionally, the orthographic level of representation cannot specify the exact meaning of the word. This paper looked at the diacritics of Arabic language and used them to disambiguate a word. The proposed approach of word sense disambiguation used Diacritizer application to Diacritize Arabic text then found the most accurate sense of an ambiguous word using Naïve Bayes Classifier. Our Experimental study proves that using Arabic Diacritics with Naïve Bayes Classifier enhances the accuracy of choosing the appropriate sense by 23% and also decreases the ambiguity in machine translation.

Keywords: Arabic natural language processing, machine learning, machine translation, Naive bayes classifier, word sense disambiguation

Procedia PDF Downloads 278
9860 Controlling Drone Flight Missions through Natural Language Processors Using Artificial Intelligence

Authors: Sylvester Akpah, Selasi Vondee


Unmanned Aerial Vehicles (UAV) as they are also known, drones have attracted increasing attention in recent years due to their ubiquitous nature and boundless applications in the areas of communication, surveying, aerial photography, weather forecasting, medical delivery, surveillance amongst others. Operated remotely in real-time or pre-programmed, drones can fly autonomously or on pre-defined routes. The application of these aerial vehicles has successfully penetrated the world due to technological evolution, thus a lot more businesses are utilizing their capabilities. Unfortunately, while drones are replete with the benefits stated supra, they are riddled with some problems, mainly attributed to the complexities in learning how to master drone flights, collision avoidance and enterprise security. Additional challenges, such as the analysis of flight data recorded by sensors attached to the drone may take time and require expert help to analyse and understand. This paper presents an autonomous drone control system using a chatbot. The system allows for easy control of drones using conversations with the aid of Natural Language Processing, thus to reduce the workload needed to set up, deploy, control, and monitor drone flight missions. The results obtained at the end of the study revealed that the drone connected to the chatbot was able to initiate flight missions with just text and voice commands, enable conversation and give real-time feedback from data and requests made to the chatbot. The results further revealed that the system was able to process natural language and produced human-like conversational abilities using Artificial Intelligence (Natural Language Understanding). It is recommended that radio signal adapters be used instead of wireless connections thus to increase the range of communication with the aerial vehicle.

Keywords: artificial ntelligence, chatbot, natural language processing, unmanned aerial vehicle

Procedia PDF Downloads 71
9859 Learning Grammars for Detection of Disaster-Related Micro Events

Authors: Josef Steinberger, Vanni Zavarella, Hristo Tanev


Natural disasters cause tens of thousands of victims and massive material damages. We refer to all those events caused by natural disasters, such as damage on people, infrastructure, vehicles, services and resource supply, as micro events. This paper addresses the problem of micro - event detection in online media sources. We present a natural language grammar learning algorithm and apply it to online news. The algorithm in question is based on distributional clustering and detection of word collocations. We also explore the extraction of micro-events from social media and describe a Twitter mining robot, who uses combinations of keywords to detect tweets which talk about effects of disasters.

Keywords: online news, natural language processing, machine learning, event extraction, crisis computing, disaster effects, Twitter

Procedia PDF Downloads 407
9858 Corpus-Based Neural Machine Translation: Empirical Study Multilingual Corpus for Machine Translation of Opaque Idioms - Cloud AutoML Platform

Authors: Khadija Refouh


Culture bound-expressions have been a bottleneck for Natural Language Processing (NLP) and comprehension, especially in the case of machine translation (MT). In the last decade, the field of machine translation has greatly advanced. Neural machine translation NMT has recently achieved considerable development in the quality of translation that outperformed previous traditional translation systems in many language pairs. Neural machine translation NMT is an Artificial Intelligence AI and deep neural networks applied to language processing. Despite this development, there remain some serious challenges that face neural machine translation NMT when translating culture bounded-expressions, especially for low resources language pairs such as Arabic-English and Arabic-French, which is not the case with well-established language pairs such as English-French. Machine translation of opaque idioms from English into French are likely to be more accurate than translating them from English into Arabic. For example, Google Translate Application translated the sentence “What a bad weather! It runs cats and dogs.” to “يا له من طقس سيء! تمطر القطط والكلاب” into the target language Arabic which is an inaccurate literal translation. The translation of the same sentence into the target language French was “Quel mauvais temps! Il pleut des cordes.” where Google Translate Application used the accurate French corresponding idioms. This paper aims to perform NMT experiments towards better translation of opaque idioms using high quality clean multilingual corpus. This Corpus will be collected analytically from human generated idiom translation. AutoML translation, a Google Neural Machine Translation Platform, is used as a custom translation model to improve the translation of opaque idioms. The automatic evaluation of the custom model will be compared to the Google NMT using Bilingual Evaluation Understudy Score BLEU. BLEU is an algorithm for evaluating the quality of text which has been machine-translated from one natural language to another. Human evaluation is integrated to test the reliability of the Blue Score. The researcher will examine syntactical, lexical, and semantic features using Halliday's functional theory.

Keywords: multilingual corpora, natural language processing (NLP), neural machine translation (NMT), opaque idioms

Procedia PDF Downloads 52
9857 Development of Fake News Model Using Machine Learning through Natural Language Processing

Authors: Sajjad Ahmed, Knut Hinkelmann, Flavio Corradini


Fake news detection research is still in the early stage as this is a relatively new phenomenon in the interest raised by society. Machine learning helps to solve complex problems and to build AI systems nowadays and especially in those cases where we have tacit knowledge or the knowledge that is not known. We used machine learning algorithms and for identification of fake news; we applied three classifiers; Passive Aggressive, Naïve Bayes, and Support Vector Machine. Simple classification is not completely correct in fake news detection because classification methods are not specialized for fake news. With the integration of machine learning and text-based processing, we can detect fake news and build classifiers that can classify the news data. Text classification mainly focuses on extracting various features of text and after that incorporating those features into classification. The big challenge in this area is the lack of an efficient way to differentiate between fake and non-fake due to the unavailability of corpora. We applied three different machine learning classifiers on two publicly available datasets. Experimental analysis based on the existing dataset indicates a very encouraging and improved performance.

Keywords: fake news detection, natural language processing, machine learning, classification techniques.

Procedia PDF Downloads 69
9856 Composite Kernels for Public Emotion Recognition from Twitter

Authors: Chien-Hung Chen, Yan-Chun Hsing, Yung-Chun Chang


The Internet has grown into a powerful medium for information dispersion and social interaction that leads to a rapid growth of social media which allows users to easily post their emotions and perspectives regarding certain topics online. Our research aims at using natural language processing and text mining techniques to explore the public emotions expressed on Twitter by analyzing the sentiment behind tweets. In this paper, we propose a composite kernel method that integrates tree kernel with the linear kernel to simultaneously exploit both the tree representation and the distributed emotion keyword representation to analyze the syntactic and content information in tweets. The experiment results demonstrate that our method can effectively detect public emotion of tweets while outperforming the other compared methods.

Keywords: emotion recognition, natural language processing, composite kernel, sentiment analysis, text mining

Procedia PDF Downloads 106
9855 A Comparative Analysis of Hyper-Parameters Using Neural Networks for E-Mail Spam Detection

Authors: Syed Mahbubuz Zaman, A. B. M. Abrar Haque, Mehedi Hassan Nayeem, Misbah Uddin Sagor


Everyday e-mails are being used by millions of people as an effective form of communication over the Internet. Although e-mails allow high-speed communication, there is a constant threat known as spam. Spam e-mail is often called junk e-mails which are unsolicited and sent in bulk. These unsolicited emails cause security concerns among internet users because they are being exposed to inappropriate content. There is no guaranteed way to stop spammers who use static filters as they are bypassed very easily. In this paper, a smart system is proposed that will be using neural networks to approach spam in a different way, and meanwhile, this will also detect the most relevant features that will help to design the spam filter. Also, a comparison of different parameters for different neural network models has been shown to determine which model works best within suitable parameters.

Keywords: long short-term memory, bidirectional long short-term memory, gated recurrent unit, natural language processing, natural language processing

Procedia PDF Downloads 121