Search results for: Natural language grammar models
4614 Models and Metamodels for Computer-Assisted Natural Language Grammar Learning
Authors: Evgeny Pyshkin, Maxim Mozgovoy, Vladislav Volkov
Abstract:
The paper follows a discourse on computer-assisted language learning. We examine problems of foreign language teaching and learning and introduce a metamodel that can be used to define learning models of language grammar structures in order to support teacher/student interaction. Special attention is paid to the concept of a virtual language lab. Our approach to language education assumes to encourage learners to experiment with a language and to learn by discovering patterns of grammatically correct structures created and managed by a language expert.
Keywords: Computer-assisted instruction, Language learning, Natural language grammar models, HCI.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21914613 Structural Parsing of Natural Language Text in Tamil Using Phrase Structure Hybrid Language Model
Authors: Selvam M, Natarajan. A M, Thangarajan R
Abstract:
Parsing is important in Linguistics and Natural Language Processing to understand the syntax and semantics of a natural language grammar. Parsing natural language text is challenging because of the problems like ambiguity and inefficiency. Also the interpretation of natural language text depends on context based techniques. A probabilistic component is essential to resolve ambiguity in both syntax and semantics thereby increasing accuracy and efficiency of the parser. Tamil language has some inherent features which are more challenging. In order to obtain the solutions, lexicalized and statistical approach is to be applied in the parsing with the aid of a language model. Statistical models mainly focus on semantics of the language which are suitable for large vocabulary tasks where as structural methods focus on syntax which models small vocabulary tasks. A statistical language model based on Trigram for Tamil language with medium vocabulary of 5000 words has been built. Though statistical parsing gives better performance through tri-gram probabilities and large vocabulary size, it has some disadvantages like focus on semantics rather than syntax, lack of support in free ordering of words and long term relationship. To overcome the disadvantages a structural component is to be incorporated in statistical language models which leads to the implementation of hybrid language models. This paper has attempted to build phrase structured hybrid language model which resolves above mentioned disadvantages. In the development of hybrid language model, new part of speech tag set for Tamil language has been developed with more than 500 tags which have the wider coverage. A phrase structured Treebank has been developed with 326 Tamil sentences which covers more than 5000 words. A hybrid language model has been trained with the phrase structured Treebank using immediate head parsing technique. Lexicalized and statistical parser which employs this hybrid language model and immediate head parsing technique gives better results than pure grammar and trigram based model.Keywords: Hybrid Language Model, Immediate Head Parsing, Lexicalized and Statistical Parsing, Natural Language Processing, Parts of Speech, Probabilistic Context Free Grammar, Tamil Language, Tree Bank.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 36424612 Natural Language Database Interface for Selection of Data Using Grammar and Parsing
Authors: N. D. Karande, G. A. Patil
Abstract:
Databases have become ubiquitous. Almost all IT applications are storing into and retrieving information from databases. Retrieving information from the database requires knowledge of technical languages such as Structured Query Language (SQL). However majority of the users who interact with the databases do not have a technical background and are intimidated by the idea of using languages such as SQL. This has led to the development of a few Natural Language Database Interfaces (NLDBIs). A NLDBI allows the user to query the database in a natural language. This paper highlights on architecture of new NLDBI system, its implementation and discusses on results obtained. In most of the typical NLDBI systems the natural language statement is converted into an internal representation based on the syntactic and semantic knowledge of the natural language. This representation is then converted into queries using a representation converter. A natural language query is translated to an equivalent SQL query after processing through various stages. The work has been experimented on primitive database queries with certain constraints.
Keywords: Natural language database interface, representation converter, syntactic and semantic knowledge
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 27044611 Redundancy in Malay Morphology: School Grammar versus Corpus Grammar
Authors: Zaharani Ahmad, Nor Hashimah Jalaluddin
Abstract:
The aim of this paper is to examine and identify the issue of linguistic redundancy in two competing grammars of Malay, namely the school grammar and the corpus grammar. The former is a normative grammar which is formally and prescriptively taught in the classroom, whereas the latter is a descriptive grammar that is informally acquired and mastered by the students as native speakers of the language outside the classroom. Corpus grammar is depicted based on its actual used in natural occurring texts, as attested in the corpus. It is observed that the grammar taught in schools is incompatible with the grammar used in the corpus. For instance, a noun phrase containing nominal reduplicated form which denotes plurality (i.e. murid-murid ‘students’ which is derived from murid ‘student’) and a modifier categorized as quantifiers (i.e. semua ‘all’, seluruh ‘entire’, and kebanyakan ‘most’) is not acceptable in the school grammar because the formation (i.e. semua murid-murid ‘all the students’ kebanyakan pelajar-pelajar ‘most of the students’) is claimed to be redundant, and redundancy is prohibited in the grammar. Redundancy is generally construed as the property of speech and language by which more information is provided than is precisely required for the message to be understood, so that, if some information is omitted, the remaining information will still be sufficient for the message to be comprehended. Thus, the correct construction to be used is strictly the reduplicated form (i.e. murid-murid ‘students’) or the quantifier plus the root (i.e. semua murid ‘all the students’) with the intention that the grammatical meaning of plural is not repeated. Nevertheless, the so-called redundant form (i.e. kebanyakan pelajar-pelajar ‘most of the students’) is frequently used in the corpus grammar. This study shows that there are a number of redundant forms occur in the morphology of the language, particularly in affixation, reduplication and combination of both. Apparently, the so-called redundancy has grammatical and socio-cultural functions in communication that is to give emphasis and to stress the importance of the information delivered by the speakers or writers.
Keywords: Corpus grammar, morphology, redundancy, school grammar.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17904610 JaCoText: A Pretrained Model for Java Code-Text Generation
Authors: Jessica Lòpez Espejel, Mahaman Sanoussi Yahaya Alassan, Walid Dahhane, El Hassane Ettifouri
Abstract:
Pretrained transformer-based models have shown high performance in natural language generation task. However, a new wave of interest has surged: automatic programming language generation. This task consists of translating natural language instructions to a programming code. Despite the fact that well-known pretrained models on language generation have achieved good performance in learning programming languages, effort is still needed in automatic code generation. In this paper, we introduce JaCoText, a model based on Transformers neural network. It aims to generate java source code from natural language text. JaCoText leverages advantages of both natural language and code generation models. More specifically, we study some findings from the state of the art and use them to (1) initialize our model from powerful pretrained models, (2) explore additional pretraining on our java dataset, (3) carry out experiments combining the unimodal and bimodal data in the training, and (4) scale the input and output length during the fine-tuning of the model. Conducted experiments on CONCODE dataset show that JaCoText achieves new state-of-the-art results.
Keywords: Java code generation, Natural Language Processing, Sequence-to-sequence Models, Transformers Neural Networks.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8524609 Definition of a Computing Independent Model and Rules for Transformation Focused on the Model-View-Controller Architecture
Authors: Vanessa Matias Leite, Jandira Guenka Palma, Flávio Henrique de Oliveira
Abstract:
This paper presents a model-oriented development approach to software development in the Model-View-Controller (MVC) architectural standard. This approach aims to expose a process of extractions of information from the models, in which through rules and syntax defined in this work, assists in the design of the initial model and its future conversions. The proposed paper presents a syntax based on the natural language, according to the rules agreed in the classic grammar of the Portuguese language, added to the rules of conversions generating models that follow the norms of the Object Management Group (OMG) and the Meta-Object Facility MOF.
Keywords: Model driven architecture, model-view-controller, bnf syntax, model, transformation, UML.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9194608 PIELG: A Protein Interaction Extraction Systemusing a Link Grammar Parser from Biomedical Abstracts
Authors: Rania A. Abul Seoud, Nahed H. Solouma, Abou-Baker M. Youssef, Yasser M. Kadah
Abstract:
Due to the ever growing amount of publications about protein-protein interactions, information extraction from text is increasingly recognized as one of crucial technologies in bioinformatics. This paper presents a Protein Interaction Extraction System using a Link Grammar Parser from biomedical abstracts (PIELG). PIELG uses linkage given by the Link Grammar Parser to start a case based analysis of contents of various syntactic roles as well as their linguistically significant and meaningful combinations. The system uses phrasal-prepositional verbs patterns to overcome preposition combinations problems. The recall and precision are 74.4% and 62.65%, respectively. Experimental evaluations with two other state-of-the-art extraction systems indicate that PIELG system achieves better performance. For further evaluation, the system is augmented with a graphical package (Cytoscape) for extracting protein interaction information from sequence databases. The result shows that the performance is remarkably promising.Keywords: Link Grammar Parser, Interaction extraction, protein-protein interaction, Natural language processing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22534607 The Relationship between Iranian EFL Learners' Multiple Intelligences and Their Performance on Grammar Tests
Authors: Rose Shayeghi, Pejman Hosseinioun
Abstract:
The Multiple Intelligences theory characterizes human intelligence as a multifaceted entity that exists in all human beings with varying degrees. The most important contribution of this theory to the field of English Language Teaching (ELT) is its role in identifying individual differences and designing more learnercentered programs. The present study aims at investigating the relationship between different elements of multiple intelligence and grammar scores. To this end, 63 female Iranian EFL learner selected from among intermediate students participated in the study. The instruments employed were a Nelson English language test, Michigan Grammar Test, and Teele Inventory for Multiple Intelligences (TIMI). The results of Pearson Product-Moment Correlation revealed a significant positive correlation between grammatical accuracy and linguistic as well as interpersonal intelligence. The results of Stepwise Multiple Regression indicated that linguistic intelligence contributed to the prediction of grammatical accuracy.Keywords: Multiple intelligence, grammar, ELT, EFL, TIMI.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24184606 Age and Second Language Acquisition: A Case Study from Maldives
Authors: Aaidha Hammad
Abstract:
The age a child to be exposed to a second language is a controversial issue in communities such as the Maldives where English is taught as a second language. It has been observed that different stakeholders have different viewpoints towards the issue. Some believe that the earlier children are exposed to a second language, the better they learn, while others disagree with the notion. Hence, this case study investigates whether children learn a second language better when they are exposed at an earlier age or not. The spoken and written data collected confirm that earlier exposure helps in mastering the sound pattern and speaking fluency with more native-like accent, while a later age is better for learning more abstract and concrete aspects such as grammar and syntactic rules.Keywords: Age, development of language skills, fluency, second language acquisition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 36364605 The Sign in the Communication Process
Authors: S. Pesina, T. Solonchak
Abstract:
In the process of information transmission (concept verbalization) we deal mostly with the substance (contents), and then pay attention to the form. Recalling events from the remote past, often we cannot exactly reproduce specific heard or pronounced words, as well as the syntactic structures. We remember events, feelings, images; we recall the general contents of the discourse. The thought gets a specific language form only during the concept verbalization phase. With minimum time for pondering, depending on the language competence level, the grammar and syntactic shaping often occurs automatically with the use of famous models and stereotypes. This means that the language form adapts itself to the consciousness, and not vice versa.
Keywords: Lexical eidos, phenomenology, noema, polysemantic word, semantic core.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19414604 A Thai to English Machine Translation System Using Thai LFG Tree Structure as Interlingua
Authors: Tawee Chimsuk, Surapong Auwatanamongkol
Abstract:
Machine Translation (MT) between the Thai and English languages has been a challenging research topic in natural language processing. Most research has been done on English to Thai machine translation, but not the other way around. This paper presents a Thai to English Machine Translation System that translates a Thai sentence into interlingua of a Thai LFG tree using LFG grammar and a bottom up parser. The Thai LFG tree is then transformed into the corresponding English LFG tree by pattern matching and node transformation. Finally, an equivalent English sentence is created using structural information prescribed by the English LFG tree. Based on results of experiments designed to evaluate the performance of the proposed system, it can be stated that the system has been proven to be effective in providing a useful translation from Thai to English.
Keywords: Interlingua, LFG grammar, Machine translation, Pattern matching.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22944603 Role of Natural Language Processing in Information Retrieval; Challenges and Opportunities
Authors: Khaled M. Alhawiti
Abstract:
This paper aims to analyze the role of natural language processing (NLP). The paper will discuss the role in the context of automated data retrieval, automated question answer, and text structuring. NLP techniques are gaining wider acceptance in real life applications and industrial concerns. There are various complexities involved in processing the text of natural language that could satisfy the need of decision makers. This paper begins with the description of the qualities of NLP practices. The paper then focuses on the challenges in natural language processing. The paper also discusses major techniques of NLP. The last section describes opportunities and challenges for future research.
Keywords: Data Retrieval, Information retrieval, Natural Language Processing, Text Structuring.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 28334602 Morpho-Phonological Modelling in Natural Language Processing
Authors: Eleni Galiotou, Angela Ralli
Abstract:
In this paper we propose a computational model for the representation and processing of morpho-phonological phenomena in a natural language, like Modern Greek. We aim at a unified treatment of inflection, compounding, and word-internal phonological changes, in a model that is used for both analysis and generation. After discussing certain difficulties cuase by well-known finitestate approaches, such as Koskenniemi-s two-level model [7] when applied to a computational treatment of compounding, we argue that a morphology-based model provides a more adequate account of word-internal phenomena. Contrary to the finite state approaches that cannot handle hierarchical word constituency in a satisfactory way, we propose a unification-based word grammar, as the nucleus of our strategy, which takes into consideration word representations that are based on affixation and [stem stem] or [stem word] compounds. In our formalism, feature-passing operations are formulated with the use of the unification device, and phonological rules modeling the correspondence between lexical and surface forms apply at morpheme boundaries. In the paper, examples from Modern Greek illustrate our approach. Morpheme structures, stress, and morphologically conditioned phoneme changes are analyzed and generated in a principled way.
Keywords: Morpho-Phonology, Natural Language Processing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21284601 Native Language Identification with Cross-Corpus Evaluation Using Social Media Data: 'Reddit'
Authors: Yasmeen Bassas, Sandra Kuebler, Allen Riddell
Abstract:
Native Language Identification is one of the growing subfields in Natural Language Processing (NLP). The task of Native Language Identification (NLI) is mainly concerned with predicting the native language of an author’s writing in a second language. In this paper, we investigate the performance of two types of features; content-based features vs. content independent features when they are evaluated on a different corpus (using social media data “Reddit”). In this NLI task, the predefined models are trained on one corpus (TOEFL) and then the trained models are evaluated on a different data using an external corpus (Reddit). Three classifiers are used in this task; the baseline, linear SVM, and Logistic Regression. Results show that content-based features are more accurate and robust than content independent ones when tested within corpus and across corpus.
Keywords: NLI, NLP, content-based features, content independent features, social media corpus, ML.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4134600 Data Annotation Models and Annotation Query Language
Authors: Neerja Bhatnagar, Benjoe A. Juliano, Renee S. Renner
Abstract:
This paper presents data annotation models at five levels of granularity (database, relation, column, tuple, and cell) of relational data to address the problem of unsuitability of most relational databases to express annotations. These models do not require any structural and schematic changes to the underlying database. These models are also flexible, extensible, customizable, database-neutral, and platform-independent. This paper also presents an SQL-like query language, named Annotation Query Language (AnQL), to query annotation documents. AnQL is simple to understand and exploits the already-existent wide knowledge and skill set of SQL.Keywords: annotation query language, data annotations, data annotation models, semantic data annotations
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23534599 A NXM Version of 5X5 Playfair Cipher for any Natural Language (Urdu as Special Case)
Authors: Muhammad Salam, Nasir Rashid, Shah Khalid, Muhammad Raees Khan
Abstract:
In this paper a modified version NXM of traditional 5X5 playfair cipher is introduced which enable the user to encrypt message of any Natural language by taking appropriate size of the matrix depending upon the size of the natural language. 5X5 matrix has the capability of storing only 26 characters of English language and unable to store characters of any language having more than 26 characters. To overcome this limitation NXM matrix is introduced which solve this limitation. In this paper a special case of Urdu language is discussed. Where # is used for completing odd pair and * is used for repeating letters.
Keywords: cryptography, decryption, encryption, playfair cipher, traditional cipher.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21624598 A Sentence-to-Sentence Relation Network for Recognizing Textual Entailment
Authors: Isaac K. E. Ampomah, Seong-Bae Park, Sang-Jo Lee
Abstract:
Over the past decade, there have been promising developments in Natural Language Processing (NLP) with several investigations of approaches focusing on Recognizing Textual Entailment (RTE). These models include models based on lexical similarities, models based on formal reasoning, and most recently deep neural models. In this paper, we present a sentence encoding model that exploits the sentence-to-sentence relation information for RTE. In terms of sentence modeling, Convolutional neural network (CNN) and recurrent neural networks (RNNs) adopt different approaches. RNNs are known to be well suited for sequence modeling, whilst CNN is suited for the extraction of n-gram features through the filters and can learn ranges of relations via the pooling mechanism. We combine the strength of RNN and CNN as stated above to present a unified model for the RTE task. Our model basically combines relation vectors computed from the phrasal representation of each sentence and final encoded sentence representations. Firstly, we pass each sentence through a convolutional layer to extract a sequence of higher-level phrase representation for each sentence from which the first relation vector is computed. Secondly, the phrasal representation of each sentence from the convolutional layer is fed into a Bidirectional Long Short Term Memory (Bi-LSTM) to obtain the final sentence representations from which a second relation vector is computed. The relations vectors are combined and then used in then used in the same fashion as attention mechanism over the Bi-LSTM outputs to yield the final sentence representations for the classification. Experiment on the Stanford Natural Language Inference (SNLI) corpus suggests that this is a promising technique for RTE.Keywords: Deep neural models, natural language inference, recognizing textual entailment, sentence-to-sentence relation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14514597 3D Multi-User Virtual Environment in Language Teaching
Authors: Hana Maresova, Daniel Ecler, Miroslava Mensikova
Abstract:
This article focuses on the use of 3D multi-user virtual environment in language teaching and presents the results of a four-year research at the Palacky University Olomouc Faculty of Education (Czech Republic). Language teaching was conducted in an experimental form in the 3D virtual worlds of Second Life and Kitely (experimental group) and, in parallel to this, there was also traditional teaching conducted on identical topics in the form of lectures using a textbook (control group). The didactic test, which was presented to both of the groups in an identical form before the start of teaching and after its implementation, verified the effect of teaching in the experimental group by comparing the achieved results of both groups. Out of the three components of mother tongue teaching (grammar, literature, composition and communication education) students achieved partial better results (in the case of points focused on the visualization of the subject matter, these were statistically significant) in literature. Students from the control group performed better in grammar and composition. Based on the achieved results, we can state that the most appropriate use of multi-user virtual environment (MUVE) can be seen in teaching those topics that have the possibility of dramatization, experiential learning and group cooperation.
Keywords: 3D virtual reality, multiuser environments, online education, language education.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4724596 AnQL: A Query Language for Annotation Documents
Authors: Neerja Bhatnagar, Ben A. Juliano, Renee S. Renner
Abstract:
This paper presents data annotation models at five levels of granularity (database, relation, column, tuple, and cell) of relational data to address the problem of unsuitability of most relational databases to express annotations. These models do not require any structural and schematic changes to the underlying database. These models are also flexible, extensible, customizable, database-neutral, and platform-independent. This paper also presents an SQL-like query language, named Annotation Query Language (AnQL), to query annotation documents. AnQL is simple to understand and exploits the already-existent wide knowledge and skill set of SQL.
Keywords: Annotation query language, data annotations, data annotation models, semantic data annotations.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18394595 Context Detection in Spreadsheets Based on Automatically Inferred Table Schema
Authors: Alexander Wachtel, Michael T. Franzen, Walter F. Tichy
Abstract:
Programming requires years of training. With natural language and end user development methods, programming could become available to everyone. It enables end users to program their own devices and extend the functionality of the existing system without any knowledge of programming languages. In this paper, we describe an Interactive Spreadsheet Processing Module (ISPM), a natural language interface to spreadsheets that allows users to address ranges within the spreadsheet based on inferred table schema. Using the ISPM, end users are able to search for values in the schema of the table and to address the data in spreadsheets implicitly. Furthermore, it enables them to select and sort the spreadsheet data by using natural language. ISPM uses a machine learning technique to automatically infer areas within a spreadsheet, including different kinds of headers and data ranges. Since ranges can be identified from natural language queries, the end users can query the data using natural language. During the evaluation 12 undergraduate students were asked to perform operations (sum, sort, group and select) using the system and also Excel without ISPM interface, and the time taken for task completion was compared across the two systems. Only for the selection task did users take less time in Excel (since they directly selected the cells using the mouse) than in ISPM, by using natural language for end user software engineering, to overcome the present bottleneck of professional developers.Keywords: Natural language processing, end user development; natural language interfaces, human computer interaction, data recognition, dialog systems, spreadsheet.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11224594 A Computer Model of Language Acquisition – Syllable Learning – Based on Hebbian Cell Assemblies and Reinforcement Learning
Authors: Sepideh Fazeli, Fariba Bahrami
Abstract:
Investigating language acquisition is one of the most challenging problems in the area of studying language. Syllable learning as a level of language acquisition has a considerable significance since it plays an important role in language acquisition. Because of impossibility of studying language acquisition directly with children, especially in its developmental phases, computer models will be useful in examining language acquisition. In this paper a computer model of early language learning for syllable learning is proposed. It is guided by a conceptual model of syllable learning which is named Directions Into Velocities of Articulators model (DIVA). The computer model uses simple associational and reinforcement learning rules within neural network architecture which are inspired by neuroscience. Our simulation results verify the ability of the proposed computer model in producing phonemes during babbling and early speech. Also, it provides a framework for examining the neural basis of language learning and communication disorders.Keywords: Brain modeling, computer models, language acquisition, reinforcement learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15894593 Syntactic Recognition of Distorted Patterns
Authors: Marek Skomorowski
Abstract:
In syntactic pattern recognition a pattern can be represented by a graph. Given an unknown pattern represented by a graph g, the problem of recognition is to determine if the graph g belongs to a language L(G) generated by a graph grammar G. The so-called IE graphs have been defined in [1] for a description of patterns. The IE graphs are generated by so-called ETPL(k) graph grammars defined in [1]. An efficient, parsing algorithm for ETPL(k) graph grammars for syntactic recognition of patterns represented by IE graphs has been presented in [1]. In practice, structural descriptions may contain pattern distortions, so that the assignment of a graph g, representing an unknown pattern, to a graph language L(G) generated by an ETPL(k) graph grammar G is rejected by the ETPL(k) type parsing. Therefore, there is a need for constructing effective parsing algorithms for recognition of distorted patterns. The purpose of this paper is to present a new approach to syntactic recognition of distorted patterns. To take into account all variations of a distorted pattern under study, a probabilistic description of the pattern is needed. A random IE graph approach is proposed here for such a description ([2]).Keywords: Syntactic pattern recognition, Distorted patterns, Random graphs, Graph grammars.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13924592 A Survey of the Applications of Sentiment Analysis
Authors: Pingping Lin, Xudong Luo
Abstract:
Natural language often conveys emotions of speakers. Therefore, sentiment analysis on what people say is prevalent in the field of natural language process and has great application value in many practical problems. Thus, to help people understand its application value, in this paper, we survey various applications of sentiment analysis, including the ones in online business and offline business as well as other types of its applications. In particular, we give some application examples in intelligent customer service systems in China. Besides, we compare the applications of sentiment analysis on Twitter, Weibo, Taobao and Facebook, and discuss some challenges. Finally, we point out the challenges faced in the applications of sentiment analysis and the work that is worth being studied in the future.Keywords: Natural language processing, sentiment analysis, application, online comments.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9514591 Teaching English to Engineers: Between English Language Teaching and Psychology
Authors: Irina-Ana Drobot
Abstract:
Teaching English to Engineers is part of English for Specific Purposes, a domain which is under the attention of English students especially under the current conditions of finding jobs and establishing partnerships outside Romania. The paper will analyse the existing textbooks together with the teaching strategies they adopt. Teaching English to Engineering students can intersect with domains such as psychology and cultural studies in order to teach them efficiently. Textbooks for students of ESP, ranging from those at the Faculty of Economics to those at the Faculty of Engineers, have shifted away from using specialized vocabulary, drills for grammar and reading comprehension questions and toward communicative methods and the practical use of language. At present, in Romania, grammar is neglected in favour of communicative methods. The current interest in translation studies may indicate a return to this type of method, since only translation specialists can distinguish among specialized terms and determine which are most suitable in a translation. Engineers are currently encouraged to learn English in order to do their own translations in their own field. This paper will analyse the issue of the extent to which it is useful to teach Engineering students to do translations in their field using cognitive psychology applied to language teaching, including issues such as motivation and social psychology. Teaching general English to engineering students can result in lack of interest, but they can be motivated by practical aspects which will help them in their field. This is why this paper needs to take into account an interdisciplinary approach to teaching English to Engineers.
Keywords: Cognition, ESP, motivation, psychology.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 31234590 Aspect Oriented Software Architecture
Authors: Pradip Peter Dey, Ronald F. Gonzales, Gordon W. Romney, Mohammad Amin, Bhaskar Raj Sinha
Abstract:
Natural language processing systems pose a unique challenge for software architectural design as system complexity has increased continually and systems cannot be easily constructed from loosely coupled modules. Lexical, syntactic, semantic, and pragmatic aspects of linguistic information are tightly coupled in a manner that requires separation of concerns in a special way in design, implementation and maintenance. An aspect oriented software architecture is proposed in this paper after critically reviewing relevant architectural issues. For the purpose of this paper, the syntactic aspect is characterized by an augmented context-free grammar. The semantic aspect is composed of multiple perspectives including denotational, operational, axiomatic and case frame approaches. Case frame semantics matured in India from deep thematic analysis. It is argued that lexical, syntactic, semantic and pragmatic aspects work together in a mutually dependent way and their synergy is best represented in the aspect oriented approach. The software architecture is presented with an augmented Unified Modeling Language.Keywords: Language engineering, parsing, software design, user experience.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17424589 Estimation of Natural Frequency of the Bearing System under Periodic Force Based on Principal of Hydrodynamic Mass of Fluid
Authors: M. H. Pol, A. Bidi, A. V. Hoseini
Abstract:
Estimation of natural frequency of structures is very important and isn-t usually calculated simply and sometimes complicated. Lack of knowledge about that caused hard damage and hazardous effects. In this paper, with using from two different models in FEM method and based on hydrodynamic mass of fluids, natural frequency of an especial bearing (Fig. 1) in an electric field (or, a periodic force) is calculated in different stiffness and different geometric. In final, the results of two models and analytical solution are compared.Keywords: Natural frequency of the bearing, Hydrodynamic mass of fluid method.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 26454588 Tibyan Automated Arabic Correction Using Machine-Learning in Detecting Syntactical Mistakes
Authors: Ashwag O. Maghraby, Nida N. Khan, Hosnia A. Ahmed, Ghufran N. Brohi, Hind F. Assouli, Jawaher S. Melibari
Abstract:
The Arabic language is one of the most important languages. Learning it is so important for many people around the world because of its religious and economic importance and the real challenge lies in practicing it without grammatical or syntactical mistakes. This research focused on detecting and correcting the syntactic mistakes of Arabic syntax according to their position in the sentence and focused on two of the main syntactical rules in Arabic: Dual and Plural. It analyzes each sentence in the text, using Stanford CoreNLP morphological analyzer and machine-learning approach in order to detect the syntactical mistakes and then correct it. A prototype of the proposed system was implemented and evaluated. It uses support vector machine (SVM) algorithm to detect Arabic grammatical errors and correct them using the rule-based approach. The prototype system has a far accuracy 81%. In general, it shows a set of useful grammatical suggestions that the user may forget about while writing due to lack of familiarity with grammar or as a result of the speed of writing such as alerting the user when using a plural term to indicate one person.
Keywords: Arabic Language acquisition and learning, natural language processing, morphological analyzer, part-of-speech.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10434587 On the Relationship between Language Output and Second Language Acquisition
Authors: Haiyan Wang
Abstract:
Many researchers have been discussing the importance of language input in second language acquisition. The author holds that the bigger problem lies in how to activate language learners' language knowledge and raise their language output consciousness and competence. Analyzing the importance of language output based on theory and reality, this paper mainly explores the essence of language output and its revelation for second language acquisition in order to make second language learners really raise their communicative competence.
Keywords: Language output, second language acquisition, communicative competence.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 37024586 Sentiment Analysis of Fake Health News Using Naive Bayes Classification Models
Authors: Danielle Shackley, Yetunde Folajimi
Abstract:
As more people turn to the internet seeking health related information, there is more risk of finding false, inaccurate, or dangerous information. Sentiment analysis is a natural language processing technique that assigns polarity scores of text, ranging from positive, neutral and negative. In this research, we evaluate the weight of a sentiment analysis feature added to fake health news classification models. The dataset consists of existing reliably labeled health article headlines that were supplemented with health information collected about COVID-19 from social media sources. We started with data preprocessing, tested out various vectorization methods such as Count and TFIDF vectorization. We implemented 3 Naive Bayes classifier models, including Bernoulli, Multinomial and Complement. To test the weight of the sentiment analysis feature on the dataset, we created benchmark Naive Bayes classification models without sentiment analysis, and those same models were reproduced and the feature was added. We evaluated using the precision and accuracy scores. The Bernoulli initial model performed with 90% precision and 75.2% accuracy, while the model supplemented with sentiment labels performed with 90.4% precision and stayed constant at 75.2% accuracy. Our results show that the addition of sentiment analysis did not improve model precision by a wide margin; while there was no evidence of improvement in accuracy, we had a 1.9% improvement margin of the precision score with the Complement model. Future expansion of this work could include replicating the experiment process, and substituting the Naive Bayes for a deep learning neural network model.
Keywords: Sentiment analysis, Naive Bayes model, natural language processing, topic analysis, fake health news classification model.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4854585 On Dialogue Systems Based on Deep Learning
Authors: Yifan Fan, Xudong Luo, Pingping Lin
Abstract:
Nowadays, dialogue systems increasingly become the way for humans to access many computer systems. So, humans can interact with computers in natural language. A dialogue system consists of three parts: understanding what humans say in natural language, managing dialogue, and generating responses in natural language. In this paper, we survey deep learning based methods for dialogue management, response generation and dialogue evaluation. Specifically, these methods are based on neural network, long short-term memory network, deep reinforcement learning, pre-training and generative adversarial network. We compare these methods and point out the further research directions.Keywords: Dialogue management, response generation, reinforcement learning, deep learning, evaluation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 786