Search results for: natural language grammar models
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 14547

14457 A Model for Teaching Arabic Grammar in Light of the Common European Framework of Reference for Languages

Authors: Erfan Abdeldaim Mohamed Ahmed Abdalla

Abstract:

The complexity of Arabic grammar poses challenges for learners, particularly in relation to its arrangement, classification, abundance, and bifurcation. The challenge at hand is a result of the contextual factors that gave rise to the grammatical rules in question, as well as the pedagogical approach employed at the time, which was tailored to the needs of learners during that particular historical period. Consequently, modern-day students encounter this same obstacle. This requires a thorough examination of the arrangement and categorization of Arabic grammatical rules based on particular criteria, as well as an assessment of their objectives. Additionally, it is necessary to identify the prevalent and renowned grammatical rules, as well as those that are infrequently encountered, obscure and disregarded. This paper presents a compilation of grammatical rules that require arrangement and categorization in accordance with the standards outlined in the Common European Framework of Reference for Languages (CEFR). In addition to facilitating comprehension of the curriculum, accommodating learners' requirements, and establishing the fundamental competencies for achieving proficiency in Arabic, it is imperative to ascertain the conventions that language learners necessitate in alignment with explicitly delineated benchmarks such as the CEFR criteria. The aim of this study is to reduce the quantity of grammatical rules that are typically presented to non-native Arabic speakers in Arabic textbooks. This reduction is expected to enhance the motivation of learners to continue their Arabic language acquisition and to approach the level of proficiency of native speakers. The primary obstacle faced by learners is the intricate nature of Arabic grammar, which poses a significant challenge in the realm of study. The proliferation and complexity of regulations evident in Arabic language textbooks designed for individuals who are not native speakers is noteworthy. 
The inadequate organisation and delivery of the material create the impression that the grammar is being imparted to a student with the intention of memorising "Alfiyyat-Ibn-Malik." Consequently, the sequence of grammatical rules instruction was altered, with rules originally intended for later instruction being presented first and those intended for earlier instruction being presented subsequently. Students often focus on learning grammatical rules that are not necessarily required while neglecting the rules that are commonly used in everyday speech and writing. Non-Arab students are taught Arabic grammar chapters that are infrequently utilised in Arabic literature and may be a topic of debate among grammarians. The aforementioned findings are derived from the statistical analysis and investigations conducted by the researcher, which will be disclosed in due course of the research. To instruct non-Arabic speakers on grammatical rules, it is imperative to discern the most prevalent grammatical frameworks in grammar manuals and linguistic literature (study sample). The present proposal suggests the allocation of grammatical structures across linguistic levels, taking into account the guidelines of the CEFR, as well as the grammatical structures that are necessary for non-Arabic-speaking learners to generate a modern, cohesive, and comprehensible language.

Keywords: grammar, Arabic, functional, framework, problems, standards, statistical, popularity, analysis

Procedia PDF Downloads 58
14456 A Survey of the Applications of Sentiment Analysis

Authors: Pingping Lin, Xudong Luo

Abstract:

Natural language often conveys the emotions of speakers. Sentiment analysis of what people say is therefore prevalent in the field of natural language processing and has great application value in many practical problems. To help people understand this value, in this paper we survey various applications of sentiment analysis, including those in online and offline business as well as other types of applications. In particular, we give some application examples from intelligent customer service systems in China. We also compare the applications of sentiment analysis on Twitter, Weibo, Taobao and Facebook. Finally, we point out the challenges faced in applying sentiment analysis and the work that is worth studying in the future.

Keywords: application, natural language processing, online comments, sentiment analysis

Procedia PDF Downloads 226
14455 Multilingualism without a Dominant Language in the Preschool Age: A Case of Natural Italian-Russian-German-English Multilingualism

Authors: Legkikh Victoria

Abstract:

The purpose of maintaining bi/multilingualism is usually to let the child speak two or three languages at the same level. The main problem that normally appears is language mixing or the domination of one language. Equal proficiency in two or more languages would be ideal but is not easily reached in practice. An experiment was therefore conducted with a girl growing up with natural multilingualism, as an attempt to avoid a dominant language in the preschool age. The girl lives in Germany, and her main languages are Italian, Russian and German, but she also hears English every day. The ‘one parent – one language’ strategy was used from the beginning: Italian and Russian were spoken to her from birth, English was spoken between the parents, and German was added when she was 1.5 years old as the language of the nursery. In order to avoid a dominant language, she was always placed in international groups with activities in different languages. Even though it was not possible to avoid interference between the languages, in this case we can speak not only of natural multilingualism but also of balanced bilingualism in the preschool period. The languages have been developing in parallel, with different emphases in different periods. Now, at the age of 6, we can see natural horizontal Russian/Italian/German/English multilingualism. At the moment, her Russian/Italian bilingualism is balanced. Her German vocabulary is smaller, but the language is active, and her English is receptive. We can also see reciprocal interference among all three active languages (English is receptive, so simple phrases are normally said correctly, but they are not enough to judge the level of language interference, and no ‘English’ mistakes have been noticed in the other languages). After analysing the state of each language, we can identify both positive and negative results of the experiment.
As a positive result, we can see that at the age of 6 the girl does not refuse any language, three languages are active, she differentiates between languages, and even if she uses a word from another language she points out that it is not the correct word; most importantly, she does not have a preferred language. As proof of this last statement, we can note not only her self-identification as ‘half Russian and half Italian’ but also her answer to the question about her ‘mother tongue’: ‘I do not know, probably, when I have my own children I will speak one day Russian and one day Italian and sometimes German’. As a negative result, we can note not only that the development of all three languages is a little slower than expected for her age but also that, since she does not have a dominant language, she does not have a ‘perfect’ language either, and the interference is reciprocal. In any case, the experiment shows that it is possible to maintain at least two languages without a preference in a preschool multilingual environment.

Keywords: balanced bilingualism, language interference, natural multilingualism, preschool multilingual education

Procedia PDF Downloads 249
14454 Digitalisation of the Railway Industry: Recent Advances in the Field of Dialogue Systems: Systematic Review

Authors: Andrei Nosov

Abstract:

This paper discusses the development directions of dialogue systems within the digitalisation of the railway industry, where technologies based on conversational AI are already potentially applied or will be applied. Conversational AI is one of the popular natural language processing (NLP) tasks, as it has great prospects for real-world applications today. At the same time, it is a challenging task as it involves many areas of NLP based on complex computations and deep insights from linguistics and psychology. In this review, we focus on dialogue systems and their implementation in the railway domain. We comprehensively review the state-of-the-art research results on dialogue systems and analyse them from three perspectives: type of problem to be solved, type of model, and type of system. In particular, from the perspective of the type of tasks to be solved, we discuss characteristics and applications. This will help to understand how to prioritise tasks. In terms of the type of models, we give an overview that will allow researchers to become familiar with how to apply them in dialogue systems. By analysing the types of dialogue systems, we propose an unconventional approach in contrast to colleagues who traditionally contrast goal-oriented dialogue systems with open-domain systems. Our view focuses on considering retrieval and generative approaches. Furthermore, the work comprehensively presents evaluation methods and datasets for dialogue systems in the railway domain to pave the way for future research. Finally, some possible directions for future research are identified based on recent research results.

Keywords: digitalisation, railway, dialogue systems, conversational AI, natural language processing, natural language understanding, natural language generation

Procedia PDF Downloads 30
14453 Audio-Lingual Method and the English-Speaking Proficiency of Grade 11 Students

Authors: Marthadale Acibo Semacio

Abstract:

Speaking skill is a crucial part of English language teaching and learning. This actually shows the great importance of this skill in English language classes. Through speaking, ideas and thoughts are shared with other people, and a smooth interaction between people takes place. The study examined the levels of speaking proficiency of the control and experimental groups on pronunciation, grammatical accuracy, and fluency. As a quasi-experimental study, it also determined the presence or absence of significant changes in their speaking proficiency levels in terms of pronouncing the words correctly, the accuracy of grammar and fluency of a language given the two methods to the groups of students in the English language, using the traditional and audio-lingual methods. Descriptive and inferential statistics were employed according to the stated specific problems. The study employed a video presentation with prior information about it. In the video, the teacher acts as model one, giving instructions on what is going to be done, and then the students will perform the activity. The students were paired purposively based on their learning capabilities. Observing proper ethics, their performance was audio recorded to help the researcher assess the learner using the modified speaking rubric. The study revealed that those under the traditional method were more fluent than those in the audio-lingual method. With respect to the way in which each method deals with the feelings of the student, the audio-lingual one fails to provide a principle that would relate to this area and follows the assumption that the intrinsic motivation of the students to learn the target language will spring from their interest in the structure of the language. However, the speaking proficiency levels of the students were remarkably reinforced in reading different words through the aid of aural media with their teachers. 
The study concluded that the audio-lingual method is not a stand-alone method of teaching but only an aid to the teacher in helping the students improve their speaking proficiency in the English language. Hence, use of the audio-lingual approach is encouraged in teaching the English language, on top of the chalk-talk or traditional method, to improve the speaking proficiency of students.

Keywords: audio-lingual, speaking, grammar, pronunciation, accuracy, fluency, proficiency

Procedia PDF Downloads 38
14452 Deep-Learning to Generation of Weights for Image Captioning Using Part-of-Speech Approach

Authors: Tiago do Carmo Nogueira, Cássio Dener Noronha Vinhal, Gélson da Cruz Júnior, Matheus Rudolfo Diedrich Ullmann

Abstract:

Generating automatic image descriptions through natural language is a challenging task. Image captioning is a task that consistently describes an image by combining computer vision and natural language processing techniques. To accomplish this task, cutting-edge models use encoder-decoder structures. Thus, Convolutional Neural Networks (CNN) are used to extract the characteristics of the images, and Recurrent Neural Networks (RNN) generate the descriptive sentences of the images. However, cutting-edge approaches still suffer from problems of generating incorrect captions and accumulating errors in the decoders. To solve this problem, we propose a model based on the encoder-decoder structure, introducing a module that generates the weights according to the importance of the word to form the sentence, using the part-of-speech (PoS). Thus, the results demonstrate that our model surpasses state-of-the-art models.

Keywords: gated recurrent units, caption generation, convolutional neural network, part-of-speech

Procedia PDF Downloads 67
14451 Leveraging Natural Language Processing for Legal Artificial Intelligence: A Longformer Approach for Taiwanese Legal Cases

Authors: Hsin Lee, Hsuan Lee

Abstract:

Legal artificial intelligence (LegalAI) has seen increasing application within legal systems, propelled by advancements in natural language processing (NLP). Compared with general documents, legal case documents are typically long text sequences with intrinsic logical structures. Most existing language models have difficulty understanding the long-distance dependencies between different structures. Another unique challenge is that, while the Judiciary of Taiwan has released legal judgments from various levels of courts over the years, the lack of labeled datasets remains a significant obstacle. This deficiency makes it difficult to train models with strong generalization capabilities and to evaluate model performance accurately. To date, models in Taiwan have yet to be specifically trained on judgment data. Given these challenges, this research proposes a Longformer-based pre-trained language model explicitly devised for retrieving similar judgments in Taiwanese legal documents. This model is trained on a self-constructed dataset, which this research has independently labeled to measure judgment similarities, thereby addressing the void left by the lack of an existing labeled dataset for Taiwanese judgments. This research adopts strategies such as early stopping and gradient clipping to prevent overfitting and manage gradient explosion, respectively, thereby enhancing the model's performance. The model is evaluated using both the dataset and the Average Entropy of Offense-charged Clustering (AEOC) metric, which utilizes the notion of similar case scenarios within the same type of legal cases. Our experimental results illustrate our model's significant advancements in handling similarity comparisons within extensive legal judgments.
By enabling more efficient retrieval and analysis of legal case documents, our model holds the potential to facilitate legal research, aid legal decision-making, and contribute to the further development of LegalAI in Taiwan.
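
The abstract above names early stopping and gradient clipping without giving details. The following is a minimal, framework-free sketch of both ideas, not the authors' implementation: the function names, the choice of clipping by global L2 norm, and the patience-based stopping rule are all assumptions made for illustration.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so its L2 norm is at most max_norm.

    Assumption: clipping by global norm; the abstract does not say which
    clipping variant was used.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm == 0.0 or norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


def early_stopping_epoch(val_losses, patience):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve on the
    best value seen so far for `patience` consecutive epochs (hypothetical rule).
    """
    best = math.inf
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1


clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
stop_at = early_stopping_epoch([1.0, 0.8, 0.9, 0.85, 0.7], patience=2)
```

In this sketch, clipping limits the size of any single update (addressing gradient explosion), while the patience counter halts training before the model keeps fitting the training set after validation loss has stopped improving (addressing overfitting).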

Keywords: legal artificial intelligence, computation and language, language model, Taiwanese legal cases

Procedia PDF Downloads 43
14450 Teaching English to Engineers: Between English Language Teaching and Psychology

Authors: Irina-Ana Drobot

Abstract:

Teaching English to Engineers is part of English for Specific Purposes, a domain which is under the attention of English students especially under the current conditions of finding jobs and establishing partnerships outside Romania. The paper will analyse the existing textbooks together with the teaching strategies they adopt. Teaching English to Engineering students can intersect with domains such as psychology and cultural studies in order to teach them efficiently. Textbooks for students of ESP, ranging from those at the Faculty of Economics to those at the Faculty of Engineers, have shifted away from using specialized vocabulary, drills for grammar and reading comprehension questions and toward communicative methods and the practical use of language. At present, in Romania, grammar is neglected in favour of communicative methods. The current interest in translation studies may indicate a return to this type of method, since only translation specialists can distinguish among specialized terms and determine which are most suitable in a translation. Engineers are currently encouraged to learn English in order to do their own translations in their own field. This paper will analyse the issue of the extent to which it is useful to teach Engineering students to do translations in their field using cognitive psychology applied to language teaching, including issues such as motivation and social psychology. Teaching general English to engineering students can result in lack of interest, but they can be motivated by practical aspects which will help them in their field. This is why this paper needs to take into account an interdisciplinary approach to teaching English to Engineers.

Keywords: cognition, ESP, motivation, psychology

Procedia PDF Downloads 234
14449 ViraPart: A Text Refinement Framework for Automatic Speech Recognition and Natural Language Processing Tasks in Persian

Authors: Narges Farokhshad, Milad Molazadeh, Saman Jamalabbasi, Hamed Babaei Giglou, Saeed Bibak

Abstract:

The Persian language is an inflectional subject-object-verb language, which makes Persian a more uncertain language. However, techniques such as Zero-Width Non-Joiner (ZWNJ) recognition, punctuation restoration, and Persian Ezafe construction lead to a more understandable and precise language. Most work on Persian addresses these techniques individually. We believe, however, that all of these tasks are necessary for text refinement in Persian. In this work, we propose the ViraPart framework, which uses embedded ParsBERT at its core for text clarification. First, we use the BERT variant for Persian, followed by a classifier layer, for the classification procedures. Next, we combine the models' outputs to produce clear text. The proposed models for ZWNJ recognition, punctuation restoration, and Persian Ezafe construction achieve averaged macro F1 scores of 96.90%, 92.13%, and 98.50%, respectively. Experimental results show that our proposed approach is very effective in text refinement for the Persian language.
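
The scores above are reported as macro-averaged F1. As a reminder of what that metric computes, here is a small sketch in plain Python; the function name `macro_f1` and the toy labels are illustrative assumptions, not taken from the ViraPart work.

```python
def macro_f1(y_true, y_pred):
    """Compute macro-averaged F1: the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels (hypothetical): two classes, one error.
score = macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

Because every class contributes equally to the mean, macro F1 does not let a frequent class (for example, "no punctuation here") mask poor performance on rare classes.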

Keywords: Persian Ezafe, punctuation, ZWNJ, NLP, ParsBERT, transformers

Procedia PDF Downloads 168
14448 The Effect of Written Corrective Feedback on the Accurate Use of Grammatical Forms by Japanese Low-Intermediate EFL Learners

Authors: Ayako Hasegawa, Ken Ubukata

Abstract:

The purpose of this study is to investigate whether corrective feedback has any significant effect on Japanese low-intermediate EFL learners’ performance on a specific set of linguistic features. The subjects are Japanese college students majoring in English. They have studied English for about 7 years, but their inter-language seems to have fossilized, because non-target-like errors are frequently observed under the traditional deductive, teacher-fronted approach. It has been reported that corrective feedback plays an important role in diminishing or overcoming inter-language fossilization and achieving target language (TL) competency. Therefore, this study examined how corrective feedback (specifically, metalinguistic feedback) and self-correction raised the students’ awareness and helped them notice the gaps between their inter-language and the TL.

Keywords: written corrective feedback, fossilized error, grammar teaching, language teaching

Procedia PDF Downloads 330
14447 Creating Energy Sustainability in an Enterprise

Authors: John Lamb, Robert Epstein, Vasundhara L. Bhupathi, Sanjeev Kumar Marimekala

Abstract:

As we enter the new era of Artificial Intelligence (AI) and Cloud Computing, we rely mostly on the Machine Learning and Natural Language Processing capabilities of AI, and on energy-efficient hardware and software devices, in almost every industry sector. In these industry sectors, much emphasis is placed on developing new and innovative methods for producing and conserving energy and limiting the depletion of natural resources. The core pillars of sustainability are economic, environmental, and social, which are also informally referred to as the 3 P's (People, Planet and Profits). The 3 P's play a vital role in creating a core Sustainability Model in the Enterprise. Natural resources are continually being depleted, so there is more focus on and growing demand for renewable energy. With this growing demand, there is also a growing concern in many industries about how to reduce carbon emissions and conserve natural resources while adopting sustainability in corporate business models and policies. In our paper, we discuss the driving forces, such as climate change, natural disasters, pandemics, disruptive technologies, corporate policies, scaled business models, and emerging social media and AI platforms, that influence the three main pillars of sustainability (the 3 P's). Through this paper, we aim to offer an overall perspective on enterprise strategies, with a primary focus on bringing about cultural shifts in adopting energy-efficient operational models. Overall, many industries across the globe are incorporating core sustainability principles such as reducing energy costs, reducing greenhouse gas (GHG) emissions, reducing waste and increasing recycling, adopting advanced monitoring and metering infrastructure, and reducing server footprint and compute resources (shared IT services, cloud computing, and application modernization), with the vision of a sustainable environment.

Keywords: climate change, pandemic, disruptive technology, government policies, business model, machine learning and natural language processing, AI, social media platform, cloud computing, advanced monitoring, metering infrastructure

Procedia PDF Downloads 79
14446 Learning to Translate by Learning to Communicate to an Entailment Classifier

Authors: Szymon Rutkowski, Tomasz Korbak

Abstract:

We present a reinforcement-learning-based method of training neural machine translation models without parallel corpora. The standard encoder-decoder approach to machine translation suffers from two problems we aim to address. First, it needs parallel corpora, which are scarce, especially for low-resource languages. Second, its learning procedure lacks psychological plausibility: learning a foreign language is about learning to communicate useful information, not merely learning to transduce from one language’s 'encoding' to another. We instead pose the problem of learning to translate as learning a policy in a communication game between two agents: the translator and the classifier. The classifier is trained beforehand on a natural language inference task (determining the entailment relation between a premise and a hypothesis) in the target language. The translator produces a sequence of actions that correspond to generating translations of both the hypothesis and premise, which are then passed to the classifier. The translator is rewarded for the classifier’s performance in determining entailment between sentences translated by the translator into the disciple’s native language. The translator’s performance thus reflects its ability to communicate useful information to the classifier. In effect, we train a machine translation model without the need for parallel corpora altogether. While similar reinforcement learning formulations for zero-shot translation have been proposed before, we introduce a number of improvements. While prior research aimed at grounding the translation task in the physical world by evaluating agents on an image captioning task, we found that using a linguistic task is more sample-efficient. Natural language inference (also known as recognizing textual entailment) captures semantic properties of sentence pairs that are poorly correlated with semantic similarity, thus enforcing a basic understanding of the role played by compositionality.
It has been shown that models trained on recognizing textual entailment produce high-quality general-purpose sentence embeddings transferable to other tasks. We use the Stanford Natural Language Inference (SNLI) dataset as well as its analogous datasets for French (XNLI) and Polish (CDSCorpus). Textual entailment corpora can be obtained relatively easily for any language, which makes our approach more extensible to low-resource languages than traditional approaches based on parallel corpora. We evaluated a number of reinforcement learning algorithms (including policy gradients and actor-critic) to solve the problem of the translator’s policy optimization and found that our attempts yield some promising improvements over previous approaches to reinforcement-learning-based zero-shot machine translation.

Keywords: agent-based language learning, low-resource translation, natural language inference, neural machine translation, reinforcement learning

Procedia PDF Downloads 95
14445 Building an Opinion Dynamics Model from Experimental Data

Authors: Dino Carpentras, Paul J. Maher, Caoimhe O'Reilly, Michael Quayle

Abstract:

Opinion dynamics is a sub-field of agent-based modeling that focuses on people’s opinions and their evolution over time. Despite the rapid increase in the number of publications in this field, it is still not clear how to apply these models to real-world scenarios. Indeed, there is no agreement on how people update their opinion while interacting. Furthermore, it is not clear if different topics will show the same dynamics (e.g., more polarized topics may behave differently). These problems are mostly due to the lack of experimental validation of the models. Some previous studies started bridging this gap in the literature by directly measuring people’s opinions before and after the interaction. However, these experiments force people to express their opinion as a number instead of using natural language (and then, eventually, encoding it as numbers). This is not the way people normally interact, and it may strongly alter the measured dynamics. Another limitation of these studies is that they usually average all the topics together, without checking if different topics may show different dynamics. In our work, we collected data from 200 participants on 5 unpolarized topics. Participants expressed their opinions in natural language (“agree” or “disagree”). We also measured the certainty of their answer, expressed as a number between 1 and 10. However, this value was not shown to other participants to keep the interaction based on natural language. We then showed the opinion (and not the certainty) of another participant and, after a distraction task, we repeated the measurement. To make the data compatible with opinion dynamics models, we multiplied opinion and certainty to obtain a new parameter (here called “continuous opinion”) ranging from -10 to +10 (using agree=1 and disagree=-1). We first checked the 5 topics individually, finding that all of them behaved in a similar way despite having different initial opinion distributions.
This suggested that the same model could be applied to different unpolarized topics. We also observed that people tend to maintain similar levels of certainty, even when they change their opinion. This strongly violates what common models suggest, namely that people starting at, for example, +8 will first move towards 0 instead of jumping directly to -8. We also observed social influence, meaning that people exposed to “agree” were more likely to move to higher levels of continuous opinion, while people exposed to “disagree” were more likely to move to lower levels. However, we also observed that the effect of influence was smaller than the effect of random fluctuations. This configuration also differs from standard models, where noise, when present, is usually much smaller than the effect of social influence. Starting from this, we built an opinion dynamics model that explains more than 80% of the data variance. This model was also able to show the natural conversion of polarization from unpolarized states. This experimental approach offers a new way to build models grounded in experimental data. Furthermore, the model offers new insight into the fundamental terms of opinion dynamics models.
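
The “continuous opinion” encoding described in this abstract (opinion sign multiplied by certainty, with agree=1 and disagree=-1) can be sketched in a few lines. The function name and the input validation are illustrative assumptions; only the encoding itself comes from the abstract.

```python
def continuous_opinion(opinion: str, certainty: int) -> int:
    """Combine a stated opinion and a 1-10 certainty into one signed value.

    Encoding from the abstract: agree = +1, disagree = -1, multiplied by
    certainty, giving a value between -10 and +10.
    """
    if opinion not in ("agree", "disagree"):
        raise ValueError("opinion must be 'agree' or 'disagree'")
    if not 1 <= certainty <= 10:
        raise ValueError("certainty must be between 1 and 10")
    sign = 1 if opinion == "agree" else -1
    return sign * certainty


# A participant who switches sides while keeping the same certainty
# jumps across zero rather than drifting towards it.
before = continuous_opinion("agree", 8)
after = continuous_opinion("disagree", 8)
```

Since certainty is at least 1, this encoding never yields exactly 0, which is consistent with the reported observation that participants who changed opinion moved directly from one sign to the other.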

Keywords: experimental validation, micro-dynamics rule, opinion dynamics, update rule

Procedia PDF Downloads 80
14444 Transitivity System in Research Journal Articles

Authors: Noni Agustina, Nuryansyah Adijaya

Abstract:

Writing a research report plays an important role in the process of conducting research, especially when the report is written in English. A researcher should consider many language elements in a research report: grammar, word appropriateness, punctuation, etc. However, many researchers, especially non-native writers, face problems in writing research reports. This study aims to identify the characteristics of internationally published research journal articles from the viewpoint of functional grammar, especially the transitivity system. Six published research journal articles from the fields of English Language Teaching, linguistics, and medicine were taken as the data. Each field comprises research journal articles by native and non-native English-speaking authors. Qualitative content analysis was employed as the method of the study. The results show that all six published research journal articles, both native and non-native, use material and relational processes. The participants are dominated by goal, phenomenon, attribute, value, verbiage, and existent. They reflect the objectivity of research journal articles. Moreover, circumstances of place and quality occur more frequently. The transitivity system, which consists of process types, participants, and circumstances, plays a role in describing the characteristics of research journal articles.

Keywords: transitivity system, SFL, ideational meaning, research journal article

Procedia PDF Downloads 256
14443 Leveraging Unannotated Data to Improve Question Answering for French Contract Analysis

Authors: Touila Ahmed, Elie Louis, Hamza Gharbi

Abstract:

State-of-the-art question answering models have recently shown impressive performance, especially in a zero-shot setting. This approach is particularly useful when confronted with a highly diverse domain such as the legal field, in which it is increasingly difficult to have a dataset covering every notion and concept. In this work, we propose a flexible generative question answering approach to contract analysis as well as a weakly supervised procedure to leverage unannotated data and boost our models’ performance in general, and their zero-shot performance in particular.

Keywords: question answering, contract analysis, zero-shot, natural language processing, generative models, self-supervision

Procedia PDF Downloads 142
14442 Tibyan Automated Arabic Correction Using Machine-Learning in Detecting Syntactical Mistakes

Authors: Ashwag O. Maghraby, Nida N. Khan, Hosnia A. Ahmed, Ghufran N. Brohi, Hind F. Assouli, Jawaher S. Melibari

Abstract:

Arabic is one of the world's most important languages. Many people around the world learn it because of its religious and economic importance, and the real challenge lies in using it without grammatical or syntactical mistakes. This research focuses on detecting and correcting syntactic mistakes in Arabic according to their position in the sentence, and concentrates on two of the main syntactical rules in Arabic: the dual and the plural. It analyzes each sentence in the text using the Stanford CoreNLP morphological analyzer and a machine-learning approach to detect syntactical mistakes and then correct them. A prototype of the proposed system was implemented and evaluated. It uses a support vector machine (SVM) algorithm to detect Arabic grammatical errors and corrects them using a rule-based approach. The prototype system achieves an accuracy of 81%. In general, it offers a set of useful grammatical suggestions that the user may overlook while writing, whether through unfamiliarity with the grammar or through writing quickly, such as alerting the user when a plural term is used to refer to one person.

Keywords: Arabic language acquisition and learning, natural language processing, morphological analyzer, part-of-speech

Procedia PDF Downloads 121
14441 Literacy in First and Second Language: Implication for Language Education

Authors: Inuwa Danladi Bawa

Abstract:

One of the challenges African states have faced in developing education, past and present, is the problem of literacy. Literacy in the first language is seen as a strong base for the development of a second language, which is in most cases the language of education. Language development is an offshoot of language planning, so the need to develop literacy in both first and second languages affects language education and predicts the extent of achievement of the entire education sector. Balancing literacy acquisition in the first language so that it properly conditions the acquisition of the second language is paramount. Likely constraints include non-standardized, underdeveloped, and undeveloped first languages, among many others. Solutions to some of these include the development of materials and the use of the stages and levels of literacy acquisition. This rests on the belief that a child writes well in a second language if he or she is literate in the first language.

Keywords: first language, second language, literacy, English language, linguistics

Procedia PDF Downloads 404
14440 Development of a French to Yorùbá Machine Translation System

Authors: Benjamen Nathaniel, Eludiora Safiriyu Ijiyemi, Egume Oneme Lucky

Abstract:

A review of machine translation systems shows that many computational artefacts have been developed to translate written or spoken texts from a source language into Yorùbá. However, there is no work on a French-to-Yorùbá machine translation system; hence, this study investigated the process involved in translating French into its Yorùbá equivalent, with a view to adopting a rule-based MT approach to build a machine translation framework from simple sentences administered through a questionnaire. Articles and relevant textbooks were reviewed, and key speakers of both languages were interviewed, to find out the processes involved in translating French simple sentences into their Yorùbá equivalents using home-domain terminologies. To achieve this, a model was formulated using phrase grammar structure, re-write rules, parse trees, and automata-theory-based techniques, then designed and implemented with the Unified Modeling Language (UML) and the Python programming language, respectively. Analysis of the results showed that the machine translation system performed 18.45% above the Experimental Subject Respondent and 2.7% below the Linguistics Expert when assessed on word orthography, sentence syntax, and semantic correctness of the sentences. When compared with the Google machine translation system, the developed system performed better on lexicons of the target language.

Keywords: machine translation (MT), rule-based, French language, Yorùbá language

Procedia PDF Downloads 16
14439 Identifying Children at Risk for Specific Language Impairment Using a Wordless Picture Narrative: A Study on Hindi, an Indian Language

Authors: Yozna Gurung

Abstract:

This paper presents preliminary findings from an ongoing study on the use of Internal State Terms (IST) in the narratives produced by Hindi-English bilinguals, in an attempt to identify children at risk for Specific Language Impairment. Narratives were examined for macrostructure (story structure and story complexity) and internal state terms or mental state terms (IST/MST). Using a wordless picture narrative, 31 students generated stories based on six pictures that were matched for content and story structure in L1 (Hindi) and L2 (English). Of the 31 students sampled, 2 (6.45%) are at risk of Specific Language Impairment according to this study. They showed the least development in story grammar as well as in IST in both their languages.

Keywords: internal state terms, macrostructure, specific language impairment, wordless picture narrative

Procedia PDF Downloads 201
14438 Sentiment Analysis of Fake Health News Using Naive Bayes Classification Models

Authors: Danielle Shackley, Yetunde Folajimi

Abstract:

As more people turn to the internet seeking health-related information, there is more risk of finding false, inaccurate, or dangerous information. Sentiment analysis is a natural language processing technique that assigns polarity scores to text, classifying it as positive, neutral, or negative. In this research, we evaluate the weight of a sentiment analysis feature added to fake health news classification models. The dataset consists of existing, reliably labeled health article headlines supplemented with health information about COVID-19 collected from social media sources. We started with data preprocessing and tested various vectorization methods, such as Count and TF-IDF vectorization. We implemented three Naive Bayes classifier models: Bernoulli, Multinomial, and Complement. To test the weight of the sentiment analysis feature on the dataset, we created benchmark Naive Bayes classification models without sentiment analysis, then reproduced those same models with the feature added. We evaluated the models using precision and accuracy scores. The initial Bernoulli model achieved 90% precision and 75.2% accuracy, while the model supplemented with sentiment labels achieved 90.4% precision and stayed constant at 75.2% accuracy. Our results show that the addition of sentiment analysis did not improve model precision by a wide margin; while there was no evidence of improvement in accuracy, the precision score improved by 1.9% with the Complement model. Future expansion of this work could include replicating the experiment and replacing the Naive Bayes models with a deep learning neural network model.
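The two evaluation scores named in the abstract can behave differently, which is why a feature may raise precision while accuracy stays flat. The following is a minimal sketch in plain Python; the label vectors are invented toy data, not the study's dataset, and the function names are illustrative assumptions rather than the authors' code.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, fp, tn, fn


def precision(y_true, y_pred, positive=1):
    """Share of predicted positives that are truly positive."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


def accuracy(y_true, y_pred):
    """Share of all predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical toy labels: 1 = fake headline, 0 = reliable headline.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
baseline_pred = [1, 1, 1, 0, 1, 0, 0, 0]   # one false positive, one miss
augmented_pred = [1, 1, 0, 0, 0, 0, 0, 0]  # no false positives, two misses

for name, pred in [("baseline", baseline_pred), ("augmented", augmented_pred)]:
    print(name, precision(y_true, pred), accuracy(y_true, pred))
```

In this toy case both prediction vectors get six of eight headlines right, so accuracy is identical, while precision differs because only the first one raises a false alarm.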

Keywords: sentiment analysis, Naive Bayes model, natural language processing, topic analysis, fake health news classification model

Procedia PDF Downloads 66
14437 Enquiry Based Approaches to Teaching Grammar and Differentiation in the Senior Japanese Classroom

Authors: Julie Devine

Abstract:

This presentation will look at the approaches to teaching grammar taken over two years with students studying Japanese in the last two years of high school. The main focus is an enquiry-based approach to grammar introduction and a three-tier system that uses videos and online support material to allow for differentiation and personalised learning in the classroom. The aim is to create space for motivated students to do some higher-order activities, using the target pattern to solve problems and create scenarios. Less motivated students have time to complete basic exercises, and struggling students have some time with the teacher in smaller groups.

Keywords: differentiation, digital technologies, personalised learning plans, student engagement

Procedia PDF Downloads 135
14436 On Dialogue Systems Based on Deep Learning

Authors: Yifan Fan, Xudong Luo, Pingping Lin

Abstract:

Dialogue systems are increasingly becoming the way humans access many computer systems, allowing them to interact with computers in natural language. A dialogue system consists of three parts: understanding what humans say in natural language, managing the dialogue, and generating responses in natural language. In this paper, we survey deep-learning-based methods for dialogue management, response generation, and dialogue evaluation. Specifically, these methods are based on neural networks, long short-term memory networks, deep reinforcement learning, pre-training, and generative adversarial networks. We compare these methods and point out directions for further research.

Keywords: dialogue management, response generation, deep learning, evaluation

Procedia PDF Downloads 131
14435 Intercultural Communication in the Teaching of English as a Foreign Language in Malawi

Authors: Peter Mayeso Jiyajiya

Abstract:

This paper discusses how the teaching of English as a foreign language in Malawi can enhance intercultural communication competence in a multicultural society. It argues that incorporating intercultural communication into the teaching of English as a foreign language would improve cultural awareness in communication in multicultural Malawi. The teaching of English in Malawi is geared towards producing students who can communicate in the global world. This entails the use of proper pedagogical approaches and instructional materials that prepare students for intercultural awareness. In view of this, language teachers were interviewed in order to determine their instructional approaches to intercultural communication. Instructional materials were further evaluated to assess how interculturality is incorporated. The study found that teachers face perceptual and technical challenges that hinder them from exercising creativity to incorporate interculturality in their lessons. This is compounded by a lack of clear direction on cultural elements in the teaching materials. The paper, therefore, suggests a holistic approach to the teaching of English in Malawian schools, in which the diversity of culture in classrooms is treated as an opportunity for addressing students’ cultural needs that may be lacking in the instructional materials.

Keywords: cultural awareness, grammar, foreign language, intercultural communication, language teaching

Procedia PDF Downloads 301
14434 Making Use of Content and Language Integrated Learning for Teaching Entrepreneurship and Neuromarketing to Master Students: Case Study

Authors: Svetlana Polskaya

Abstract:

The study deals with the use of the Content and Language Integrated Learning (CLIL) concept in teaching Master's program students majoring in neuromarketing and entrepreneurship. Present-day employers expect young graduates to communicate professionally with their English-speaking peers and to demonstrate proper knowledge of the industry’s terminology and jargon. The idea of applying CLIL arose because these students already possess high proficiency in English and thus do not require further instruction in traditional grammar or lexis. A CLIL-type program was therefore devised, allowing learners to acquire new knowledge of entrepreneurship and neuromarketing while simultaneously honing their practical use of English. The case study analyzes the application of CLIL within this particular program as well as the experience accumulated in the process.

Keywords: CLIL, entrepreneurship, neuromarketing, foreign language acquisition, proficiency level

Procedia PDF Downloads 53
14433 Mood Choices and Modality Patterns in Donald Trump’s Inaugural Presidential Speech

Authors: Mary Titilayo Olowe

Abstract:

The controversies that trailed the political campaign and eventual choice of Donald Trump as the American president were so great that expectations were high as to what his inaugural speech would contain. Given that language is a dynamic vehicle for expressing intentions, the speech needs to be objectively assessed so as to access its content in the manner intended, through the three strands of meaning postulated by Systemic Functional Grammar (SFG): the ideational, the interpersonal, and the textual. The focus of this paper, however, is on the interpersonal meaning, which deals with how language exhibits social roles and relationships. This paper, therefore, attempts to analyse President Donald Trump’s inaugural speech to elicit the interpersonal meaning in it. The analysis is done from the perspective of mood and modality, which are housed in SFG. The results show that the mood choice, which is basically declarative, reveals an information-centered speech, while the high frequency of the modal verb operator ‘will’ shows President Donald Trump’s ability to establish an equal and reliant relationship with his audience, i.e., the Americans. In conclusion, the appeal of the speech to different levels of interpersonal meaning is largely responsible for its overall effectiveness. One can, therefore, understand the reason for the massive reaction it generated at the center of global discourse.

Keywords: interpersonal, modality, mood, systemic functional grammar

Procedia PDF Downloads 188
14432 Detecting Paraphrases in Arabic Text

Authors: Amal Alshahrani, Allan Ramsay

Abstract:

Paraphrasing, i.e., expressing the same concept in alternative ways using different words or phrases, is one of the important tasks in natural language processing. Paraphrases can be used in many natural language applications, such as Information Retrieval, Machine Translation, Question Answering, Text Summarization, or Information Extraction. To obtain pairs of sentences that are paraphrases, we create a system that automatically extracts paraphrases from a corpus built from different sources of news articles, since these are likely to contain paraphrases when they report the same event on the same day. There are existing simple standard approaches (e.g. TF-IDF vector space, cosine similarity) and alignment techniques (e.g. Dynamic Time Warping (DTW)) for extracting paraphrases, which have been applied to English. However, the performance of these approaches could be affected when they are applied to another language, for instance Arabic, due to the presence of phenomena that are not present in English, such as free word order, zero copula, and pro-dropping. These phenomena will affect the performance of these algorithms. Thus, if we can analyse how the existing algorithms for English fail for Arabic, then we can find a solution for Arabic. The results are promising.
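The contrast between the two families of methods named above can be sketched in a few lines. The code below is a self-contained illustration in plain Python, not the authors' system: it uses whitespace-tokenized English toy sentences, a smoothed IDF formula chosen for the example, and a 0/1 word-mismatch cost for DTW, all of which are assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each tokenized document to a sparse {term: weight} TF-IDF vector."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption) so terms found in every document keep weight.
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    return [
        {t: (c / len(doc)) * idf[t] for t, c in Counter(doc).items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def dtw_distance(a, b):
    """Dynamic Time Warping over two word sequences with a 0/1 mismatch cost."""
    inf = float("inf")
    d = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    d[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0.0 if a[i - 1] == b[j - 1] else 1.0
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[len(a)][len(b)]


# Toy sentences: s2 reorders s1, s3 is unrelated.
s1 = "the minister visited the city on monday".split()
s2 = "on monday the minister visited the city".split()
s3 = "stock prices fell sharply in asia".split()

vectors = tfidf_vectors([s1, s2, s3])
print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
print(dtw_distance(s1, s1), dtw_distance(s1, s2))
```

The bag-of-words cosine score treats the reordered pair as identical, whereas the order-sensitive DTW alignment penalizes it. This hints at why word-order freedom in a language could affect the two approaches differently.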

Keywords: natural language processing, TF-IDF, cosine similarity, dynamic time warping (DTW)

Procedia PDF Downloads 351
14431 From User's Requirements to UML Class Diagram

Authors: Zeineb Ben Azzouz, Wahiba Ben Abdessalem Karaa

Abstract:

The automated extraction of UML class diagrams from natural language requirements is a highly challenging task. Many approaches, frameworks, and tools have been presented in this field. Nonetheless, experiments with these tools have shown that no approach works best all the time. In this context, we propose a new, accurate approach to facilitate the automatic mapping from textual requirements to a UML class diagram. Our approach integrates the best properties of statistical Natural Language Processing (NLP) techniques to reduce ambiguity when analysing natural language requirements text. In addition, it follows the best practices defined by conceptual modelling experts to determine some patterns indispensable for the extraction of the basic elements and concepts of the class diagram. Once the relevant information for the class diagram is captured, an XMI document is generated and imported into a CASE tool to build the corresponding UML class diagram.

Keywords: class diagram, user’s requirements, XMI, software engineering

Procedia PDF Downloads 440
14430 Recurrent Neural Networks with Deep Hierarchical Mixed Structures for Chinese Document Classification

Authors: Zhaoxin Luo, Michael Zhu

Abstract:

Natural languages always contain complex semantic hierarchies. Obtaining a feature representation based on these hierarchies is key to the success of a model. Several RNN models have recently been proposed that use latent indicators to obtain the hierarchical structure of documents. However, a model that uses only a single-layer latent indicator cannot capture the true hierarchical structure of a language, especially a complex language like Chinese. In this paper, we propose a deep layered model that stacks arbitrarily many RNN layers equipped with latent indicators. By using EM and training hierarchically, our model solves the computational problem of stacking RNN layers and makes it possible to stack arbitrarily many of them. Our deep hierarchical model not only achieves results comparable to large pre-trained models on the Chinese short text classification problem but also achieves state-of-the-art results on the Chinese long text classification problem.

Keywords: natural language processing, recurrent neural network, hierarchical structure, document classification, Chinese

Procedia PDF Downloads 32
14429 The Significance of Computer Assisted Language Learning in Teaching English Grammar in Tribal Zone of Chhattisgarh

Authors: Yogesh Kumar Tiwari

Abstract:

Chhattisgarh has realized the fundamental role of information and communication technology in a globalized world where knowledge is central to growth and intellectual development. These technologies are spreading so widely that one feels left behind if not using them. Their influence has reached all aspects of the educational, business, and economic sectors of our world. Undeniably, the computer has not only established itself globally in all walks of life but has also acquired a role of paramount importance in the educational process. This role is becoming more pervasive and powerful as computers become cheaper, smaller, more adaptable, and easier to handle. Computers are becoming indispensable to teachers because of their enormous capabilities. This study aims to observe the effect of using a computer-based English language software program on the achievement of undergraduate students studying in a tribal area, Sarguja Division, Chhattisgarh, India. To test the effect of innovative teaching in the graduate classroom in a tribal area, 50 students were randomly selected and separated into two groups. The first group of 25 students was taught English grammar, i.e., passive voice/narration, through the traditional method, using chalk and blackboard and asking some formal questions. The second group, the experimental one, was taught English grammar, i.e., passive voice/narration, using a computer and projector with a PowerPoint presentation of grammatical items. Statistical analysis was carried out on the students’ learning capacities and achievement. The results were striking not only for the teacher but also for the students. The recapitulation showed that the students of the experimental group answered the questions enthusiastically and with an innovative sense of learning.
In light of the findings of the study, it is recommended that teachers and professors of English use self-made instructional programs in their teaching, particularly in tribal areas.

Keywords: achievement, computer assisted language learning, use of instructional program

Procedia PDF Downloads 122
14428 Passive Voice in SLA: Armenian Learners’ Case Study

Authors: Emma Nemishalyan

Abstract:

It is believed that learners’ mother tongue (L1 hereafter) has a huge impact on their second language (L2 hereafter) acquisition. This hypothesis has attracted both support and criticism. Based on research on a wide range of learner corpora (Chinese, Japanese, and Spanish, among others), the hypothesis has been either confirmed or rejected. However, no such study has been conducted on Armenian learners. The aim of this paper is to examine the implications of the hypothesis for an Armenian learners’ corpus in terms of the use of the passive voice. To this end, the method of Contrastive Interlanguage Analysis (hereafter CIA) was applied to a native speakers’ corpus, the Louvain Corpus of Native English Essays (LOCNESS), and an Armenian learners’ corpus compiled by the author in compliance with the International Corpus of Learner English (ICLE) guidelines. CIA compares interlanguage (the language produced by learners) with the language produced by native speakers. With the help of this method, it is possible not only to highlight the mistakes that learners make but also to reveal underuse and overuse. The choice of grammar issue (the passive voice) is motivated by the fact that Armenian and English are typologically very different, as they belong to different branches. Moreover, the passive voice is considered one of the most problematic grammar topics for learners of English to acquire. Based on this difference, we hypothesized that Armenian learners would either overuse or underuse some types of the passive voice. Using LancsBox software, we first identified the frequency of passive voice usage in LOCNESS and in the Armenian learners’ corpus to determine whether the learners show the same usage pattern of the passive voice as native speakers. Second, we identified the types of passive voice used by the Armenian learners and tried to trace the reasons to their mother tongue.
The results of the study showed that Armenian learners underused the passive voice compared with native speakers. Furthermore, the hypothesis that learners’ L1 has an impact on their L2 acquisition and production was confirmed.
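Because the two corpora differ in size, underuse or overuse has to be judged on normalized rather than raw frequencies. The sketch below shows one common way to do this in Python. The counts and corpus sizes are invented for illustration only, and the binary log ratio is a widely used effect-size measure that the abstract does not itself name.

```python
import math


def per_thousand(count, corpus_size):
    """Relative frequency of a construction per 1,000 words."""
    return 1000.0 * count / corpus_size


def log_ratio(count_a, size_a, count_b, size_b):
    """Binary log of the ratio of relative frequencies (A relative to B).

    Negative values mean the construction is rarer in corpus A (underuse),
    positive values mean it is more frequent (overuse).
    """
    return math.log2((count_a / size_a) / (count_b / size_b))


# Hypothetical figures, NOT taken from the study.
learner_passives, learner_words = 400, 50_000
native_passives, native_words = 1_600, 100_000

print(per_thousand(learner_passives, learner_words))   # learner rate
print(per_thousand(native_passives, native_words))     # native rate
print(log_ratio(learner_passives, learner_words,
                native_passives, native_words))
```

With these made-up numbers the learner rate is half the native rate, which gives a log ratio of -1, i.e. underuse by a factor of two.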

Keywords: corpus linguistics, applied linguistics, second language acquisition, corpus compilation

Procedia PDF Downloads 55