Search results for: natural language grammar models
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 14549

14369 Prospects in Teaching Arabic Grammatical Structures to Non-Arab Learners

Authors: Yahya Toyin Muritala, Nonglaksana Kama, Ahmad Yani

Abstract:

The aim of the paper is to investigate various linguistic techniques for enhancing and facilitating the acquisition of practical knowledge of Arabic grammatical structures among non-Arab learners of standard classical Arabic in non-Arabic-speaking academic settings, in the context of the current growth of internationalism and cultural integration in some higher institutions. As the nature of the project requires standard investigations into the unique principal features of Arabic structures and their implications, the findings of the research suggest some principles to follow in solving the problems learners face while acquiring grammatical aspects of the Arabic language. The work also concentrates on the structural features of the language in terms of inflection/parsing, structural arrangement order, functional particles, morphological formation, conformity, etc. The grammar of Arabic, which has passed through major stages, from its early evolution in the classical period through the era of stagnation and development to the modern stage of revitalization, is therefore the main subject matter of the paper, as it is globally connected with communication and with the religion of Islam practiced by millions of Arabs and non-Arabs today. The conclusion of the work presents new findings, obtained through descriptive and analytical methods, on teaching language for the purpose of effective global communication, with a focus on methods of second language acquisition by application.

Keywords: language structure, Arabic grammar, classical Arabic, intercultural communication, non-Arabic speaking environment and prospects

Procedia PDF Downloads 373
14368 A BERT-Based Model for Financial Social Media Sentiment Analysis

Authors: Josiel Delgadillo, Johnson Kinyua, Charles Mutigwe

Abstract:

The purpose of sentiment analysis is to determine the sentiment strength (e.g., positive, negative, neutral) from a textual source for good decision-making. Natural language processing in domains such as financial markets requires knowledge of domain ontology, and pre-trained language models, such as BERT, have made significant breakthroughs in various NLP tasks by training on large-scale unlabeled generic corpora such as Wikipedia. However, sentiment analysis is a strongly domain-dependent task. The rapid growth of social media has given users a platform to share their experiences and views about products, services, and processes, including financial markets. StockTwits and Twitter are social networks that allow the public to express their sentiments in real time. Hence, leveraging the success of unsupervised pre-training and the large amount of financial text available on social media platforms could potentially benefit a wide range of financial applications. This work is focused on sentiment analysis using social media text on platforms such as StockTwits and Twitter. To meet this need, SkyBERT, a domain-specific language model pre-trained and fine-tuned on financial corpora, has been developed. The results show that SkyBERT outperforms current state-of-the-art models in financial sentiment analysis. Extensive experimental results demonstrate the effectiveness and robustness of SkyBERT.

Keywords: BERT, financial markets, Twitter, sentiment analysis

Procedia PDF Downloads 118
14367 Syntax-Related Problems of Translation

Authors: Anna Kesoyan

Abstract:

The present paper deals with the syntax-related problems of translation from English into Armenian. Although syntax is a part of grammar, syntax-related problems of translation are studied separately during the process of translation. Translation from one language to another is widely accepted as a challenging problem. This becomes even more challenging when the source and target languages are widely different in structure and style, as is the case with English and Armenian. Syntax-related problems of translation from English into Armenian are mainly connected with the syntactical structures of these languages, and particularly with the word order of the sentence. The word order of Armenian, which is a synthetic language, is usually characterized as “rather free”, and the word order of English, which is an analytical language, is characterized as “fixed”. The research examines the main translation means, particularly syntactical transformations, as the translator has to take real steps while trying to solve certain syntax-related problems. Most of the means of translation are based on the transformation of grammatical components of the sentence, without changing the main information of the text. Several transformations occur during translation, such as changes to the word order of the sentence and transformations of certain grammatical constructions, including the infinitive participial construction, the nominative with the infinitive, and elliptical constructions, all of which are covered in this research.

Keywords: elliptical constructions, nominative with the infinitive constructions, fixed and free word order, syntactic structures

Procedia PDF Downloads 412
14366 Using Bidirectional Encoder Representations from Transformers to Extract Topic-Independent Sentiment Features for Social Media Bot Detection

Authors: Maryam Heidari, James H. Jones Jr.

Abstract:

Millions of online posts about different topics and products are shared on popular social media platforms. One use of this content is to provide crowd-sourced information about a specific topic, event or product. However, this use raises an important question: what percentage of information available through these services is trustworthy? In particular, might some of this information be generated by a machine, i.e., a bot, instead of a human? Bots can be, and often are, purposely designed to generate enough volume to skew an apparent trend or position on a topic, yet the consumer of such content cannot easily distinguish a bot post from a human post. In this paper, we introduce a model for social media bot detection which uses Bidirectional Encoder Representations from Transformers (Google BERT) for sentiment classification of tweets to identify topic-independent features. Our use of a Natural Language Processing approach to derive topic-independent features for our new bot detection model distinguishes this work from previous bot detection models. We achieve 94% accuracy classifying the contents of data as generated by a bot or a human, where the most accurate prior work achieved accuracy of 92%.

Keywords: bot detection, natural language processing, neural network, social media

Procedia PDF Downloads 88
14365 Progress in Combining Image Captioning and Visual Question Answering Tasks

Authors: Prathiksha Kamath, Pratibha Jamkhandi, Prateek Ghanti, Priyanshu Gupta, M. Lakshmi Neelima

Abstract:

Combining Image Captioning and Visual Question Answering (VQA) tasks has emerged as a new and exciting research area. The image captioning task involves generating a textual description that summarizes the content of the image. VQA aims to answer a natural language question about the image. Both tasks involve computer vision and natural language processing (NLP) and require a deep understanding of the content of the image and the semantic relationships within it, as well as the ability to generate a response in natural language. There has been remarkable growth in both tasks with the rapid advancement of deep learning. In this paper, we present a comprehensive review of recent progress in combining image captioning and VQA tasks. We first discuss image captioning and VQA individually and then the various ways in which the two tasks can be integrated. We also analyze the challenges associated with these tasks and ways to overcome them. We finally discuss the various datasets and evaluation metrics used in these tasks. The paper concludes with the need for generating captions based on context, and captions that are able to answer the most likely questions about the image, so as to aid the VQA task. Overall, this review highlights the significant progress made in combining image captioning and VQA, as well as the ongoing challenges and opportunities for further research in this exciting and rapidly evolving field, which has the potential to improve the performance of real-world applications such as autonomous vehicles, robotics, and image search.

Keywords: image captioning, visual question answering, deep learning, natural language processing

Procedia PDF Downloads 46
14364 Enhancing Technical Trading Strategy on the Bitcoin Market using News Headlines and Language Models

Authors: Mohammad Hosein Panahi, Naser Yazdani

Abstract:

We present a technical trading strategy that leverages the FinBERT language model and financial news analysis, with a focus on news related to a subset of Nasdaq 100 stocks. Our approach surpasses the baseline Range Break-out strategy in the Bitcoin market, yielding a remarkable 24.8% increase in the win ratio for all Friday trades and an impressive 48.9% surge in short trades specifically on Fridays. Moreover, we conduct rigorous hypothesis testing to establish the statistical significance of these improvements. Our findings underscore the considerable potential of our NLP-driven approach in enhancing trading strategies and achieving greater profitability within financial markets.

Keywords: quantitative finance, technical analysis, bitcoin market, NLP, language models, FinBERT, technical trading

Procedia PDF Downloads 30
14363 Resource Framework Descriptors for Interestingness in Data

Authors: C. B. Abhilash, Kavi Mahesh

Abstract:

Human beings are the most advanced species on earth; it's all because of the ability to communicate and share information via human language. In today's world, a huge amount of data is available on the web in text format. This has also resulted in the generation of big data in structured and unstructured formats. In general, the data is in the textual form, which is highly unstructured. To get insights and actionable content from this data, we need to incorporate the concepts of text mining and natural language processing. In our study, we mainly focus on Interesting data through which interesting facts are generated for the knowledge base. The approach is to derive the analytics from the text via the application of natural language processing. Using semantic web Resource framework descriptors (RDF), we generate the triple from the given data and derive the interesting patterns. The methodology also illustrates data integration using the RDF for reliable, interesting patterns.
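
As a rough illustration of the triple idea mentioned in this abstract, the sketch below stores facts as (subject, predicate, object) tuples and filters them by pattern. It is not the authors' pipeline: the `ex:` identifiers, the sample facts, and the `match` helper are all hypothetical, and a plain Python list stands in for a real RDF store.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical facts invented for this sketch; not taken from the paper.
TRIPLES: List[Triple] = [
    ("ex:doc1", "ex:mentions", "ex:TopicA"),
    ("ex:doc2", "ex:mentions", "ex:TopicA"),
    ("ex:doc2", "ex:mentions", "ex:TopicB"),
    ("ex:doc1", "ex:writtenBy", "ex:AuthorX"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that agree with every non-None field of the pattern."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Which subjects mention ex:TopicA?
subjects = [t[0] for t in match(TRIPLES, p="ex:mentions", o="ex:TopicA")]
print(subjects)
```

Leaving a field as `None` acts as a wildcard, which loosely mirrors how a variable behaves in a triple pattern query.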

Keywords: RDF, interestingness, knowledge base, semantic data

Procedia PDF Downloads 124
14362 English Pronunciation Materials on TikTok

Authors: Sebastian Leal-Arenas

Abstract:

TikTok’s influence on contemporary society is undeniable. The impact of the mobile app transcends entertainment, as shown by the growing presence of specialized accounts dedicated to providing educational content, particularly as it pertains to language learning. However, the prevailing trend on the platform is vocabulary and grammar acquisition, neglecting a critical component: pronunciation. This study examines English pronunciation materials available on TikTok by taking a comprehensive approach that incorporates established assessment tools, such as the Learning Object Review Instrument and the Framework for Language Learning App Evaluation. Furthermore, novel evaluation categories are introduced to provide a more holistic assessment of these educational resources. 60 English pronunciation videos were part of the analysis. The findings reveal that these audio-visual materials present clear audio bolstered by high-quality video content and automatically generated closed captions. These three components enhance the comprehensibility of the input, making these concise videos valuable assets for language learners. Nevertheless, certain deficiencies are observed, such as the lack of emphasis on specific segments and their relationship with articulators. Improvements and refinements are discussed, as well as their potential utility within the language classroom. This study contributes to the ongoing investigation of multimedia materials used for language teaching and emphasizes the need to adapt pronunciation instruction methods to today’s technology.

Keywords: pronunciation, segments, teaching materials, technology

Procedia PDF Downloads 40
14361 Variational Explanation Generator: Generating Explanation for Natural Language Inference Using Variational Auto-Encoder

Authors: Zhen Cheng, Xinyu Dai, Shujian Huang, Jiajun Chen

Abstract:

Recently, explanatory natural language inference has attracted much attention for the interpretability of logic relationship prediction, which is also known as explanation generation for Natural Language Inference (NLI). Existing explanation generators based on a discriminative Encoder-Decoder architecture have achieved noticeable results. However, we find that these discriminative generators usually generate explanations with correct evidence but incorrect logic semantics. This is because logic information is implicitly encoded in the premise-hypothesis pairs and is difficult to model. In fact, the same logic information exists in both the premise-hypothesis pair and the explanation, and it is easy to extract logic information that is explicitly contained in the target explanation. Hence we assume that there exists a latent space of logic information while generating explanations. Specifically, we propose a generative model called Variational Explanation Generator (VariationalEG) with a latent variable to model this space. Trained with the guidance of the explicit logic information in target explanations, the latent variable in VariationalEG can capture the implicit logic information in premise-hypothesis pairs effectively. Additionally, to tackle the problem of posterior collapse while training VariationalEG, we propose a simple yet effective approach called Logic Supervision on the latent variable to force it to encode logic information. Experiments on the explanation generation benchmark, explanation-Stanford Natural Language Inference (e-SNLI), demonstrate that the proposed VariationalEG achieves significant improvement compared to previous studies and yields a state-of-the-art result. Furthermore, we analyze the generated explanations to demonstrate the effect of the latent variable.
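
For readers unfamiliar with the setup, the generic conditional variational auto-encoder bound can be written as follows. This is the textbook form, not necessarily the exact objective used in the paper; the symbols are assumptions introduced here: $x$ for the premise-hypothesis pair, $y$ for the target explanation, $z$ for the latent variable, and $\theta$, $\phi$ for the generator and inference parameters.

```latex
\log p_\theta(y \mid x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x, y)}\!\left[ \log p_\theta(y \mid x, z) \right]
\;-\; \mathrm{KL}\!\left( q_\phi(z \mid x, y) \,\big\|\, p_\theta(z \mid x) \right)
```

Posterior collapse corresponds to the KL term shrinking to zero, so that the decoder ignores $z$; an auxiliary supervision signal on $z$, as the abstract describes, is one way to keep the latent variable informative.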

Keywords: natural language inference, explanation generation, variational auto-encoder, generative model

Procedia PDF Downloads 121
14360 Analysis of Speaking Skills in Turkish Language Acquisition as a Foreign Language

Authors: Lokman Gozcu, Sule Deniz Gozcu

Abstract:

This study aims to analyze the skills of speaking in the acquisition of Turkish as a foreign language. One of the most important things for the individual who learns a foreign language is to be successful in the oral communication (speaking) skills and to interact in an understandable way. Speech skill requires much more time and effort than other language skills. In this direction, it is necessary to make an analysis of these oral communication skills, which is important in Turkish language acquisition as a foreign language and to draw out a road map according to the result. The aim of this study is to determine the competence and attitudes of speaking competence according to the individuals who learn Turkish as a foreign language and to be considered as speaking skill elements; Grammar, emphasis, intonation, body language, speed, ranking, accuracy, fluency, pronunciation, etc. and the results and suggestions based on these determinations. A mixed method has been chosen for data collection and analysis. A Likert scale (for competence and attitude) was applied to 190 individuals who were interviewed face-to-face (for speech skills) with a semi-structured interview form about 22 participants randomly selected. In addition, the observation form related to the 22 participants interviewed were completed by the researcher during the interview, and after the completion of the collection of all the voice recordings, analyses of voice recordings with the speech skills evaluation scale was made. The results of the research revealed that the speech skills of the individuals who learned Turkish as a foreign language have various perspectives. According to the results, the most inadequate aspects of the participants' ability to speak in Turkish include vocabulary, using humorous elements while speaking Turkish, being able to include items such as idioms and proverbs while speaking Turkish, Turkish fluency respectively. 
In addition, the participants were found not to feel comfortable while speaking Turkish, to feel ridiculous, and to be nervous while speaking in formal settings. Conclusions and suggestions are offered for the situations that emerged from the analyses.

Keywords: learning Turkish as a foreign language, proficiency criteria, phonetic (modalities), speaking skills

Procedia PDF Downloads 213
14359 ‘Non-Legitimate’ Voices as L2 Models: Towards Becoming a Legitimate L2 Speaker

Authors: M. Rilliard

Abstract:

Based on a Multiliteracies-inspired and sociolinguistically-informed advanced French composition class, this study employed autobiographical narratives from speakers traditionally considered non-legitimate models for L2 teaching, with the aim of inspiring students to develop an authentic L2 voice and to see themselves as legitimate L2 speakers. Students explored their L2 identities in French through a self-inspired fictional character. Two autobiographical narratives of identity quest by non-traditional French speakers provided them guidance through this process: the novel Le Bleu des Abeilles (2013) and the film Qu’Allah Bénisse la France (2014). Written and oral productions in French for different genres, as well as metalinguistic reflections in English, were collected and analyzed. Results indicate that ideas and materials that were relatable to students, namely relatable experiences and relatable language, were most useful to them in developing their L2 voices and achieving authentic and legitimate L2 speakership. These results point towards the benefits of using non-traditional speakers as pedagogical models, as they serve to legitimize students’ sense of their own L2-speakership, which ultimately leads them towards a better, more informed mastery of the language.

Keywords: foreign language classroom, L2 identity, L2 learning and teaching, L2 writing, sociolinguistics

Procedia PDF Downloads 97
14358 Enhancing English Language Learning through Learners Cultural Background

Authors: A. Attahiru, Rabi Abdullahi Danjuma, Fatima Bint

Abstract:

Language and culture are two concepts so closely related that one affects the other. This paper examines the definitions of language and culture by discussing the relationship between them. The paper further presents some instructional strategies for the teaching of language and culture, as well as the influence of culture on language. It also looks at the implications for language education, and finally some recommendations and conclusions are drawn.

Keywords: culture, language, relationship, strategies, teaching

Procedia PDF Downloads 373
14357 The End Justifies the Means: Using Programmed Mastery Drill to Teach Spoken English to Spanish Youngsters, without Relying on Homework

Authors: Robert Pocklington

Abstract:

Most current language courses expect students to be ‘vocational’, sacrificing their free time in order to learn. However, pupils with a full-time job, or bringing up children, hardly have a spare moment. Others just need the language as a tool or a qualification, as if it were book-keeping or a driving license. Then there are children in unstructured families whose stressful life makes private study almost impossible. And the countless parents whose evenings and weekends have become a nightmare, trying to get the children to do their homework. There are many arguments against homework being a necessity (rather than an optional extra for more ambitious or dedicated students), making a clear case for teaching methods which facilitate full learning of the key content within the classroom. A methodology which could be described as Programmed Mastery Learning has been used at Fluency Language Academy (Spain) since 1992, to teach English to over 4000 pupils yearly, with a staff of around 100 teachers, barely requiring homework. The course is structured according to the tenets of Programmed Learning: small manageable teaching steps, immediate feedback, and constant successful activity. For the Mastery component (not stopping until everyone has learned), the memorisation and practice are entrusted to flashcard-based drilling in the classroom, leading all students to progress together and develop a permanently growing knowledge base. Vocabulary and expressions are memorised using flashcards as stimuli, obliging the brain to constantly recover words from the long-term memory and converting them into reflex knowledge, before they are deployed in sentence building. The use of grammar rules is practised with ‘cue’ flashcards: the brain refers consciously to the grammar rule each time it produces a phrase until it comes easily. This automation of lexicon and correct grammar use greatly facilitates all other language and conversational activities. 
The full B2 course consists of 48 units, each of which takes a class an average of 17.5 hours to complete, allowing the vast majority of students to reach B2 level in 840 class hours, which is corroborated by an 85% pass-rate in the Cambridge University B2 exam (First Certificate). In the past, studying for qualifications was just one of many different options open to young people. Nowadays, youngsters need to stay at school and obtain qualifications in order to get any kind of job. There are many students in our classes who have little intrinsic interest in what they are studying; they just need the certificate. In these circumstances and with increasing government pressure to minimise failure, teachers can no longer think ‘If they don’t study, and fail, it’s their problem’. It is now becoming the teacher’s problem. Teachers are ever more in need of methods which make their pupils successful learners; this means assuring learning in the classroom. Furthermore, homework is arguably the main divider between successful middle-class schoolchildren and failing working-class children who drop out: if everything important is learned at school, the latter will have a much better chance, favouring inclusiveness in the language classroom.

Keywords: flashcard drilling, fluency method, mastery learning, programmed learning, teaching English as a foreign language

Procedia PDF Downloads 82
14356 Query in Grammatical Forms and Corpus Error Analysis

Authors: Katerina Florou

Abstract:

Two decades after the term "learner corpora" was coined to describe collections of texts created by foreign or second language learners across various language contexts, and some years after the suggestion to incorporate "focusing on form" within a Task-Based Learning framework, this study aims to explore how learner corpora, whether annotated with errors or not, can facilitate a focus on form in an educational setting. It has been argued that analyzing linguistic form serves the purpose of enabling students to delve into language and gain an understanding of different facets of the foreign language. This same objective is applicable when analyzing learner corpora marked with errors or in their raw state, but in this scenario, the emphasis lies on identifying incorrect forms. Teachers should aim to address errors or gaps in the students' second language knowledge while they engage in a task. Building on this recommendation, we compared the written output of two student groups: the first group (G1) carried out the focusing on form phase by studying a specific aspect of the Italian language, namely the past participle, through examples from native speakers and grammar rules; the second group (G2) focused on form by scrutinizing their own errors and comparing them with analogous examples from a native speaker corpus. In order to test our hypothesis, we created four learner corpora. The initial two were generated during the task phase, with one representing each group of students, while the remaining two were produced as a follow-up activity at the end of the lesson. The results of the first comparison indicated that students' exposure to their own errors can enhance their grasp of a grammatical element. The study is in its second stage and more results are to be announced.

Keywords: Corpus interlanguage analysis, task based learning, Italian language as F1, learner corpora

Procedia PDF Downloads 18
14355 Cross-Dialect Sentence Transformation: A Comparative Analysis of Language Models for Adapting Sentences to British English

Authors: Shashwat Mookherjee, Shruti Dutta

Abstract:

This study explores linguistic distinctions among American, Indian, and Irish English dialects and assesses various Language Models (LLMs) in their ability to generate British English translations from these dialects. Using cosine similarity analysis, the study measures the linguistic proximity between original British English translations and those produced by LLMs for each dialect. The findings reveal that Indian and Irish English translations maintain notably high similarity scores, suggesting strong linguistic alignment with British English. In contrast, American English exhibits slightly lower similarity, reflecting its distinct linguistic traits. Additionally, the choice of LLM significantly impacts translation quality, with Llama-2-70b consistently demonstrating superior performance. The study underscores the importance of selecting the right model for dialect translation, emphasizing the role of linguistic expertise and contextual understanding in achieving accurate translations.
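
To make the cosine similarity measure concrete, here is a minimal sketch that compares two sentences using bag-of-words counts. The abstract does not say how sentences were vectorized, so the word-count representation, the `cosine_similarity` helper, and the sample sentences are assumptions made for illustration; a study like this would more plausibly use learned sentence embeddings.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the word-count vectors of two sentences."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Dot product over the words the two sentences share.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence has no direction to compare
    return dot / (norm_a * norm_b)


# Made-up sentence pair: a reference rendering and a model output.
reference = "the colour of the lift"
candidate = "the color of the elevator"
score = cosine_similarity(reference, candidate)
print(round(score, 3))
```

A score of 1.0 means the two count vectors point in the same direction, and 0.0 means the sentences share no words. With counts, spelling variants such as "colour" and "color" are treated as unrelated tokens, which is one reason embedding-based vectors are usually preferred for dialect comparison.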

Keywords: cross-dialect translation, language models, linguistic similarity, multilingual NLP

Procedia PDF Downloads 17
14354 Aspects of Diglossia in Arabic Language Learning

Authors: Adil Ishag

Abstract:

Diglossia emerges in a situation where two distinct varieties of a language are used alongside each other within a certain community. In this case, one is considered a high or standard variety and the other a low or colloquial variety. Arabic is an extreme example of a highly diglossic language. This diglossia is due to the fact that Arabic is one of the most widely spoken languages, spread over 22 countries on two continents as a mother tongue, and is also widely spoken in many other Islamic countries as a second language or simply as the language of the Quran. The geographical variation between the countries where the language is spoken, together with the duality of classical Arabic and the daily spoken dialects in the Arab world, makes Arabic one of the most diglossic languages. This paper investigates this phenomenon and its relation to learning Arabic as a first and second language.

Keywords: Arabic language, diglossia, first and second language, language learning

Procedia PDF Downloads 524
14353 Learning Spanish as a Second Language: Using Infinitives as Verbal Complements

Authors: Jiyoung Yoon

Abstract:

This study examines Spanish textbook explanations of infinitival complements and how they can affect a learner’s second-language acquisition process. Verbs taking infinitival complements are commonly found in the mandate, volition, and emotion verbs, both for Spanish and English. However, while some English verbs take gerunds (María avoids eating/*to eat meat), in Spanish a gerund never functions as the complement of a verb (María evita comer/*comiendo carne). Because of these differences, English learners of Spanish often have difficulty acquiring infinitival complement constructions in Spanish. Specifically, they may employ English-like complement structures, producing such ungrammatical utterances as *Odio comiendo tacos ‘I hate eating tacos.' A compounding factor is that many Spanish textbooks do not emphasize the usages of infinitival complements and, when explanations are provided, they are often vague and insufficient. This study examines Spanish textbook explanations of infinitival complements (intermediate and advanced college-level Spanish textbooks and grammar reference books published in the United States) to determine areas that are problematic and insufficient and how they can affect learners’ second-language acquisition process. In this study, alternative principle-driven explanations are proposed as a replacement.

Keywords: Spanish, teaching, second language, infinitival complement, textbook

Procedia PDF Downloads 330
14352 Evolution of Classroom Languaging over the Years: Prospects for Teaching Mathematics Differently

Authors: Jabulani Sibanda, Clemence Chikiwa

Abstract:

This paper traces diverse language practices representative of equally diverse conceptions of language. To be dynamic with languaging practices, one needs to appreciate nuanced languaging practices, their challenges, prospects, and opportunities. The paper presents what we envision as three major conceptions of language that give impetus to diverse language practices. It examines theoretical models of the bilingual mental lexicon and how they inform language practices. The paper explores classroom languaging practices that have been promulgated and experimented with. The paper advocates the deployment of multisensory semiotic systems to complement linguistic classroom communication and the acknowledgement of learners’ linguistic and semiotic resources as valid in the learning enterprise. It recommends the enactment of specific clauses on language in education policies and curriculum documents that empower classroom interactants to exercise discretion in languaging practices.

Keywords: languaging, monolingual, multilingual, semiotic and linguistic repertoire

Procedia PDF Downloads 37
14351 Equivalences and Contrasts in the Morphological Formation of Echo Words in Two Indo-Aryan Languages: Bengali and Odia

Authors: Subhanan Mandal, Bidisha Hore

Abstract:

The linguistic process whereby repetition of all or part of the base word with or without internal change before or after the base itself takes place is regarded as reduplication. The reduplicated morphological construction annotates with itself a new grammatical category and meaning. Reduplication is a very frequent and abundant phenomenon in the eastern Indian languages from the states of West Bengal and Odisha, i.e. Bengali and Odia respectively. Bengali, an Indo-Aryan language and a part of the Indo-European language family is one of the largest spoken languages in India and is the national language of Bangladesh. Despite this classification, Bengali has certain influences in terms of vocabulary and grammar due to its geographical proximity to Tibeto-Burman and Austro-Asiatic language speaking communities. Bengali along with Odia belonged to a single linguistic branch. But with time and gradual linguistic changes due to various factors, Odia was the first to break away and develop as a separate distinct language. However, less of contrasts and more of similarities still exist among these languages along the line of linguistics, leaving apart the script. This paper deals with the procedure of echo word formations in Bengali and Odia. The morphological research of the two languages concerning the field of reduplication reveals several linguistic processes. The revelation is based on the information elicited from native language speakers and also on the analysis of echo words found in discourse and conversational patterns. For the purpose of partial reduplication analysis, prefixed class and suffixed class word formations are taken into consideration which show specific rule based changes. For example, in suffixed class categorization, both consonant and vowel alterations are found, following the rules: i) CVx à tVX, ii) CVCV à CVCi. 
Further classifications were also found in sentential studies of both languages, which revealed the complexities of complete reduplication in forming echo words where the head word loses its original meaning. Complexities based on onomatopoetic/phonetic imitation of natural phenomena, rather than on any rule-based occurrence, were also found. Taking into consideration these aspects, which are very prevalent in both languages, the study draws inferences that bring out many similarities between the two languages in this area, in spite of their branching away from each other several years ago.
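The two suffixed-class rules quoted in the abstract, (i) CVX → tVX and (ii) CVCV → CVCi, can be illustrated with a minimal Python sketch. The function names, the five-letter vowel set, and the one-letter-per-sound romanization are assumptions made for illustration; the sample inputs are schematic strings chosen for their shape, not attested data from the paper.

```python
# Assumption: a simplified romanization where each letter is one segment.
VOWELS = set("aeiou")


def shape(base: str) -> str:
    """Return the consonant/vowel skeleton of a base, e.g. 'kata' -> 'CVCV'."""
    return "".join("V" if ch in VOWELS else "C" for ch in base.lower())


def echo_initial_t(base: str) -> str:
    """Rule (i), CVX -> tVX: replace the initial consonant with 't'."""
    if not shape(base).startswith("CV"):
        raise ValueError("rule (i) expects a base that begins with CV")
    return "t" + base[1:]


def echo_final_i(base: str) -> str:
    """Rule (ii), CVCV -> CVCi: replace the final vowel with 'i'."""
    if shape(base) != "CVCV":
        raise ValueError("rule (ii) expects a CVCV base")
    return base[:-1] + "i"


def make_echo(base: str, rule) -> str:
    """Join the base and its echo form with a hyphen."""
    return f"{base}-{rule(base)}"


if __name__ == "__main__":
    # Schematic inputs only; real analysis would work on phonemic transcriptions.
    print(make_echo("jol", echo_initial_t))
    print(make_echo("kata", echo_final_i))
```

A real treatment would need a phonemic representation (digraphs, retroflex consonants, vowel length) rather than plain letters, which is why the shape check is kept separate from the rules.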

Keywords: consonant alteration, onomatopoetic, partial reduplication and complete reduplication, reduplication, vowel alteration

Procedia PDF Downloads 213
14350 Exploring the Potential of Replika: An AI Chatbot for Mental Health Support

Authors: Nashwah Alnajjar

Abstract:

This research paper provides an overview of Replika, an AI chatbot application that uses natural language processing technology to engage in conversations with users. The app was developed to provide users with a virtual AI friend who can converse with them on various topics, including mental health. This study explores the experiences of Replika users using quantitative research methodology. A survey was conducted with 12 participants to collect data on their demographics, usage patterns, and experiences with the Replika app. The results showed that Replika has the potential to play a role in mental health support and well-being.

Keywords: Replika, chatbot, mental health, artificial intelligence, natural language processing

Procedia PDF Downloads 50
14349 Evolution of Classroom Languaging in Multilingual Contexts: Challenges and Prospects

Authors: Jabulani Sibanda, Clemence Chikiwa

Abstract:

This paper traces diverse language practices representative of equally diverse conceptions of language. To be dynamic with languaging practices, one needs to appreciate nuanced languaging practices, their challenges, prospects, and opportunities. The paper presents what we envision as three major conceptions of language that give impetus to diverse language practices. It examines theoretical models of the bilingual mental lexicon and how they inform language practices. The paper explores classroom languaging practices that have been promulgated and experimented with. The paper advocates the deployment of multisensory semiotic systems to complement linguistic classroom communication and the acknowledgement of learners’ linguistic and semiotic resources as valid in the learning enterprise. It recommends the enactment of specific clauses on language in education policies and curriculum documents that empower classroom interactants to exercise discretion in languaging practices.

Keywords: languaging, monolingual, multilingual, semiotic and linguistic repertoire

Procedia PDF Downloads 30
14348 The Queer Language: A Case Study of the Hyderabadi Queers

Authors: Sreerakuvandana Vandana

Abstract:

Although the term third gender is relatively new, the language in use has already made its way into the concept of identity. With the community's wide recognition and its openness in expressing its identity without a tint of embarrassment, it is essential to take into account the ideas of “identity” and “language”. The community, however, picks up language as a tool to assert its presence in the “mainstream”, albeit through contradictory practices. The paper is an attempt to see how Koti claims and tries to be a language just like any other. It also identifies how the community wants to be identified as a unique group yet wants to remain grounded in the “mainstream”. The work is an attempt to bring out the secret language of the LGBT community and understand its desire to be recognized as “mainstream”. The paper also attempts to bring this language to light and see whether it qualifies as a language at all.

Keywords: identity, language, queer, transgender

Procedia PDF Downloads 502
14347 Testing Chat-GPT: An AI Application

Authors: Jana Ismail, Layla Fallatah, Maha Alshmaisi

Abstract:

ChatGPT, a cutting-edge language model built on the GPT-3.5 architecture, has garnered attention for its profound natural language processing capabilities, holding promise for transformative applications in customer service and content creation. This study delves into ChatGPT's architecture, aiming to comprehensively understand its strengths and potential limitations. Through systematic experiments across diverse domains, such as general knowledge and creative writing, we evaluated the model's coherence, context retention, and task-specific accuracy. While ChatGPT excels in generating human-like responses and demonstrates adaptability, occasional inaccuracies and sensitivity to input phrasing were observed. The study emphasizes the impact of prompt design on output quality, providing valuable insights for the nuanced deployment of ChatGPT in conversational AI and contributing to the ongoing discourse on the evolving landscape of natural language processing in artificial intelligence.

Keywords: artificial intelligence, ChatGPT, OpenAI, NLP

Procedia PDF Downloads 32
14346 A Grey-Box Text Attack Framework Using Explainable AI

Authors: Esther Chiramal, Kelvin Soh Boon Kai

Abstract:

Explainable AI is a strong strategy for understanding complex black-box model predictions in human-interpretable language. It provides the evidence required to deploy trustworthy and reliable AI systems. On the other hand, it also opens the door to locating possible vulnerabilities in an AI model. Traditional adversarial text attacks use word substitution, data augmentation techniques, and gradient-based attacks on powerful pre-trained Bidirectional Encoder Representations from Transformers (BERT) variants to generate adversarial sentences. These attacks are generally white-box in nature and not practical, as they can be easily detected by humans, e.g., changing the word from “Poor” to “Rich”. We propose a simple yet effective Grey-box cum Black-box approach that does not require knowledge of the model while using a set of surrogate Transformer/BERT models to perform the attack using Explainable AI techniques. As Transformers are the current state-of-the-art models for almost all Natural Language Processing (NLP) tasks, an attack generated from BERT1 is transferable to BERT2. This transferability is made possible by the attention mechanism in the transformer, which allows the model to capture long-range dependencies in a sequence. Using the power of BERT generalisation via attention, we attempt to exploit how transformers learn by attacking a few surrogate transformer variants, each based on a different architecture. We demonstrate that this approach is highly effective at generating semantically good sentences by changing as little as one word, in a way that is not detectable by humans while still fooling other BERT models.

Keywords: BERT, explainable AI, Grey-box text attack, transformer

Procedia PDF Downloads 107
14345 Reemergence of Behaviorism in Language Teaching

Authors: Hamid Gholami

Abstract:

Over the years, language teaching methods have been offshoots of schools of thought in psychology. The methods were mainly influenced by their contemporary psychological approaches: Audiolingualism was based on behaviorism and Communicative Language Teaching on constructivism. In the 1950s, textbooks were full of repetition exercises, which were encouraged by behaviorism. In the 1980s, they were filled with communicative exercises, as suggested by constructivism. The trend has continued to the present day, in which no specific method is prevalent, since none of the schools of thought seems to capture the complexity of human learning. But some changes are notable: some textbooks are giving more and more space to repetition exercises, at least to enhance some aspects of language proficiency, namely collocations, rhythm and intonation, and conversation models. These changes may mark the reemergence of one of the once widely accepted schools of thought in psychology: behaviorism.

Keywords: language teaching methods, psychology, schools of thought, Behaviorism

Procedia PDF Downloads 511
14344 2L1, a Bridge between L1 and L2

Authors: Elena Ginghina

Abstract:

There are two major categories of language acquisition, first and second language acquisition, which differ in their learning process and in their ultimate attainment. However, in the case of a bilingual child, one of the languages he grows up with gradually takes on the features of a second language. This phenomenon characterizes successive first language acquisition, in which the initial state of the child is already marked by another language. Nevertheless, the dominance of the languages can change throughout life if the exposure to the language and the quality of the input are better in 2L1. With regard to exposure and quality of input, there are cases even in simultaneous bilingualism where the two languages, although learned from birth on, differ from one another at some point. This paper aims to examine what makes a 2L1 become a second language and under what circumstances an L2 learner can reach a native or near-native speaker level.

Keywords: bilingualism, first language acquisition, native speakers of German, second language acquisition

Procedia PDF Downloads 537
14343 A Supervised Approach for Word Sense Disambiguation Based on Arabic Diacritics

Authors: Alaa Alrakaf, Sk. Md. Mizanur Rahman

Abstract:

Over the last two decades, Arabic natural language processing (ANLP) has become increasingly important. One of the key issues in ANLP is ambiguity. In Arabic, different pronunciations of one written word may have different meanings. Furthermore, ambiguity also affects the effectiveness and efficiency of Machine Translation (MT). The issue of ambiguity has limited the usefulness and accuracy of translation from Arabic to English. The lack of Arabic resources makes the ambiguity problem more complicated. Additionally, the orthographic level of representation cannot specify the exact meaning of a word. This paper looks at the diacritics of the Arabic language and uses them to disambiguate a word. The proposed approach to word sense disambiguation uses a Diacritizer application to diacritize Arabic text and then finds the most accurate sense of an ambiguous word using a Naïve Bayes classifier. Our experimental study shows that using Arabic diacritics with a Naïve Bayes classifier enhances the accuracy of choosing the appropriate sense by 23% and also decreases the ambiguity in machine translation.
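As a rough illustration of the classification step described in the abstract, the following Python sketch trains a multinomial Naïve Bayes classifier with add-one smoothing over a bag of tokens that includes the diacritized target word. It is not the authors' implementation: the class name, the feature design, and the training examples are all assumptions, and the tokens are invented ASCII placeholders standing in for diacritized forms and their context words.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesWSD:
    """Multinomial Naive Bayes over token features with additive smoothing."""

    def __init__(self, alpha: float = 1.0):
        self.alpha = alpha                       # smoothing constant
        self.sense_counts = Counter()            # training examples per sense
        self.feature_counts = defaultdict(Counter)  # token counts per sense
        self.vocab = set()

    def fit(self, examples):
        for features, sense in examples:
            self.sense_counts[sense] += 1
            for token in features:
                self.feature_counts[sense][token] += 1
                self.vocab.add(token)
        return self

    def predict(self, features):
        total = sum(self.sense_counts.values())
        best_sense, best_score = None, -math.inf
        for sense, count in self.sense_counts.items():
            counts = self.feature_counts[sense]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # log prior + sum of smoothed log likelihoods
            score = math.log(count / total)
            for token in features:
                score += math.log((counts[token] + self.alpha) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Invented toy data: the first token stands in for the diacritized target word,
# the remaining tokens for English glosses of its context words.
train = [
    (["ilm", "student", "book"], "knowledge"),
    (["ilm", "teacher", "study"], "knowledge"),
    (["alam", "country", "raise"], "flag"),
    (["alam", "wave", "national"], "flag"),
]
model = NaiveBayesWSD().fit(train)

if __name__ == "__main__":
    print(model.predict(["alam", "country"]))
    print(model.predict(["ilm", "book"]))
```

Including the diacritized form as a feature is what lets the classifier separate senses that share one undiacritized spelling; without it, only the context words would carry signal.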

Keywords: Arabic natural language processing, machine learning, machine translation, Naive bayes classifier, word sense disambiguation

Procedia PDF Downloads 324
14342 Developing Language Ownership: An Autoethnographic Perspective on Transformative Learning

Authors: Thomas Abbey

Abstract:

This paper is part of ongoing research addressing the experience of language learners in developing a sense of language ownership in their second language. For the majority of language learners, the main goal of learning a second or foreign language is to develop proficiency in the target language. Language proficiency comprises numerous intersecting competency skills ranging from casually listening to speaking using certain registers. This autoethnography analyzes lived experiences related to transitioning from learning a language in a classroom to being in an environment where the researcher's second language is the primary means of communication. Focused on lived experiences, the purpose of this research is to provide insight into the experiences of language learners entering new environments and needing to navigate life within another language. Through reflections, this paper offers a critical account of the experience of traveling to Baku, Azerbaijan as a Russian language learner. The analysis for this paper focuses on the development of a sense of language ownership.

Keywords: autoethnography, language learning, language ownership, transformative learning

Procedia PDF Downloads 26
14341 Evaluating the Role of Multisensory Elements in Foreign Language Acquisition

Authors: Sari Myréen

Abstract:

The aim of this study was to evaluate the role of multisensory elements in enhancing and facilitating foreign language acquisition among adult students in a language classroom. The use of multisensory elements enables the creation of a student-centered classroom, where the focus is on individual learner’s language learning process, perceptions and motivation. Multisensory language learning is a pedagogical approach where the language learner uses all the senses more effectively than in a traditional in-class environment. Language learning is facilitated due to multisensory stimuli which increase the number of cognitive connections in the learner and take into consideration different types of learners. A living lab called Multisensory Space creates a relaxed and receptive state in the learners through various multisensory stimuli, and thus promotes their natural foreign language acquisition. Qualitative and quantitative data were collected in two questionnaire inquiries among the Finnish students of a higher education institute at the end of their basic French courses in December 2014 and 2016. The inquiries discussed the effects of multisensory elements on the students’ motivation to study French as well as their learning outcomes. The results show that the French classes in the Multisensory Space provide the students with an encouraging and pleasant learning environment, which has a positive impact on their motivation to study the foreign language as well as their language learning outcomes.

Keywords: foreign language acquisition, pedagogical approach, multisensory learning, transcultural learning

Procedia PDF Downloads 352
14340 Profiling Risky Code Using Machine Learning

Authors: Zunaira Zaman, David Bohannon

Abstract:

This study explores the application of machine learning (ML) for detecting security vulnerabilities in source code. The research aims to assist organizations with large application portfolios and limited security testing capabilities in prioritizing security activities. ML-based approaches offer benefits such as increased confidence scores, false positives and negatives tuning, and automated feedback. The initial approach using natural language processing techniques to extract features achieved 86% accuracy during the training phase but suffered from overfitting and performed poorly on unseen datasets during testing. To address these issues, the study proposes using the abstract syntax tree (AST) for Java and C++ codebases to capture code semantics and structure and generate path-context representations for each function. The Code2Vec model architecture is used to learn distributed representations of source code snippets for training a machine-learning classifier for vulnerability prediction. The study evaluates the performance of the proposed methodology using two datasets and compares the results with existing approaches. The Devign dataset yielded 60% accuracy in predicting vulnerable code snippets and helped resist overfitting, while the Juliet Test Suite predicted specific vulnerabilities such as OS-Command Injection, Cryptographic, and Cross-Site Scripting vulnerabilities. The Code2Vec model achieved 75% accuracy and a 98% recall rate in predicting OS-Command Injection vulnerabilities. The study concludes that even partial AST representations of source code can be useful for vulnerability prediction. The approach has the potential for automated intelligent analysis of source code, including vulnerability prediction on unseen source code. State-of-the-art models using natural language processing techniques and CNN models with ensemble modelling techniques did not generalize well on unseen data and faced overfitting issues. 
However, predicting vulnerabilities in source code using machine learning poses challenges such as high dimensionality and complexity of source code, imbalanced datasets, and identifying specific types of vulnerabilities. Future work will address these challenges and expand the scope of the research.
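The path-context representation mentioned in the abstract pairs two terminal tokens of a function with the syntactic path that connects them through the AST. The sketch below shows the idea under stated assumptions: it uses Python's built-in ast module on Python source instead of the Java and C++ parsers the study refers to, and the function names, the choice of terminals, the "^" and "_" path notation, and the max_length cut-off are illustrative rather than taken from the paper or from the Code2Vec code base.

```python
import ast
from itertools import combinations


def leaf_paths(tree):
    """Collect (token, [nodes from root to terminal]) for each terminal node."""
    leaves = []

    def visit(node, ancestors):
        chain = ancestors + [node]
        if isinstance(node, ast.Name):        # variable reference
            leaves.append((node.id, chain))
        elif isinstance(node, ast.arg):       # function parameter
            leaves.append((node.arg, chain))
        elif isinstance(node, ast.Constant):  # literal value
            leaves.append((repr(node.value), chain))
        else:
            for child in ast.iter_child_nodes(node):
                visit(child, chain)

    visit(tree, [])
    return leaves


def path_contexts(source, max_length=8):
    """Return (start token, path, end token) triples for all terminal pairs."""
    contexts = []
    for (tok_a, chain_a), (tok_b, chain_b) in combinations(leaf_paths(ast.parse(source)), 2):
        # Count the shared ancestors; the last one is the lowest common ancestor.
        shared = 0
        while (shared < min(len(chain_a), len(chain_b))
               and chain_a[shared] is chain_b[shared]):
            shared += 1
        up = [type(n).__name__ for n in reversed(chain_a[shared:])]
        top = type(chain_a[shared - 1]).__name__
        down = [type(n).__name__ for n in chain_b[shared:]]
        if len(up) + 1 + len(down) > max_length:
            continue  # skip very long paths, as a simple size limit
        path = "^".join(up + [top]) + "_" + "_".join(down)
        contexts.append((tok_a, path, tok_b))
    return contexts


if __name__ == "__main__":
    snippet = "def add(a, b):\n    return a + b\n"
    for context in path_contexts(snippet):
        print(context)
```

In a full pipeline, each triple would be mapped to embedding indices and aggregated into one vector per function before being passed to a classifier; that learning step is outside the scope of this sketch.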

Keywords: code embeddings, neural networks, natural language processing, OS command injection, software security, code properties

Procedia PDF Downloads 71