Search results for: Arabic natural language processing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 12353

12173 Prediction, Production, and Comprehension: Exploring the Influence of Salience in Language Processing

Authors: Andy H. Clark

Abstract:

This research looks into the relationship between language comprehension and production, with a specific focus on the role of salience in shaping these processes. Salience, our most immediate perception of what is most probable out of all possible situations and outcomes, strongly affects our perception and action in language production and comprehension. This study investigates the impact of geographic and emotional attachments to the target language on the differences in learners’ comprehension and production abilities. Using quantitative research methods (Qualtrics, SPSS), this study examines the preferential choices of two groups of Japanese learners of English: those residing in the United States and those in Japan. By comparing and contrasting these two groups, we hope to gain a better understanding of how the salience of linguistic cues influences language processing.

Keywords: intercultural pragmatics, salience, production, comprehension, pragmatics, action, perception, cognition

Procedia PDF Downloads 75
12172 Learner's Difficulties Acquiring English: The Case of Native Speakers of Rio de La Plata Spanish Towards Justifying the Need for Corpora

Authors: Maria Zinnia Bardas Hoffmann

Abstract:

Contrastive Analysis (CA) is the systematic comparison of two languages. It stems from the notion that errors are caused by interference of the L1 system in the acquisition of an L2. CA is a useful tool for understanding the nature of learning and acquisition. This method also promises a path to understanding the nature of the underlying cognitive processes, even when other factors, such as intrinsic motivation and teaching strategies, were found to best explain students’ problems in acquisition. The study of CA is justified not only by the need to gain a deeper understanding of the nature of SLA, but also as an invaluable source of clues, at a cognitive level, to the general processes involved in rule formation and abstract thought. It is relevant for cross-disciplinary studies and for the fields of Computational Thought, Natural Language Processing, Applied Linguistics, Cognitive Linguistics and Math Theory. That said, this paper also addresses its own constraints and limitations. Finally, this paper (a) aims to identify some of the difficulties students may encounter in their learning process due to the nature of their specific variety of L1, Rio de la Plata Spanish (RPS), and (b) represents an attempt to discuss the need for specific models to approach CA.

Keywords: second language acquisition, applied linguistics, contrastive analysis, applied contrastive analysis English language department, meta-linguistic rules, cross-linguistics studies, computational thought, natural language processing

Procedia PDF Downloads 151
12171 Using Mining Methods of WEKA to Predict Quran Verb Tense and Aspect in Translations from Arabic to English: Experimental Results and Analysis

Authors: Jawharah Alasmari

Abstract:

In verb inflection, tense marks past, present, or future action, and aspect marks progressive (continuous) or perfect (completed) actions. The usage and meaning of tense and aspect differ between Arabic and English. In this research, we applied data mining methods to test the predictive function of candidate features, using our dataset of Arabic verbs in context and their 7 translations. Weka machine learning classifiers are used in this experiment to examine the key features that can guide a translator toward an appropriate English translation of Arabic verb tense and aspect.

Keywords: Arabic verb, English translations, mining methods, Weka software

Procedia PDF Downloads 272
12170 Symmetric Arabic Language Encryption Technique Based on Modified Playfair Algorithm

Authors: Fairouz Beggas

Abstract:

Due to the large number of exchanges over networks, the security of communications is essential. Most ways of keeping communication secure rely on encryption. In this work, a symmetric encryption technique is proposed to encrypt and decrypt simple Arabic scripts based on multi-level security. The proposed technique uses the idea of Playfair encryption with a larger table size and an additional layer of encryption to ensure more security. The proposed algorithm generates a dynamic table that depends on a secret key. The same secret key is also used to create other secret keys to over-encrypt the plaintext in three steps. The obtained results show that the proposed algorithm is faster in terms of encryption/decryption speed and can resist many types of attacks.
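
As an illustration only, the idea of a key-dependent Playfair table of larger size can be sketched as follows. This is the classical Playfair scheme applied to a 6×6 table, not the authors' modified algorithm: the 36-letter alphabet (Unicode Arabic letters U+0621 to U+063A and U+0641 to U+064A), the table width, and the function names `build_table` and `playfair` are all assumptions made for this sketch. The additional over-encryption layer with derived keys is not covered, and digraph preparation (padding odd-length text, splitting doubled letters) is omitted.

```python
# Assumed alphabet: 36 Arabic letters taken from two Unicode ranges (tatweel U+0640 excluded).
ALPHABET = "".join(chr(c) for c in list(range(0x0621, 0x063B)) + list(range(0x0641, 0x064B)))


def build_table(key, alphabet=ALPHABET, width=6):
    """Build a key-dependent table: key letters first (deduplicated), then the rest."""
    assert len(alphabet) % width == 0, "alphabet must fill the table exactly"
    ordered = []
    for ch in key + alphabet:
        if ch in alphabet and ch not in ordered:
            ordered.append(ch)
    return [ordered[i:i + width] for i in range(0, len(ordered), width)]


def playfair(text, table, decrypt=False):
    """Apply the classical Playfair digraph rules; text must have even length."""
    height, width = len(table), len(table[0])
    pos = {ch: (r, c) for r, row in enumerate(table) for c, ch in enumerate(row)}
    shift = -1 if decrypt else 1
    out = []
    for i in range(0, len(text), 2):
        (ra, ca), (rb, cb) = pos[text[i]], pos[text[i + 1]]
        if ra == rb:    # same row: move along the row
            out += [table[ra][(ca + shift) % width], table[rb][(cb + shift) % width]]
        elif ca == cb:  # same column: move along the column
            out += [table[(ra + shift) % height][ca], table[(rb + shift) % height][cb]]
        else:           # rectangle: swap the column indices
            out += [table[ra][cb], table[rb][ca]]
    return "".join(out)


key = "\u0633\u0644\u0627\u0645"        # hypothetical secret key ("salam")
plaintext = "\u0643\u062a\u0627\u0628"  # sample plaintext ("kitab"), even length
table = build_table(key)
ciphertext = playfair(plaintext, table)
recovered = playfair(ciphertext, table, decrypt=True)
```

Because the table is rebuilt from the key, a different key yields a different letter layout, which is the sense in which the table is dynamic.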

Keywords: arabic data, encryption, playfair, symmetric algorithm

Procedia PDF Downloads 90
12169 An Event-Related Potentials Study on the Processing of English Subjunctive Mood by Chinese ESL Learners

Authors: Yan Huang

Abstract:

The event-related potentials (ERP) technique helps researchers take continuous measures over the whole process of language comprehension, with excellent temporal resolution at the level of milliseconds. Research on sentence processing has developed from the behavioral level to the neuropsychological level, which has brought about a variety of sentence processing theories and models. However, the applicability of these models to L2 learners is still under debate. Therefore, the present study aims to investigate the neural mechanisms underlying English subjunctive mood processing by Chinese ESL learners. To this end, English subject clauses with subjunctive moods are used as the stimuli, all of which follow the same syntactic structure, “It is + adjective + that … + (should) do + …” In addition, in order to examine the role that language proficiency plays in L2 processing, this research deals with two groups of Chinese ESL learners (18 males and 22 females, mean age = 21.68), namely, a high proficiency group (Group H) and a low proficiency group (Group L). The behavioral and neurophysiological data analysis reveals the following findings: 1) Syntax and semantics interact with each other in the second phase (300-500 ms) of sentence processing, which is partially in line with the Three-phase Sentence Model; 2) Language proficiency does affect L2 processing. Specifically, for Group H, syntactic processing plays the dominant role in sentence processing, while for Group L, semantic processing also affects syntactic parsing during the third phase of sentence processing (500-700 ms). Moreover, Group H, compared to Group L, demonstrates a richer native-like ERP pattern, which further demonstrates the role of language proficiency in L2 processing. Based on the research findings, this paper also offers some insights for L2 pedagogy as well as L2 proficiency assessment.

Keywords: Chinese ESL learners, English subjunctive mood, ERPs, L2 processing

Procedia PDF Downloads 131
12168 Unsupervised Part-of-Speech Tagging for Amharic Using K-Means Clustering

Authors: Zelalem Fantahun

Abstract:

Part-of-speech tagging is the process of assigning a part-of-speech or other lexical class marker to each word in naturally occurring text. Part-of-speech tagging is the most fundamental task in almost all natural language processing. In natural language processing, the problem of providing large amounts of manually annotated data is a knowledge acquisition bottleneck. Since Amharic is an under-resourced language, the availability of tagged corpora is a bottleneck for natural language processing, especially for POS tagging. A promising direction for tackling this problem is to provide a system that does not require manually tagged data. In unsupervised learning, the learner is not provided with classifications. Unsupervised algorithms seek out similarity between pieces of data in order to determine whether they can be characterized as forming a group. This paper describes the development of an unsupervised part-of-speech tagger using K-Means clustering for the Amharic language, since a large amount of data is produced in day-to-day activities. In the development of the tagger, the following procedure is followed. First, the unlabeled data (raw text) is divided into 10 folds and tokenized; at this stage, the raw text is split into sentences and then into words. The second phase is feature extraction, which covers word frequency and the syntactic and morphological features of a word. The third phase is clustering. Among the different clustering algorithms, K-means is selected and implemented in this study to bring groups of similar words together. The fourth phase is mapping, in which each cluster is examined carefully and the most common tag is assigned to the group. This study identifies two features that are capable of distinguishing one part-of-speech from others, namely morphological features and positional information, and shows that it is possible to use unsupervised learning for Amharic POS tagging.
In order to increase the performance of the unsupervised part-of-speech tagger, there is a need to incorporate other features that are not included in this study, such as semantic information. Finally, based on the experimental results, the system achieves a maximum accuracy of 81%.
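
The clustering and mapping phases described above can be sketched roughly as follows. This is a minimal illustration, not the authors' tagger: the two numeric features per word, the toy data, the small hand-tagged sample used for the mapping step, and the names `kmeans` and `map_clusters_to_tags` are all assumptions introduced here.

```python
from collections import Counter


def kmeans(points, k, iters=20):
    """Plain k-means; centroids start at the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(range(k), key=lambda j: sum((x - c) ** 2 for x, c in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def map_clusters_to_tags(assign, sample_tags):
    """Mapping phase: give each cluster the most common tag among its members."""
    counts = {}
    for cluster, tag in zip(assign, sample_tags):
        counts.setdefault(cluster, Counter())[tag] += 1
    return {cluster: c.most_common(1)[0][0] for cluster, c in counts.items()}


# Hypothetical feature vectors: (morphological cue, positional cue) per word.
points = [(0.0, 1.0), (1.0, 0.0), (0.1, 0.9), (0.9, 0.1), (0.0, 0.8), (1.0, 0.2)]
sample_tags = ["N", "V", "N", "V", "N", "V"]  # hand-tagged sample, for mapping only

clusters = kmeans(points, k=2)
mapping = map_clusters_to_tags(clusters, sample_tags)
predicted = [mapping[c] for c in clusters]
accuracy = sum(p == g for p, g in zip(predicted, sample_tags)) / len(sample_tags)
```

The clustering itself never sees the tags; they are used only afterwards to name each cluster, which is what keeps the approach unsupervised.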

Keywords: POS tagging, Amharic, unsupervised learning, k-means

Procedia PDF Downloads 452
12167 The Audio-Visual and Syntactic Priming Effect on Specific Language Impairment and Gender in Modern Standard Arabic

Authors: Mohammad Al-Dawoody

Abstract:

This study aims to explore whether priming is affected by gender in Modern Standard Arabic and whether it is restricted solely to subjects with no specific language impairment (SLI). The sample consists of 74 subjects, between the ages of 11;1 and 11;10, distributed into (a) 2 SLI experimental groups of 38 subjects divided into two gender groups of 18 females and 20 males and (b) 2 non-SLI control groups of 36 subjects divided into two gender groups of 17 females and 19 males. Employing a mixed research design, the researcher conducted this study within the framework of relevance theory (RT), whose main assumption is that human beings are endowed with a biological ability to magnify the relevance of incoming stimuli. Each of the four groups was given two different priming stimuli: audio-visual priming (T1) and syntactic priming (T2). The results showed that the priming effect was clearly distinct among SLI participants, especially when retrieving typical responses (TR) in T1 and T2, with a slight superiority of males over females. The results also revealed that non-SLI females showed stronger original response (OR) priming in T1 than males and that non-SLI males outperformed females in OR priming in T2. Furthermore, the results suggested that audio-visual priming has a stronger effect on SLI females than on non-SLI females and that syntactic priming seems to have the same effect on the two groups (non-SLI and SLI females). The conclusion is that the priming effect varies according to gender and is not confined merely to non-SLI subjects.

Keywords: specific language impairment, relevance theory, audio-visual priming, syntactic priming, modern standard Arabic

Procedia PDF Downloads 177
12166 The Impact of Recurring Events in Fake News Detection

Authors: Ali Raza, Shafiq Ur Rehman Khan, Raja Sher Afgun Usmani, Asif Raza, Basit Umair

Abstract:

Detection of fake news and misinformation is gaining popularity, especially after the advancement of social media and online news platforms. Social media platforms are the main and fastest source of fake news propagation, whereas online news websites contribute to fake news dissemination. In this study, we propose a framework to detect fake news using the temporal features of text, and we consider user feedback to identify whether the news is fake or not. In recent studies, the temporal features of text documents and user feedback have gained valuable consideration in Natural Language Processing, yet these studies only try to classify the textual data as fake or true. This research article indicates the impact of recurring and non-recurring events on fake and true news. We use two models, BERT and Bi-LSTM, in our investigation and conclude that BERT gives better results, and that 70% of true news items are recurring while the remaining 30% are non-recurring.

Keywords: natural language processing, fake news detection, machine learning, Bi-LSTM

Procedia PDF Downloads 25
12165 Exploring Motivation and Attitude to Second Language Learning in Ugandan Secondary Schools

Authors: Nanyonjo Juliet

Abstract:

Across Sub-Saharan Africa, it is increasingly becoming a necessity for parents or governments to encourage learners, particularly those attending high schools, to study a second or foreign language other than the “official language” or the language of instruction in schools. The major second or foreign languages under consideration include, but are not necessarily limited to, English, French, German, Arabic, Swahili/Kiswahili, Spanish and Chinese. The benefits of learning a second (foreign) language in the globalized world should not be underestimated. Amongst others, these benefits have been said to include opportunities related to traveling, studying abroad and widening one’s career prospects. Research has also revealed that beyond these non-cognitive rewards, learning a second language enables learners to become more thoughtful, considerate and confident, make better decisions, keep their brains healthier and, generally speaking, broaden their world views. The key to a successful second-language learning process delivered by a professionally qualified teacher lies in motivation. We strongly believe that the psychology involved in teaching a foreign language is of paramount importance to a learner’s successful learning experience. The aim of this paper, therefore, is to explore and show the importance of motivation in the teaching and learning of a given second (foreign) language in local Ugandan high schools.

Keywords: second language, foreign language, language learning, language teaching, official language, language of instruction, globalized world, cognitive rewards, non-cognitive rewards, learning process, motivation

Procedia PDF Downloads 68
12164 Rendering Religious References in English: Naguib Mahfouz in the Arabic as a Foreign Language Classroom

Authors: Shereen Yehia El Ezabi

Abstract:

The transition from the advanced to the superior level of Arabic proficiency is widely known to pose considerable challenges for English speaking students of Arabic as a Foreign Language (AFL). Apart from the increasing complexity of the grammar at this juncture, together with the sprawling vocabulary, to name but two of those challenges, there is also the somewhat less studied hurdle along the way to superior level proficiency, namely, the seeming opacity of many aspects of Arab/ic culture to such learners. This presentation tackles one specific dimension of such issues: religious references in literary texts. It illustrates how carefully constructed translation activities may be used to expand and deepen students’ understanding and use of them. This is shown to be vital for making the leap to the desired competency, given that such elements, as reflected in customs, traditions, institutions, worldviews, and formulaic expressions lie at the very core of Arabic culture and, as such, pervade all modes and levels of Arabic discourse. A short story from the collection “Stories from Our Alley”, by preeminent novelist Naguib Mahfouz is selected for use in this context, being particularly replete with such religious references, of which religious expressions will form the focus of the presentation. As a miniature literary work, it provides an organic whole, so to speak, within which to explore with the class the most precise denotation, as well as the subtlest connotation of each expression in an effort to reach the ‘best’ English rendering. The term ‘best’ refers to approximating the meaning in its full complexity from the source text, in this case Arabic, to the target text, English, according to the concept of equivalence in translation theory. The presentation will show how such a process generates the sort of thorough discussion and close text analysis which allows students to gain valuable insight into this central idiom of Arabic. 
A variety of translation methods will be highlighted, gleaned from the presenter’s extensive work with advanced/superior students in the Center for Arabic Study Abroad (CASA) program at the American University in Cairo. These begin with the literal rendering of expressions, with the purpose of reinforcing vocabulary learning and practicing the rules of derivational morphology as they form each word, since the larger context remains that of an AFL class, as opposed to a translation skills program. However, departures from the literal approach are subsequently explored by degrees, moving along the spectrum of functional and pragmatic freer translations in order to transmit the ‘real’ meaning in readable English to the target audience, no matter how culture- or religion-specific the expression, while remaining faithful to the original. Samples from students’ work pre- and post-discussion will be shared, demonstrating how class consensus is formed as to the final English rendering, proposed as the closest match to the Arabic, and shown to be the result of the above activities. Finally, a few examples of translation work which students have gone on to publish will be shared to corroborate the effectiveness of this teaching practice.

Keywords: superior level proficiency in Arabic as a foreign language, teaching Arabic as a foreign language, teaching idiomatic expressions, translation in foreign language teaching

Procedia PDF Downloads 199
12163 Leveraging Sentiment Analysis for Quality Improvement in Digital Healthcare Services

Authors: Naman Jain, Shaun Fernandes

Abstract:

With the increasing prevalence of online healthcare services, selecting the most suitable doctor has become a complex task, requiring careful consideration of both public sentiment and personal preferences. This paper proposes a sentiment analysis-driven method that integrates public reviews with user-specific criteria and correlated attributes to recommend online doctors. By leveraging Natural Language Processing (NLP) techniques, public sentiment is extracted from online reviews, which is then combined with user-defined preferences such as specialty, years of experience, location, and consultation fees. Additionally, correlated attributes like education and certifications are incorporated to enhance the recommendation accuracy. Experimental results demonstrate that the proposed system significantly improves user satisfaction by providing personalized doctor recommendations that align with both public opinion and individual needs.

Keywords: sentiment analysis, online doctors, personal preferences, correlated attributes, recommendation system, healthcare, natural language processing

Procedia PDF Downloads 13
12162 Native Language Identification with Cross-Corpus Evaluation Using Social Media Data: ’Reddit’

Authors: Yasmeen Bassas, Sandra Kuebler, Allen Riddell

Abstract:

Native language identification is one of the growing subfields of natural language processing (NLP). The task of native language identification (NLI) is mainly concerned with predicting the native language of an author from their writing in a second language. In this paper, we investigate the performance of two types of features, content-based features vs. content-independent features, when they are evaluated on a different corpus (using social media data from “Reddit”). In this NLI task, the predefined models are trained on one corpus (TOEFL), and the trained models are then evaluated on different data from an external corpus (Reddit). Three classifiers are used in this task: the baseline, linear SVM, and logistic regression. Results show that content-based features are more accurate and robust than content-independent ones when tested both within the corpus and across corpora.

Keywords: NLI, NLP, content-based features, content independent features, social media corpus, ML

Procedia PDF Downloads 138
12161 “Presently”: A Personal Trainer App to Self-Train and Improve Presentation Skills

Authors: Shyam Mehraaj, Samanthi E. R. Siriwardana, Shehara A. K. G. H., Wanigasinghe N. T., Wandana R. A. K., Wedage C. V.

Abstract:

A presentation is a critical tool for conveying not just spoken information but also a wide spectrum of human emotions. The single most effective way to make a presentation successful is to practice it beforehand. Preparing for a presentation has been shown to be essential for improving emotional control, intonation and prosody, pronunciation, and vocabulary, as well as the quality of the presentation slides. As a result, practicing has become one of the most critical parts of giving a good presentation. In this research, the main focus is to analyze the audio, video, and slides of the presentation uploaded by the presenter. The proposed solution is based on Natural Language Processing and Computer Vision techniques and lets presenters rehearse a presentation beforehand using a mobile-responsive web application. The proposed system will assist in practicing the presentation beforehand by assessing the presenter’s emotions, body language, tonality, prosody, pronunciation and vocabulary, and the quality of the presentation slides. Overall, the system will give the presenter a rating and feedback on the performance so that presenters can improve their presentation skills.

Keywords: presentation, self-evaluation, natural language processing, computer vision

Procedia PDF Downloads 118
12160 Performance Evaluation of an Ontology-Based Arabic Sentiment Analysis

Authors: Salima Behdenna, Fatiha Barigou, Ghalem Belalem

Abstract:

Due to the rapid increase in the volume of Arabic opinions posted on various social media, Arabic sentiment analysis has become one of the most important areas of research. Compared to English, there is very little work on Arabic sentiment analysis, in particular on aspect-based sentiment analysis (ABSA). In ABSA, aspect extraction is the most important task. In this paper, we propose a semantic aspect-based sentiment analysis approach for standard Arabic reviews to extract explicit aspect terms and identify the polarity of the extracted aspects. The proposed approach was evaluated using the HAAD datasets. Experiments showed that the proposed approach achieved a good level of performance compared with baseline results. The F-measure was improved by 19% for the aspect term extraction task and by 55% for the aspect term polarity task.
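
For readers unfamiliar with the metric, the F-measure for aspect term extraction is commonly computed as the harmonic mean of precision and recall over the extracted terms. The sketch below assumes exact-match scoring on sets of terms; the abstract does not state the matching criterion used, and the function name `f_measure` and the example terms are hypothetical.

```python
def f_measure(predicted, gold):
    """Return (precision, recall, F1) for exact-match sets of aspect terms."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical extracted and reference aspect terms for one review set.
predicted_terms = ["plot", "characters", "price"]
gold_terms = ["plot", "characters", "style", "ending"]
precision, recall, f1 = f_measure(predicted_terms, gold_terms)
```

Here two of three predicted terms are correct and two of four reference terms are found, so precision is 2/3, recall is 1/2, and F1 is 4/7.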

Keywords: sentiment analysis, opinion mining, Arabic, aspect level, opinion, polarity

Procedia PDF Downloads 163
12159 Investigating the Use of English Arabic Codeswitching in EFL classroom Oral Discourse Case study: Middle school pupils of Ain Fekroun, Wilaya of Oum El Bouaghi Algeria

Authors: Fadila Hadjeris

Abstract:

The study aims to investigate the functions of English-Arabic code switching in the oral discourse of English as a foreign language classrooms and the extent to which they contribute to the flow of classroom interaction. It also seeks to understand the views, beliefs, and perceptions of teachers and learners towards this practice. We hypothesized that code switching is a communicative strategy which facilitates classroom interaction and that, for this reason, both teachers and learners support its use. The study draws on a key body of literature in bilingualism, second language acquisition, and classroom discourse in an attempt to provide a framework for considering the research questions. It employs a combination of qualitative and quantitative research methods which include classroom observations and questionnaires. The analysis of the recordings shows that teachers’ code switching to Arabic is not only used for academic and classroom management reasons. Rather, the data display instances in which code switching is used for social reasons. The analysis of the questionnaires indicates that teachers and pupils have different attitudes towards this phenomenon. Teachers reported deliberate switching during EFL teaching, yet the majority were against this practice. According to them, the use of the mother tongue has detrimental effects on the acquisition and practice of the target language. In contrast, pupils expressed a preference for their teachers’ code switching because it enhances and facilitates their understanding. These findings support the view that the shift to pupils’ mother tongue is a strategy which aids and facilitates the teaching and learning of the target language. This, in turn, leads to recommendations for teachers and course designers.

Keywords: bilingualism, codeswitching, classroom interaction, classroom discourse, EFL learning/ teaching, SLA

Procedia PDF Downloads 481
12158 Twitter Sentiment Analysis during the Lockdown on New-Zealand

Authors: Smah Almotiri

Abstract:

One of the most common fields of natural language processing (NLP) is sentiment analysis. The feeling implied in a text can be successfully mined for various events using sentiment analysis. Twitter is viewed as a reliable data source for sentiment analysis studies, since people have been using social media to receive and exchange different types of data on a broad scale during the COVID-19 epidemic. The processing of such data may aid in making critical decisions on how to keep the situation under control. The aim of this research is to look at how sentiment differed in a single geographic region during the lockdown at two different times. A total of 1162 tweets related to the COVID-19 pandemic lockdown were analyzed, collected using keyword hashtags (lockdown, COVID-19). The first sample of tweets ran from March 23, 2020, until April 23, 2020, and the second sample, for the following year, from March 1, 2020, until April 4, 2020. Natural language processing (NLP), which is a form of artificial intelligence, was used in this research to calculate the sentiment value of all of the tweets using the AFINN lexicon sentiment analysis method. The findings revealed that sentiment was positive at both times during the region's lockdown in the samples of this study, which are specific to the geographical area of New Zealand. This research suggests applying machine learning sentiment methods such as Crystal Feel and extending the size of the tweet sample by using multiple tweets over a longer period of time.
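
Lexicon-based scoring of the kind described above can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: the keywords mention NodeJS, whereas this sketch uses Python; the tiny word list and its scores are made up for the example (only the integer scale from -5 to +5 follows AFINN's convention); and the names `TOY_LEXICON`, `sentiment_score`, and `label` are hypothetical.

```python
import re

# Toy AFINN-style lexicon: illustrative words and scores, NOT the real AFINN list.
TOY_LEXICON = {"good": 3, "happy": 3, "safe": 1, "bad": -3, "worried": -3, "lonely": -2}


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of the words in a tweet; unknown words count as 0."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


def label(score):
    """Turn a numeric score into a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


tweets = [
    "Feeling good and safe at home",
    "Worried and lonely during #lockdown",
    "Lockdown starts today",
]
scores = [sentiment_score(t) for t in tweets]
labels = [label(s) for s in scores]
```

Averaging such scores over each sample period would give one rough way to compare sentiment between the two time windows.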

Keywords: sentiment analysis, Twitter analysis, lockdown, Covid-19, AFINN, NodeJS

Procedia PDF Downloads 191
12157 Reading in Multiple Arabic's: Effects of Diglossia and Orthography

Authors: Aula Khatteb Abu-Liel

Abstract:

The study investigated the effects of diglossia and orthography on reading in Arabic, manipulating reading in Spoken Arabic (SA) using Arabizi, in which SA is written using Latin letters on computers/phones, and in the two forms of the conventional written form, Modern Standard Arabic (MSA): vowelled (shallow) and unvowelled (deep). Seventy-seven skilled readers in 8th grade performed oral reading of single words and of narrative and expository texts, and silent reading comprehension of both genres of text. Oral reading and comprehension revealed different patterns. Single words and texts were read fastest and most accurately in unvowelled MSA, slowest and least accurately in vowelled MSA, and in between in Arabizi. Comprehension was highest for vowelled MSA. Narrative texts were better than expository texts in Arabizi, with the opposite pattern in MSA. The results suggest that the frequency of the type of text and the way in which phonology is encoded affect skilled reading.

Keywords: Arabic, Arabizi, computer mediated communication, diglossia, modern standard Arabic

Procedia PDF Downloads 164
12156 The Art of Contemporary Arabic Calligraphy in Oman: Salman Alhajri as an Example

Authors: Salman Amur Alhajri

Abstract:

Purpose: This paper explores the art of contemporary Arabic calligraphy in Oman. It explains the aesthetic features of Arabic calligraphy as a unique icon of Islamic art. This paper also explores the profile of one Omani artist, Salman Alhajri, as an example of Omani artists who have developed unique styles in this art stream. Methodology and approach: The paper is based on a theoretical study using a descriptive and case-study approach. Omani artists are fascinated by the art forms of Arabic calligraphy, which combine both spiritual meaning and aesthetic beauty. Artist Salman Alhajri is an example of a contemporary Arabic artist who uses Arabic calligraphy as the main theme in his art. Dr. Alhajri is trying to introduce the beauty of Arabic letters from a new aesthetic point of view. He also aims to create unusual visual effects that viewers can easily interact with. Even though words and phrases appear in Alhajri’s artwork, they are not conveying direct meanings: viewers can create their own meaning or expressions from them by appreciating the compositions of the artwork. Results: Arabic writing is directly related to the identity of Omani artists and their cultural background. This paper shows how the beauty of Arabic letters comes from its indefinite possibilities in designing calligraphic expressions, even within a single word, because letters can be stretched and transformed in various ways to create different compositions. Omani artists are interested in employing new media applications in this kind of practice to find new techniques for creating artwork based on Arabic writing. It is really important for all Omani artists to practice this art style because Arabic calligraphy and its flexibility introduce infinite possibilities that involve further exploration and investigation.

Keywords: Islamic art, contemporary Arabic calligraphy, new techniques, Omani artist

Procedia PDF Downloads 361
12155 Methodology for Developing an Intelligent Tutoring System Based on Marzano’s Taxonomy

Authors: Joaquin Navarro Perales, Ana Lidia Franzoni Velázquez, Francisco Cervantes Pérez

Abstract:

The Mexican educational system faces diverse challenges related to the quality and coverage of education. The development of Intelligent Tutoring Systems (ITS) may help to solve some of them by helping teachers to customize their classes according to the performance of the students in online courses. In this work, we propose the adaptation of a functional ITS based on Bloom’s taxonomy, called Sistema de Apoyo Generalizado para la Enseñanza Individualizada (SAGE), to measure students’ metacognition and their emotional response based on Marzano’s taxonomy. The students and the system will share control over progress through the course, so that students can improve their metacognitive skills. The system will not allow students to access subjects they have not yet mastered. The interaction between the system and the student will be implemented through Natural Language Processing techniques, thus avoiding the use of sensors to evaluate students’ responses. The teacher will evaluate students’ knowledge utilization, which is equivalent to the last cognitive level in Marzano’s taxonomy.

Keywords: intelligent tutoring systems, student modelling, metacognition, affective computing, natural language processing

Procedia PDF Downloads 200
12154 Learning Programming for Hearing Impaired Students via an Avatar

Authors: Nihal Esam Abuzinadah, Areej Abbas Malibari, Arwa Abdulaziz Allinjawi, Paul Krause

Abstract:

Deaf and hearing-impaired students face many obstacles throughout their education, especially when learning applied sciences such as computer programming. In addition, there are no clear signs in Arabic Sign Language that can be used to identify programming logic terms such as while, for, case, switch, etc. However, hearing disabilities should no longer be a barrier to studying, especially given the rapid growth in educational technology. In this paper, we develop an avatar-based system to teach computer programming to deaf and hearing-impaired students using Arabic Sign Language, with a new sign vocabulary that has been developed for computer programming education. The system was tested on a number of high school students, and the results showed the importance of visualization through the avatar in increasing deaf students’ comprehension of concepts.

Keywords: hearing-impaired students, isolation, self-esteem, learning difficulties

Procedia PDF Downloads 145
12153 Language Processing of Seniors with Alzheimer’s Disease: From the Perspective of Temporal Parameters

Authors: Lai Yi-Hsiu

Abstract:

The present paper aims to examine the language processing of Chinese-speaking seniors with Alzheimer’s disease (AD) from the perspective of temporal cues. Twenty healthy adults, 17 healthy seniors, and 13 seniors with AD in Taiwan participated in this study and told stories based on two sets of pictures. Nine temporal cues were extracted and analyzed. Oral productions in Mandarin Chinese were compared and discussed to examine to what extent and in what way the performance of these three groups of participants differed significantly. Results indicated that the age effects were significant in filled pauses. The dementia effects were significant in mean duration of pauses, empty pauses, filled pauses, lexical pauses, and normalized mean duration of filled pauses and lexical pauses. The findings reported in the current paper help characterize the nature of language processing in seniors with or without AD, and contribute to understanding the interactions between the AD neural mechanism and their temporal parameters.

Keywords: language processing, Alzheimer’s disease, Mandarin Chinese, temporal cues

Procedia PDF Downloads 447
12152 Probing Syntax Information in Word Representations with Deep Metric Learning

Authors: Bowen Ding, Yihao Kuang

Abstract:

In recent years, with the development of large-scale pre-trained language models, building vector representations of text through deep neural network models has become standard practice for natural language processing tasks. Performance on downstream tasks shows that the text representations constructed by these models contain linguistic information, but how and to what extent this information is encoded is unclear. In this work, a structural probe is proposed to detect whether the vector representation produced by a deep neural network embeds a syntax tree. The probe is trained with a deep metric learning method, so that the distance between word vectors in the metric space it defines encodes the distance between words in the syntax tree, and the norm of a word vector encodes the depth of the word in the syntax tree. The experimental results on ELMo and BERT show that the syntax tree is encoded in their parameters and in the word representations they produce.
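The two quantities the abstract describes, a distance between word vectors that should track tree distance and a norm that should track tree depth, can be illustrated with a minimal sketch. It assumes a linear probe matrix `B`, which is an assumption made here for illustration; the abstract only says the probe is trained with deep metric learning and does not specify its form. The function names, toy vectors, and the absolute-difference loss are likewise hypothetical.

```python
import math

def apply_probe(B, h):
    """Project word vector h through probe matrix B (a list of rows)."""
    return [sum(b * x for b, x in zip(row, h)) for row in B]

def probe_distance(B, h_i, h_j):
    """Distance between two words in the metric space defined by B."""
    diff = [a - b for a, b in zip(h_i, h_j)]
    return math.sqrt(sum(x * x for x in apply_probe(B, diff)))

def probe_depth(B, h):
    """Norm of a projected word vector, read as its depth in the tree."""
    return math.sqrt(sum(x * x for x in apply_probe(B, h)))

def tree_distance_loss(B, vectors, tree_dist):
    """Mean absolute gap between probe distances and gold tree distances."""
    n = len(vectors)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    total = sum(
        abs(probe_distance(B, vectors[i], vectors[j]) - tree_dist[i][j])
        for i, j in pairs
    )
    return total / len(pairs)

# Toy data (hypothetical): a three-word chain root -> child -> grandchild.
identity = [[1.0, 0.0], [0.0, 1.0]]
vectors = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
gold_tree_dist = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]

loss = tree_distance_loss(identity, vectors, gold_tree_dist)
depths = [probe_depth(identity, v) for v in vectors]
```

In this toy case the vectors already lie on a line that mirrors the chain, so the loss is zero and the norms equal the depths 0, 1, 2. Training a probe would amount to adjusting `B` (or a nonlinear replacement for it) to reduce such a loss on real representations.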

Keywords: deep metric learning, syntax tree probing, natural language processing, word representations

Procedia PDF Downloads 68
12151 Multilingualism without a Dominant Language in the Preschool Age: A Case of Natural Italian-Russian-German-English Multilingualism

Authors: Legkikh Victoria

Abstract:

The purpose of maintaining bi/multilingualism is usually to let the child speak two or three languages at the same level. The main problem that normally appears is language mixing or the dominance of one language. An equal level in two or more languages would be ideal but is not easily reached in practice. An experiment was therefore carried out with a girl with natural multilingualism, as an attempt to avoid a dominant language at preschool age. The girl lives in Germany, and her main languages are Italian, Russian and German, but she also hears English every day. The ‘one parent – one language’ strategy was used from the beginning, so Italian and Russian were spoken to her from birth, English was spoken between the parents, and when she was 1.5 years old German was added as the language of the nursery. In order to avoid a dominant language, she was always placed in international groups with activities in different languages. Even though it was not possible to avoid interference between the languages, in this case we can talk not only about natural multilingualism but also about balanced bilingualism at preschool age. The languages have been developing in parallel, with different emphases in different periods. Now, at the age of 6, we can see natural horizontal Russian/Italian/German/English multilingualism. At the moment, her Russian/Italian bilingualism is balanced. Her German vocabulary is smaller but the language is active, and English is receptive. We can also see reciprocal interference among all three active languages (English is receptive, so simple phrases are normally said correctly, but they are not enough to judge the level of language interference, and no ‘English’ mistakes have been noticed in the other languages). After analyzing the state of every language, we can see both positive and negative results of the experiment.
As a positive result, we can see that at the age of 6 the girl does not refuse any language, three languages are active, she differentiates the languages and, even if she uses a word from another language, she points out that it is not the correct word; most importantly, she does not have a preferred language. As proof of the last statement, we can note not only her self-identification as ‘half Russian and half Italian’ but also her answer to the question about her ‘mother tongue’: ‘I do not know, probably, when I have my own children I will speak one day Russian and one day Italian and sometimes German’. As a negative result, we can note not only that the development of all three languages is a little slower than expected for her age but also that, since she does not have a dominant language, she does not have a ‘perfect’ language either, and the interference is reciprocal. In any case, the experiment shows that it is possible to maintain at least two languages without a preference in a preschool multilingual environment.

Keywords: balanced bilingualism, language interference, natural multilingualism, preschool multilingual education

Procedia PDF Downloads 273
12150 ViraPart: A Text Refinement Framework for Automatic Speech Recognition and Natural Language Processing Tasks in Persian

Authors: Narges Farokhshad, Milad Molazadeh, Saman Jamalabbasi, Hamed Babaei Giglou, Saeed Bibak

Abstract:

The Persian language is an inflectional subject-object-verb language, which makes Persian text more ambiguous. However, techniques such as Zero-Width Non-Joiner (ZWNJ) recognition, punctuation restoration, and Persian Ezafe construction lead to more understandable and precise text. In most work on Persian, these techniques are addressed individually. We believe, however, that all of these tasks are necessary for text refinement in Persian. In this work, we propose the ViraPart framework, which uses embedded ParsBERT at its core for text clarification. First, we used the BERT variant for Persian followed by a classifier layer for the classification procedures. Next, we combined the models’ outputs to produce clear text. The proposed models for ZWNJ recognition, punctuation restoration, and Persian Ezafe construction achieve average macro F1 scores of 96.90%, 92.13%, and 98.50%, respectively. Experimental results show that our proposed approach is very effective in text refinement for the Persian language.
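The scores above are macro-averaged F1 values. As a reminder of how that metric is computed, here is a minimal sketch: F1 is calculated per label and then averaged with equal weight per label. The label names and the toy sequences are hypothetical and are not taken from the ViraPart evaluation.

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical token-level tags: "ZWNJ" = insert a zero-width non-joiner, "O" = no change.
gold = ["ZWNJ", "O", "O", "ZWNJ"]
pred = ["ZWNJ", "O", "ZWNJ", "ZWNJ"]
score = macro_f1(gold, pred)
```

Because every label counts equally, macro F1 penalizes poor performance on rare labels more than plain accuracy does, which matters for tasks where most tokens need no change.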

Keywords: Persian Ezafe, punctuation, ZWNJ, NLP, ParsBERT, transformers

Procedia PDF Downloads 218
12149 Using Bidirectional Encoder Representations from Transformers to Extract Topic-Independent Sentiment Features for Social Media Bot Detection

Authors: Maryam Heidari, James H. Jones Jr.

Abstract:

Millions of online posts about different topics and products are shared on popular social media platforms. One use of this content is to provide crowd-sourced information about a specific topic, event or product. However, this use raises an important question: what percentage of information available through these services is trustworthy? In particular, might some of this information be generated by a machine, i.e., a bot, instead of a human? Bots can be, and often are, purposely designed to generate enough volume to skew an apparent trend or position on a topic, yet the consumer of such content cannot easily distinguish a bot post from a human post. In this paper, we introduce a model for social media bot detection which uses Bidirectional Encoder Representations from Transformers (Google BERT) for sentiment classification of tweets to identify topic-independent features. Our use of a Natural Language Processing approach to derive topic-independent features for our new bot detection model distinguishes this work from previous bot detection models. We achieve 94% accuracy classifying the contents of data as generated by a bot or a human, where the most accurate prior work achieved accuracy of 92%.

Keywords: bot detection, natural language processing, neural network, social media

Procedia PDF Downloads 116
12148 DocPro: A Framework for Processing Semantic and Layout Information in Business Documents

Authors: Ming-Jen Huang, Chun-Fang Huang, Chiching Wei

Abstract:

With recent advances in deep neural networks, we observe new applications of NLP (natural language processing) and CV (computer vision) powered by deep neural networks for processing business documents. However, creating a real-world document processing system requires integrating several NLP and CV tasks, rather than treating them separately. There is a need for a unified approach to processing documents containing textual and graphical elements with rich formats, diverse layout arrangements, and distinct semantics. In this paper, a framework that fulfills this unified approach is presented. The framework includes a representation model definition for holding the information generated by the various tasks, and specifications defining the coordination between these tasks. The framework is a blueprint for building a system that can process documents with rich formats, styles, and multiple types of elements. The flexible and lightweight design of the framework can help build a system for diverse business scenarios, such as contract monitoring and reviewing.

Keywords: document processing, framework, formal definition, machine learning

Procedia PDF Downloads 219
12147 Towards Logical Inference for the Arabic Question-Answering

Authors: Wided Bakari, Patrice Bellot, Omar Trigui, Mahmoud Neji

Abstract:

This article opens a line of thinking about the modeling and analysis of Arabic texts in the context of a question-answering system. The aim is to go beyond traditional approaches that focus on morphosyntactic analysis. We present a new approach that analyzes a text in order to extract correct answers and then transforms it into logical predicates. In addition, we would like to represent different levels of information within a text in order to answer a question and choose one answer among several candidates. To do so, we transform both the question and the text into logical forms. Then, we try to recognize all entailments between them. The result of recognizing entailment is a set of text sentences that can entail the user’s question. Our work is now concentrated on an implementation step in order to develop an Arabic question-answering system using techniques for recognizing textual entailment. In this context, the extraction of text features (keywords, named entities, and the relationships that link them) is considered the first step in our text modeling process. The second step is the use of textual entailment techniques that rely on the notion of inference and logical representation to extract candidate answers. The last step is the extraction and selection of the desired answer.
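The idea of turning both question and text into logical forms and then checking which sentences entail the question can be illustrated with a deliberately simplified sketch. It assumes logical forms are lists of predicate tuples and treats entailment as pure pattern matching with variables; the system described in the abstract is not specified at this level, so the representation, the predicate names, and the functions below are all hypothetical.

```python
def match(pattern, fact, bindings):
    """Match one question predicate against one text predicate.

    Terms starting with '?' are variables. Returns the extended bindings,
    or None when the two predicates are incompatible.
    """
    if len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new

def entail(question, facts, bindings=None):
    """Yield every variable binding under which all question predicates occur in facts."""
    bindings = {} if bindings is None else bindings
    if not question:
        yield bindings
        return
    first, rest = question[0], question[1:]
    for fact in facts:
        new = match(first, fact, bindings)
        if new is not None:
            yield from entail(rest, facts, new)

# Hypothetical logical forms for two text sentences and one question.
sentences = {
    "s1": [("capital_of", "Tunis", "Tunisia")],
    "s2": [("located_in", "Sfax", "Tunisia")],
}
question = [("capital_of", "?x", "Tunisia")]

candidates = {
    sid: [b["?x"] for b in entail(question, facts)]
    for sid, facts in sentences.items()
}
```

Here only sentence `s1` yields a candidate answer. A real entailment recognizer would also need inference rules, synonym handling, and Arabic-specific preprocessing, none of which this sketch attempts.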

Keywords: NLP, Arabic language, question-answering, recognition text entailment, logic forms

Procedia PDF Downloads 343
12146 Arabicization and Terminology with Reference to Social Media Terms

Authors: Ahmed Al-Awthan

Abstract:

This study addresses the prevalence of English terminology in published Arabic documentation on social media. Although the problem of using English terms in translation instead of existing native ones has been addressed in general by researchers around the world, to the best of the author’s knowledge the attitude of professional translators to this phenomenon in Qatar and Yemen has not been studied in detail. This study examines the impact of the use of English social media terms in the Arab world on aspiring and professional translators; it explores the benefits and drawbacks of linguistic borrowing as identified by the translators and investigates whether translators consider any means of resisting linguistic borrowing and prioritizing Arabic. It also aims to answer the following questions: i. Is there any prevalence of English social media terms in Arabic translation? Why or why not? ii. Do Arabic translators prefer using English social media terms to their equivalents in Arabic? If so, why? iii. Which measures could be adopted to help reduce the frequently observed borrowing of English terms? In particular, how do translators see the role of the Arabic Language Academies in preserving Arabic? This research is descriptive, comparative and analytical in nature. It is both qualitative and quantitative. To validate the problem, the researcher will analyze articles published by Al-Jazeera in 2016-2018 that refer to the use of social media in diplomacy. The analysis will examine whether the increased international discussion of political events on social media increased the amount of transliterated English terminology referring to this mode of communication. To investigate whether the translators recognize the phenomenon of borrowing, the researcher proposes to use a survey. This survey will use multiple-choice questions. It will target 20 aspiring translators from Yemen and 20 participants from Qatar.
It will offer 15 English social media terms used in discourse in 15 sentences. For each sentence, the researcher will provide three different translations and will ask the translators to rate them and offer their own rendition. After collecting all the answers online, the researcher will analyze the data. The results are expected to confirm whether there is a prevalence of English terms in translation into Arabic. They are also expected to show what measures the translators used to render the English social media terms, and to raise awareness of the borrowing of English terms. The study will guide translators toward using Arabicization methods in order to contribute to preserving Arabic.

Keywords: Arabicization, trans lingual borrowing, social media terms, terminology

Procedia PDF Downloads 151
12145 Document-level Sentiment Analysis: An Exploratory Case Study of Low-resource Language Urdu

Authors: Ammarah Irum, Muhammad Ali Tahir

Abstract:

Document-level sentiment analysis in Urdu is a challenging Natural Language Processing (NLP) task due to the difficulty of working with lengthy texts in a language with constrained resources. Deep learning models, which are complex neural network architectures, are well suited to text-based applications in addition to data formats like audio, image, and video. To investigate the potential of deep learning for Urdu sentiment analysis, we implemented five different deep learning models, including Bidirectional Long Short Term Memory (BiLSTM), Convolutional Neural Network (CNN), Convolutional Neural Network with Bidirectional Long Short Term Memory (CNN-BiLSTM), and Bidirectional Encoder Representations from Transformers (BERT). In this study, we developed a hybrid deep learning model called BiLSTM-Single Layer Multi Filter Convolutional Neural Network (BiLSTM-SLMFCNN) by fusing the BiLSTM and CNN architectures. The proposed and baseline techniques are applied to the Urdu Customer Support data set and the IMDB Urdu movie review data set, using pre-trained Urdu word embeddings that are suitable for sentiment analysis at the document level. In the evaluation, the proposed BiLSTM-SLMFCNN outperformed all baseline deep learning models, achieving 83%, 79%, 83% and 94% accuracy on the small, medium and large IMDB Urdu movie review data sets and the Urdu Customer Support data set, respectively.

Keywords: urdu sentiment analysis, deep learning, natural language processing, opinion mining, low-resource language

Procedia PDF Downloads 72
12144 The Linguistic Fingerprint in Western and Arab Judicial Applications

Authors: Asem Bani Amer

Abstract:

This study addresses the linguistic fingerprint in judicial applications, a legal technique that is recent and still developing. It can be used to identify criminals through their way of speaking and their characteristic linguistic expressions. The study first clarifies the expression "linguistic fingerprint," its concept, and its scope, then presents some of the linguistic fingerprint tools used in Western judicial applications, and finally outlines a technical vision for a linguistic fingerprint in the Arabic language, which is in need of such judicial applications, based on dictionaries, language rhythm, and language structure.

Keywords: linguistic fingerprint, judicial, application, dictionary, picture, rhythm, structure

Procedia PDF Downloads 81