Search results for: sentence
198 Benchmarking Bert-Based Low-Resource Language: Case Uzbek NLP Models
Authors: Jamshid Qodirov, Sirojiddin Komolov, Ravilov Mirahmad, Olimjon Mirzayev
Abstract:
Natural language processing (NLP) tools now play a crucial role in daily life, covering a wide range of text-processing techniques. Very advanced models exist for widely used languages such as English and Russian. For some languages, such as Uzbek, NLP models have been developed only recently, so only a few Uzbek NLP models exist. Moreover, no existing work shows how the Uzbek NLP models behave in different situations or when to use each of them. This work tries to close this gap by comparing the Uzbek NLP models that existed at the time of writing. The authors compare the models in two scenarios, sentiment analysis and sentence similarity, which are instances of the two most common problems in industry: classification and similarity. Another outcome of this work is two Uzbek-language datasets, one for classification and one for sentence similarity, which the authors generated themselves and which can be useful in both industry and academia.
Keywords: NLP, benchmark, BERT, vectorization
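As an illustration of the sentence-similarity scenario, the sketch below scores two sentence vectors with cosine similarity. The abstract does not state which similarity measure the authors used, so cosine similarity is an assumption here, and the function name and toy vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)


# Toy vectors standing in for sentence embeddings produced by some model.
sentence_a = [0.2, 0.7, 0.1]
sentence_b = [0.25, 0.6, 0.05]
print(cosine_similarity(sentence_a, sentence_b))
```

In practice the vectors would come from whichever model is being benchmarked; only the scoring step is sketched here.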
Procedia PDF Downloads 52
197 A Comparison between Bèi Passives and Yóu Passives in Mandarin Chinese
Authors: Rui-heng Ray Huang
Abstract:
This study compares the syntax and semantics of two kinds of passives in Mandarin Chinese: bèi passives and yóu passives. To express a Chinese equivalent for ‘The thief was taken away by the police,’ either bèi or yóu can be used, as in Xiǎotōu bèi/yóu jǐngchá dàizǒu le. It is shown in this study that bèi passives and yóu passives differ semantically and syntactically. The semantic observations are based on the theta theory, dealing with thematic roles. On the other hand, the syntactic analysis draws heavily upon the generative grammar, looking into thematic structures. The findings of this study are as follows. First, the core semantics of bèi passives is centered on the Patient NP in the subject position. This Patient NP is essentially an Affectee, undergoing the outcome or consequence brought up by the action represented by the predicate. This may explain why in the sentence Wǒde huà bèi/*yóu tā niǔqū le ‘My words have been twisted by him/her,’ only bèi is allowed. This is because the subject NP wǒde huà ‘my words’ suffers a negative consequence. Yóu passives, in contrast, place the semantic focus on the post-yóu NP, which is not an Affectee though. Instead, it plays a role which has to take certain responsibility without being affected in a way like an Affectee. For example, in the sentence Zhèbù diànyǐng yóu/*bèi tā dānrèn dǎoyǎn ‘This film is directed by him/her,’ only the use of yóu is possible because the post-yóu NP tā ‘s/he’ refers to someone in charge, who is not an Affectee, nor is the sentence-initial NP zhèbù diànyǐng ‘this film’. When it comes to the second finding, the syntactic structures of bèi passives and yóu passives differ in that the former involve a two-place predicate while the latter a three-place predicate. 
The passive morpheme bèi in a case like Xiǎotōu bèi jǐngchá dàizǒu le ‘The thief was taken away by the police’ has been argued by some Chinese syntacticians to be a two-place predicate which selects an Experiencer subject and an Event complement. Under this analysis, the initial NP xiǎotōu ‘the thief’ in the above example is a base-generated subject. This study, however, proposes that yóu passives fall into a three-place unergative structure. In the sentence Xiǎotōu yóu jǐngchá dàizǒu le ‘The thief was taken away by the police,’ the initial NP xiǎotōu ‘the thief’ is a topic which serves as a Patient taken by the verb dàizǒu ‘take away.’ The subject of the sentence is assumed to be an Agent, which is in a null form and may find its reference from the discourse or world knowledge. Regarding the post-yóu NP jǐngchá ‘the police,’ its status is dual. On the one hand, it is a Patient introduced by the light verb yóu; on the other, it is an Agent assigned by the verb dàizǒu ‘take away.’ It is concluded that the findings of this study contribute to a better understanding of what distinguishes the two kinds of Chinese passives.
Keywords: affectee, passive, patient, unergative
Procedia PDF Downloads 272
196 Text Mining Techniques for Prioritizing Pathogenic Mutations in Protein Families Known to Misfold or Aggregate
Authors: Khaleel Saleh Al-Rababah
Abstract:
Amyloid fibril-forming regions, known as protein aggregates, in the sequences of some protein families are associated with a number of diseases known as amyloidosis. Mutations play a role in forming fibrils by accelerating the fibril formation process. In this paper, we aim to extract the diseases that are caused by those mutations as a result of their impact on the structural and functional properties of the aggregated protein. We propose a text mining system that automatically extracts mutations, diseases, and the relations between them. We present a finite-state-based algorithm that clusters mutations found in the same sentence, since a single sentence can contain different mutations that cause different diseases. We also present a co-reference algorithm that enables sentences to be cross-linked.
Keywords: amyloid, amyloidosis, co reference, protein, text mining
Procedia PDF Downloads 521
195 Verb Bias in Mandarin: The Corpus Based Study of Children
Authors: Jou-An Chung
Abstract:
The purpose of this study is to investigate the verb bias of Mandarin verbs in children’s reading materials and to provide criteria for categorization. Verb bias varies cross-linguistically. As Mandarin and English are typologically different, this study hopes to shed light on Mandarin verb bias through the use of a corpus and to provide thorough and detailed criteria for analysis. Moreover, this study focuses on children’s reading materials because they are significant for understanding children’s sentence processing. Investigating the verb bias of Mandarin verbs in children’s reading materials can therefore provide further insights into children’s sentence processing. A small corpus was built for this study. The corpus consists of a collection of school textbooks and the Mandarin Daily News for children. The files were segmented and POS tagged with JiebaR (Chinese segmentation with R). For ease of analysis, the one-word character verbs and intransitive verbs were excluded beforehand. A total of 20 high-frequency verbs were hand-coded and categorized into one of three types, namely the DO type, the SC type, and the Other type. If the frequency with which a verb takes the Other type exceeds the threshold of 25%, the verb is excluded from the study. The results show that 10 verbs are direct object (DO) bias verbs and six verbs are sentential complement (SC) bias verbs. A paired t-test was performed to confirm statistical significance (p = 0.0001062 for DO bias verbs, p = 0.001149 for SC bias verbs). The results show that in children’s reading materials, DO bias verbs are used more than SC bias verbs, since the simplest sentence structures are easier for children to comprehend or process. In sum, this study not only discussed verb bias in children’s reading materials but also provided basic coding criteria for verb bias analysis in Mandarin and underscored the role of context.
Keywords: corpus linguistics, verb bias, child language, psycholinguistics
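The categorization rule in this abstract can be sketched as follows. Only the 25% exclusion threshold for the Other type comes from the abstract; the majority rule that separates DO bias from SC bias, the function name, and the counts are assumptions made for illustration.

```python
def classify_verb(do_count, sc_count, other_count, other_threshold=0.25):
    """Label a verb from its hand-coded frame counts.

    The 25% exclusion threshold is from the abstract; deciding DO vs SC bias
    by simple majority is an assumption, not the authors' stated criterion.
    """
    total = do_count + sc_count + other_count
    if total == 0:
        return "excluded"  # no observations to classify
    if other_count / total > other_threshold:
        return "excluded"
    if do_count > sc_count:
        return "DO bias"
    if sc_count > do_count:
        return "SC bias"
    return "no bias"


# Hypothetical counts for three verbs.
print(classify_verb(do_count=70, sc_count=20, other_count=10))
print(classify_verb(do_count=15, sc_count=75, other_count=10))
print(classify_verb(do_count=40, sc_count=20, other_count=40))
```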
Procedia PDF Downloads 290
194 The Code-Mixing of Japanese, English, and Thai in Line Chat
Authors: Premvadee Na Nakornpanom
Abstract:
Language mixing in spontaneous speech has been widely discussed, but not in virtual settings, especially in the context of students learning a third language. This study therefore explored the characteristics of the mixing of Japanese, English, and Thai in a mobile chat room by students with a background in all three languages. The results show that the insertion of Thai and English content words was a very common linguistic phenomenon in the utterances. Because chatting is ‘relational’ or ‘interactional’, it pushed lexical choices toward a speech-like, more personal, and more emotional style. The Japanese sentence-final question particle “か” (ka) was added to the end of sentences that followed Thai grammar rules. Moreover, some unique characteristics were created: non-verbal cues were represented in personal, Thai styles by inserting textual representations of images or feelings available on websites into streams of conversation.
Keywords: code-mixing, Japanese, English, Thai, line chat
Procedia PDF Downloads 650
193 Methodologies for Deriving Semantic Technical Information Using an Unstructured Patent Text Data
Authors: Jaehyung An, Sungjoo Lee
Abstract:
Patent documents constitute an up-to-date and reliable source of knowledge that reflects technological advances, so patent analysis has been widely used to identify technological trends and formulate technology strategies. However, identifying technological information from patent data has limitations such as high cost, complexity, and inconsistency, because it relies on experts’ knowledge. To overcome these limitations, researchers have applied quantitative analysis based on keyword techniques. With this method, one can capture the technological implications of particular patent documents or extract keywords that indicate their important content. However, the method only counts keyword frequency, so it cannot take into account the semantic relationships between keywords or semantic information such as how technologies are used in their technology area and how they affect other technologies. To automatically analyze unstructured technological information in patents and extract semantic information, the text should be transformed into an abstracted form that includes the key technological concepts. The sentence structure ‘SAO’ (subject, action, object) has recently emerged as a way of representing ‘key concepts’ and can be extracted by NLP (natural language processing). An SAO structure can be organized in a problem-solution format if the action-object (AO) states the problem and the subject (S) forms the solution. In this paper, we propose a new methodology that extracts SAO structures through rules for extracting technical elements. Although sentence structures in patent text have a unique format, prior studies have depended on general NLP tools applied to common documents such as newspapers, research papers, and Twitter mentions, so they cannot take into account the sentence structure types specific to patent documents.
To overcome this limitation, we identified the unique form of patent sentences and defined the SAO structures in patent text data. There are four types of technical elements: technology adoption purpose, application area, tool for technology, and technical components. Each of these four types of sentence structure has its own specific word structure, determined by the location or sequence of parts of speech in the sentence. Finally, we developed algorithms for extracting SAOs, and the results offer insight into the technology innovation process by providing different perspectives on technology.
Keywords: NLP, patent analysis, SAO, semantic-analysis
Procedia PDF Downloads 261
192 Extending Image Captioning to Video Captioning Using Encoder-Decoder
Authors: Sikiru Ademola Adewale, Joe Thomas, Bolanle Hafiz Matti, Tosin Ige
Abstract:
This project demonstrates the implementation and use of an encoder-decoder model to perform a many-to-many mapping of video data to text captions. The many-to-many mapping occurs via an input temporal sequence of video frames to an output sequence of words to form a caption sentence. Data preprocessing, model construction, and model training are discussed. Caption correctness is evaluated using 2-gram BLEU scores across the different splits of the dataset. Specific examples of output captions were shown to demonstrate model generality over the video temporal dimension. Predicted captions were shown to generalize over video action, even in instances where the video scene changed dramatically. Model architecture changes are discussed to improve sentence grammar and correctness.
Keywords: decoder, encoder, many-to-many mapping, video captioning, 2-gram BLEU
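For readers unfamiliar with the metric, a 2-gram BLEU score can be sketched as below. This is a minimal single-reference version without smoothing: the geometric mean of clipped unigram and bigram precision, multiplied by the brevity penalty. The abstract does not say which BLEU implementation or settings the authors used, so the function names and example captions are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(cand, ref, n):
    """Modified n-gram precision: candidate counts are clipped by reference counts."""
    cand_counts = Counter(ngrams(cand, n))
    ref_counts = Counter(ngrams(ref, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return matched / total


def bleu2(candidate, reference):
    """Single-reference BLEU with uniform weights over 1-grams and 2-grams."""
    cand, ref = candidate.split(), reference.split()
    p1 = clipped_precision(cand, ref, 1)
    p2 = clipped_precision(cand, ref, 2)
    if p1 == 0 or p2 == 0:
        return 0.0  # no smoothing in this sketch
    # Brevity penalty: only candidates shorter than the reference are penalized.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.sqrt(p1 * p2)


print(bleu2("a man is riding a bike", "a man rides a bike"))
```

Library implementations typically add smoothing and support for multiple references, which matter for short captions.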
Procedia PDF Downloads 103
191 'Infection in the Sentence': The Castration of a Black Woman's Dream of Authorship as Manifested in Buchi Emecheta's Second Class Citizen
Authors: Aseel Hatif Jassam, Hadeel Hatif Jassam
Abstract:
The paper discusses the phallocentric discourse that is challenged by women in general, and by women of color in particular, in spite of the simultaneity of oppression due to race, class, and gender in the diaspora. The paper therefore gives a brief account of women's experience in the light of postcolonial feminist theory. It also casts light on the theories of Luce Irigaray and Hélène Cixous, two feminist theorists who advise women to develop their own discourse to challenge the infectious patriarchal sentence advocated by Sigmund Freud and by Harold Bloom's model of literary history. Black women authors like Buchi Emecheta, as well as her alter ego Adah, a Nigerian-born girl and the protagonist of her semi-autobiographical novel Second Class Citizen, suffer from this phallocentric and oppressive sentence and from displacement as they migrate from Nigeria, a former British colony where they feel marginalized, to North London in the hope of realizing their dreams. Yet in the British diaspora they experience culture shock and continue to suffer further marginalization due to class and race, and, ironically, they are insulted and inferiorized by their patriarchal husbands, who try to put an end to their dreams of authorship. Holding the phallocentric belief that women are not capable of self-representation, the violent Sylvester Onwordi and Francis Obi, the husbands of Emecheta and Adah respectively, oppress them by burning their authoritative voice, represented by the novels they write while struggling with economically harsh living conditions in the British diaspora.
Keywords: authorship, British diaspora, discourse, phallocentric, patriarchy
Procedia PDF Downloads 175
190 Life Imprisonment: European Convention on Human Rights Standards and the New Serbian Criminal Code
Authors: Veljko Turanjanin
Abstract:
In this article, the author deals with the issue of life imprisonment. Life imprisonment represents a new sentence in the Serbian legislature, in addition to the standard one, imprisonment. The author elaborates on judgments of the European Court of Human Rights (ECtHR) that require the possibility of parole for persons sentenced to life imprisonment and that emphasize rehabilitation as the primary goal of penalties. According to the ECtHR, life imprisonment without parole is not permitted. The right to rehabilitation is very strictly set in the ECtHR jurisprudence. The Serbian legislator provided the possibility of parole for most criminal offenses after 27 years in prison, while for some offenses the possibility of parole is explicitly prohibited. The author points out the shortcomings of the legal solution that exists in Serbia, which flagrantly threatens to violate the human rights of offenders.
Keywords: European Court of Human Rights, life imprisonment, parole, rehabilitation
Procedia PDF Downloads 103
189 Factors Affecting English Language Acquisition and Learning for Primary Schools in Nigeria
Authors: Chibuzor Dalmeida
Abstract:
This paper discusses the factors affecting English language acquisition and learning in primary schools in Nigeria. Learning English is a difficult task, especially for pupils at the primary school level. Pupils find vocabulary, grammar and sentence structure, idioms, and pronunciation particularly difficult. Researchers have identified the reasons behind these difficulties and have formulated theories that could be of great assistance to English language teachers and students. The paper examines the following factors: learner characteristics and personal traits, situational and environmental factors, prior language development and competence, and age and brain development. It recommends that pupils learn new vocabulary, rules for grammar and sentence structure, idioms, and pronunciation. Pupils whose families and communities set high standards for language acquisition learn more quickly than those whose families and communities do not. Exposure to high-quality programs is also essential. Pupils do best when they are allowed to speak their native language.
Keywords: acquisition, affecting, factors, learning
Procedia PDF Downloads 626
188 Agents and Causers in the Experiencer-Verb Lexicon
Authors: Margaret Ryan, Linda Cupples, Lyndsey Nickels, Paul Sowman
Abstract:
The current investigation explored the thematic roles of the nouns specified in the lexical entries of experiencer verbs. While prior experimental research assumes experiencer and theme roles for both subject-experiencer (SE) and object-experiencer (OE) verbs, syntactic theorists have posited additional agent and causer roles. Experiment 1 provided evidence for an agent as participants assigned a high degree of intentionality to the logical subject of a subset of SE and OE actives and passives. Experiment 2 provided evidence for a causer as participants assigned high levels of causality to the logical subjects of experiencer sentences generally. However, the presence of an agent, but not a causer, coincided with processing ease. Causality may be an aspect rather than a thematic role. The varying thematic roles amongst experiencer-verb sentences have important implications for stimulus selection because we cannot presume processing is similar across differing sentence subtypes.
Keywords: sentence comprehension, lexicon, canonicity, processing, thematic roles, syntax
Procedia PDF Downloads 122
187 Controversies Connected with the Admission of Illegally Gained Evidences in Polish Civil Proceedings
Authors: Aleksandra Czubak
Abstract:
Presenting evidence in civil proceedings is essential for reaching the right result. For this reason, it is particularly important for the parties to present the most relevant and convincing evidence to the court. Parties therefore often try to obtain evidence even when acquiring it breaches the law. The paper first discusses how evidence is applied in the Polish civil process and the Polish regulations on evidentiary proceedings, with specific reference to evidence of major importance in the developing world. It then discusses the controversies connected with the admission of illegally gained evidence in civil proceedings. The credibility of the various measures is circumstantial and can only be determined by factors related to the recognized problem. For that reason, it is not the amount of evidence but its value and relevance that should be considered in determining the right result. The paper also considers whether the end justifies the means: how far should parties go in order to achieve a favorable sentence or to create stronger evidence? Methods of persuading the court, as well as of acquiring evidence, are not always fair and moral. It is on this area of controversy that the essay focuses. The paper concludes by considering the value of evidence and the possibility of using it to achieve a just sentence. Examples are based on Polish law; nevertheless, they encompass ideas common to most civil jurisdictions.
Keywords: civil proceedings, Europe (Poland), evidence, law
Procedia PDF Downloads 248
186 Chinese Sentence Level Lip Recognition
Authors: Peng Wang, Tigang Jiang
Abstract:
Computer-based lip-reading methods developed for different languages are not universal. At present, research on Chinese lip reading, whether in datasets or in recognition algorithms, is far from mature. In this paper, we study a machine-learning-based method for Chinese lip reading and propose a Chinese sentence-level lip-reading network (CNLipNet) model, which consists of a spatio-temporal convolutional neural network (CNN), a recurrent neural network (RNN), and a Connectionist Temporal Classification (CTC) loss function. The model maps a variable-length sequence of video frames to a Chinese Pinyin sequence and is trained end-to-end. Moreover, we create CNLRS, a Chinese lip-reading dataset, which contains 5948 samples and can be shared through GitHub. The evaluation of CNLipNet on this dataset yielded a 41% word correct rate and a 70.6% character correct rate. This result is far superior to that of professional human lip readers, indicating that CNLipNet performs well in lip reading.
Keywords: lipreading, machine learning, spatio-temporal, convolutional neural network, recurrent neural network
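A model trained with a CTC loss emits one label per video frame, including a special blank symbol, and a decoding step turns that into the final sequence. The sketch below shows standard greedy CTC decoding: merge consecutive repeats, then drop blanks. The abstract does not describe how CNLipNet decodes its output, so this is an illustration of the general technique; the blank symbol and the per-frame labels are hypothetical.

```python
def ctc_greedy_decode(frame_labels, blank="_"):
    """Collapse per-frame labels: merge consecutive repeats, then remove blanks."""
    decoded = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Hypothetical per-frame argmax labels for a short clip.
frames = ["n", "n", "_", "i", "i", "_", "_", "h", "a", "a", "o"]
print(ctc_greedy_decode(frames))
```

The blank is what allows a genuinely repeated symbol to survive: `["a", "_", "a"]` decodes to two separate `a` labels, while `["a", "a"]` decodes to one.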
Procedia PDF Downloads 126
185 The Academic Achievement of Writing via Project-Based Learning
Authors: Duangkamol Thitivesa
Abstract:
This paper focuses on the use of project work as a pretext for applying the conventions of writing, that is, correctness of mechanics, usage, and sentence formation, in a content-based class at a Rajabhat University. Its aim was to explore the extent of the student teachers’ academic achievement in basic writing features, measured against a 70% attainment target, after the use of project work. Organizing work around an agreed theme in which students reproduce language provided by texts and instructors is expected to improve students’ command of writing conventions. The sample comprised 38 fourth-year English major students. The data were collected by means of an achievement test and samples of student writing. The scores on the summative achievement test were analyzed by mean score, standard deviation, and percentage. The student teachers achieved more in mechanics and usage and less in sentence formation. The students benefited from exposure to texts while conducting the project; however, their automaticity in knowing how and when to form phrases and clauses into simple and complex sentences had room for improvement.
Keywords: project-based learning, project work, writing conventions, academic achievement
Procedia PDF Downloads 332
184 Understanding the Interactive Nature in Auditory Recognition of Phonological/Grammatical/Semantic Errors at the Sentence Level: An Investigation Based upon Japanese EFL Learners’ Self-Evaluation and Actual Language Performance
Authors: Hirokatsu Kawashima
Abstract:
One important element of teaching/learning listening is intensive listening such as listening for precise sounds, words, grammatical, and semantic units. Several classroom-based investigations have been conducted to explore the usefulness of auditory recognition of phonological, grammatical and semantic errors in such a context. The current study reports the results of one such investigation, which targeted auditory recognition of phonological, grammatical, and semantic errors at the sentence level. 56 Japanese EFL learners participated in this investigation, in which their recognition performance of phonological, grammatical and semantic errors was measured on a 9-point scale by learners’ self-evaluation from the perspective of 1) two types of similar English sound (vowel and consonant minimal pair words), 2) two types of sentence word order (verb phrase-based and noun phrase-based word orders), and 3) two types of semantic consistency (verb-purpose and verb-place agreements), respectively, and their general listening proficiency was examined using standardized tests. A number of findings have been made about the interactive relationships between the three types of auditory error recognition and general listening proficiency. Analyses based on the OPLS (Orthogonal Projections to Latent Structure) regression model have disclosed, for example, that the three types of auditory error recognition are linked in a non-linear way: the highest explanatory power for general listening proficiency may be attained when quadratic interactions between auditory recognition of errors related to vowel minimal pair words and that of errors related to noun phrase-based word order are embraced (R² = .33, p = .01).
Keywords: auditory error recognition, intensive listening, interaction, investigation
Procedia PDF Downloads 511
183 Referencing Anna: Findings From Eye-tracking During Dutch Pronoun Resolution
Authors: Robin Devillers, Chantal van Dijk
Abstract:
Children face ambiguities in everyday language use. Particularly ambiguity in pronoun resolution can be challenging, whereas adults can rapidly identify the antecedent of the mentioned pronoun. Two main factors underlie this process, namely the accessibility of the referent and the syntactic cues of the pronoun. After 200ms, adults have converged the accessibility and the syntactic constraints, while relieving cognitive effort by considering contextual cues. As children are still developing their cognitive capacity, they are not able yet to simultaneously assess and integrate accessibility, contextual cues and syntactic information. As such, they fail to identify the correct referent and possibly fixate more on the competitor in comparison to adults. In this study, Dutch while-clauses were used to investigate the interpretation of pronouns by children. The aim is to a) examine the extent to which 7-10 year old children are able to utilise discourse and syntactic information during online and offline sentence processing and b) analyse the contribution of individual factors, including age, working memory, condition and vocabulary. Adult and child participants are presented with filler-items and while-clauses, and the latter follows a particular structure: ‘Anna and Sophie are sitting in the library. While Anna is reading a book, she is taking a sip of water.’ This sentence illustrates the ambiguous situation, as it is unclear whether ‘she’ refers to Anna or Sophie. In the unambiguous situation, either Anna or Sophie would be substituted by a boy, such as ‘Peter’. The pronoun in the second sentence will unambiguously refer to one of the characters due to the syntactic constraints of the pronoun. Children’s and adults’ responses were measured by means of a visual world paradigm. This paradigm consisted of two characters, of which one was the referent (the target) and the other was the competitor. 
A sentence was presented and followed by a question, which required the participant to choose which character was the referent. This paradigm thus yields an online (fixations) and an offline (accuracy) score. The findings will be analysed using Generalised Additive Mixed Models, which allow for a thorough estimation of the individual variables. They will contribute to the scientific literature in several ways. Firstly, while-clauses have not been studied much, and their processing has not yet been characterised. Moreover, online pronoun resolution has not been investigated much in either children or adults, so this study will contribute to the literature on pronoun resolution in both groups. Lastly, pronoun resolution has not yet been studied in Dutch and as such, this study adds to the languages
Keywords: pronouns, online language processing, Dutch, eye-tracking, first language acquisition, language development
Procedia PDF Downloads 96
182 EduEasy: Smart Learning Assistant System
Authors: A. Karunasena, P. Bandara, J. A. T. P. Jayasuriya, P. D. Gallage, J. M. S. D. Jayasundara, L. A. P. Y. P. Nuwanjaya
Abstract:
The use of smart learning concepts as improved teaching and learning methods has increased rapidly all over the world in recent years. Many educational institutes, such as universities, are experimenting with these concepts with their students. Smart learning concepts are especially useful for helping students learn better in large classes. In large classes, the lecture method is the most popular method of teaching. In the lecture method, the lecturer presents the content, mostly using lecture slides, and the students make their own notes based on the content presented. However, some students may have difficulty with this method due to issues such as the speed of delivery. The purpose of this research is to assist students in large classes in following the content. The research proposes a solution with four components, namely a note-taker, a slide matcher, a reference finder, and a question presenter, which help students obtain a summarized version of the lecture notes, navigate easily to the content and find resources, and revise content using questions.
Keywords: automatic summarization, extractive text summarization, speech recognition library, sentence extraction, automatic web search, automatic question generator, sentence scoring, the term weight
Procedia PDF Downloads 143
181 Aspect-Level Sentiment Analysis with Multi-Channel and Graph Convolutional Networks
Authors: Jiajun Wang, Xiaoge Li
Abstract:
The purpose of the aspect-level sentiment analysis task is to identify the sentiment polarity of aspects in a sentence. Currently, most methods focus on using neural networks and attention mechanisms to model the relationship between aspects and context, but they ignore the dependence between words in different ranges of the sentence, which leads to deviations when assigning relationship weights to words other than the aspect words. To solve these problems, we propose a new aspect-level sentiment analysis model that combines a multi-channel convolutional network and a graph convolutional network (GCN). Firstly, the context and the degree of association between words are characterized by Long Short-Term Memory (LSTM) and a self-attention mechanism. In addition, a multi-channel convolutional network is used to extract the features of words in different ranges. Finally, a graph convolutional network is used to associate the node information of the dependency tree structure. We conduct experiments on four benchmark datasets. The experimental results are compared with those of other models and show that our model is better and more effective.
Keywords: aspect-level sentiment analysis, attention, multi-channel convolution network, graph convolution network, dependency tree
Procedia PDF Downloads 212
180 Tibyan Automated Arabic Correction Using Machine-Learning in Detecting Syntactical Mistakes
Authors: Ashwag O. Maghraby, Nida N. Khan, Hosnia A. Ahmed, Ghufran N. Brohi, Hind F. Assouli, Jawaher S. Melibari
Abstract:
The Arabic language is one of the most important languages. Learning it matters to many people around the world because of its religious and economic importance, and the real challenge lies in practicing it without grammatical or syntactical mistakes. This research focuses on detecting and correcting syntactic mistakes in Arabic according to their position in the sentence, concentrating on two of the main syntactical rules in Arabic: Dual and Plural. It analyzes each sentence in the text using the Stanford CoreNLP morphological analyzer and a machine-learning approach in order to detect syntactical mistakes and then correct them. A prototype of the proposed system was implemented and evaluated. It uses a support vector machine (SVM) algorithm to detect Arabic grammatical errors and corrects them using a rule-based approach. The prototype system has an accuracy of 81%. In general, it offers a set of useful grammatical suggestions that the user may overlook while writing, whether from lack of familiarity with the grammar or from writing quickly, such as alerting the user when a plural term is used to indicate one person.
Keywords: Arabic language acquisition and learning, natural language processing, morphological analyzer, part-of-speech
Procedia PDF Downloads 150
179 English Theticity and Focus Expression in Spanish Heritage Speakers
Authors: Sebastian Leal-Arenas
Abstract:
English uses in-situ Nuclear Stress (NS) to express the meanings of theticity and focus. The NS is phonetically represented by an increase in duration, intensity, and pitch range. On the other hand, Spanish conveys the same meanings by aligning the constituent that carries the NS to the end of the sentence via word-order movement. However, little is known about heritage speakers’ production of theticity and focus in English or Spanish. The present study investigates heritage speakers’ production of thetic and subject focus statements. Participants (n = 11) were heritage speakers of Spanish with varying proficiency enrolled in a writing course at a university in the United States. In the production task, participants observed contextualized images and uttered a sentence to answer a provided question. Duration, intensity, and F0 peak were the correlates to stress considered in this investigation. Results indicated that participants tended to present an intonation closer to what is expected in English monolinguals in subject-focus statements than in thetic sentences. However, participants with lower Spanish proficiency used in-situ NS placement in thetic statements more often than those with higher proficiency. Results are discussed in terms of the production patterns observed in heritage speakers with emphasis on the role of language dominance.
Keywords: focus, heritage speakers, prosody, theticity
Procedia PDF Downloads 69
178 Prevalence of Hinglish on the Indian English News Channels and Its Impact on the New Language Learners: A Qualitative Analysis
Authors: Swatantra
Abstract:
Hinglish, a blended version of Hindi and English, emerged from speakers' lack of competence in and command of the foreign language, i.e., English. Remarkably, the trend has gained wide acceptance. In India, this acceptance has grown to the extent that popular news anchors on prime-time shows frequently use it. At the moment, instead of being considered a flaw in their presentation, Hinglish is emerging as a trendy genre. Its pervasive usage and extensive acceptance are motivating youngsters to adopt similar patterns. The current study is an endeavour to assess the impact of this trend on new language learners. With the help of semi-structured interviews, the researcher has tried to gauge learners' level of comfort and their desire to be on a par with other fluent English speakers. The results clearly show a substantial boost in the confidence level of learners because they are able to use the vocabulary and sentence patterns of their own choice and convenience. The prevalence and acceptance of the trend in the mainstream media have served as a catalyst, and the desire to be on a par with other fluent speakers is also fading away. The users of Hinglish find this trend closer to their hearts because, in earlier times, in the absence of an exact translation, they had to compromise on the meaning or spirit of the word, phrase, or sentence. Now the enhanced flexibility leaves them more comfortable and confident.
Keywords: Hinglish, language learners, linguistic trends, media
Procedia PDF Downloads 154
177 English Grammatical Errors of Arabic Sentence Translations Done by Machine Translations
Authors: Muhammad Fathurridho
Abstract:
Grammar, as the set of rules that makes a language understandable to everyone, is always related to syntax and morphology. Arabic grammar differs from the grammars of other languages: it has more rules and difficulties. This paper aims to investigate and describe the English grammatical errors that machine translation systems make in translating Arabic sentences, including declarative, exclamatory, imperative, and interrogative sentences, specifically as of 2018, when such systems can be supported by artificial intelligence. The Arabic sample sentences, which are divided into verbal and nominal sentences and taken from several published Arabic texts, are examined as the source-language samples. The sentences translated by several popular online machine translation systems, including Google Translate, Microsoft Bing, Babylon, Facebook, Hellotalk, Worldlingo, Yandex Translate, and Tradukka Translate, are the material objects of this research. A descriptive method is used to show the grammatical errors in the English target language and to classify them. The paper concludes that the grammatical errors in machine translation output are varied and can generally be classified into morphological, syntactical, and semantic errors across all types of Arabic words (Noun, Verb, and Particle). These findings can serve as an evaluation that machine translation providers use to correct the errors and make their output more understandable.
Keywords: Arabic, Arabic-English translation, machine translation, grammatical errors
Procedia PDF Downloads 153
176 Preliminary Study of the Phonological Development in Three and Four Year Old Bulgarian Children
Authors: Tsvetomira Braynova, Miglena Simonska
Abstract:
The article presents the results of research on phonological processes in three- and four-year-old children. For the purpose of the study, a test designed by the authors was developed and conducted among 120 children. The study included three areas of research: the level of words (96 words), the level of sentence repetition (10 sentences), and the level of generating own speech from a picture (15 pictures). The test also gives additional information about the articulation errors of the assessed children. The main purpose is to analyze all phonological processes that occur at this age in Bulgarian children and to identify which are typical and which are atypical for this age. The results show that the most common phonological errors that children make are sound substitution, elision of a sound, metathesis of a sound, elision of a syllable, and elision of consonants clustered in a syllable. All examined children were identified as having an articulation disorder of the bilabial lambdacism type. Measuring the correlation between the average length of repeated speech and the average length of generated speech, the analysis indicates that the more words a child can repeat in the “repeated speech” part, the more words they can be expected to generate in the “generating sentence” part. The results of this study show that the task of naming a word provides sufficient and representative information to assess the child's phonology.
Keywords: assessment, phonology, articulation, speech-language development
Procedia PDF Downloads 179
175 Correction of Frequent English Writing Errors by Using Coded Indirect Corrective Feedback and Error Treatment
Authors: Chaiwat Tantarangsee
Abstract:
The purposes of this study are 1) to study the frequent English writing errors of students registered in the course Reading and Writing English for Academic Purposes II, and 2) to find out the results of writing error correction using coded indirect corrective feedback and writing error treatments. The sample comprises 28 second-year English major students from the Faculty of Education, Suan Sunandha Rajabhat University. The tool for the experimental study is the lesson plan of the course Reading and Writing English for Academic Purposes II, and the tool for data collection consists of 4 writing tests of short texts. The research findings disclose that the frequent English writing errors found in this course comprise 7 types of grammatical errors, namely sentence fragments, subject-verb agreement, wrong form of verb tense, singular or plural noun endings, run-on sentences, wrong form of verb pattern, and lack of parallel structure. Moreover, writing error correction using coded indirect corrective feedback and error treatment results in an overall reduction of the frequent English writing errors and an increase in students’ achievement in the writing of short texts, significant at the .05 level.
Keywords: coded indirect corrective feedback, error correction, error treatment, frequent English writing errors
Procedia PDF Downloads 235
174 The Impact of Breast Cancer Diagnosis on Omani Women
Authors: H. Al-Awaisi, M. H. Al-Azri, S. Al-Rasbi, M. Al-Moundhri
Abstract:
Breast cancer is the most common cancer among females worldwide. It is also the most common cancer among females in Oman, with 100 new breast cancer cases diagnosed every year. It has been found that breast cancer has a devastating effect on women’s lives. Women diagnosed with breast cancer might develop negative attitudes towards the illness and their bodies. They might also suffer from psychological ailments such as depression. Despite the evidence on the impact of breast cancer diagnosis on women, no study was found that explores the impact of breast cancer diagnosis among women in Oman. A phenomenological qualitative study was conducted to explore the impact of breast cancer diagnosis on Omani women. Data were collected through semi-structured individual interviews with 11 Omani women diagnosed with breast cancer. Interviews were transcribed verbatim and data were analyzed thematically. From the data, four main themes were identified in relation to the impact of cancer diagnosis on Omani women. These are “shock and disbelief”, “a death sentence”, “uncertain future” and “social stigma”. At the time of the interviews, all participants had advanced breast cancer, with some participants having metastatic disease. The word “cancer” had a profound and catastrophic effect on the women and their close relatives. In conclusion, breast cancer diagnosis was shocking and mainly perceived as a death sentence by Omani women, with an uncertain future and social stigma. Regardless of age, maternal status and education level, it is evident that the Omani women who participated in this study lacked awareness about breast cancer diagnosis, treatment and prognosis.
Keywords: breast cancer, coping, diagnosis, Oman, women
Procedia PDF Downloads 504
173 Pragmatic Development of Chinese Sentence Final Particles via Computer-Mediated Communication
Authors: Qiong Li
Abstract:
This study investigated under which conditions computer-mediated communication (CMC) could promote pragmatic development. The focal feature included four Chinese sentence final particles (SFPs): a, ya, ba, and ne. They occur frequently in Chinese and function as mitigators to soften the tone of speech. However, L2 acquisition of SFPs is difficult, suggesting the necessity of additional exposure to or explicit instruction on Chinese SFPs. This study follows this line and aims to explore two research questions: (1) Is CMC combined with data-driven instruction more effective than CMC alone in promoting L2 Chinese learners’ SFP use? (2) How does L2 Chinese learners’ SFP use change over time, as compared to the production of native Chinese speakers? The study involved 19 intermediate-level learners of Chinese enrolled at a private American university. They were randomly assigned to two groups: (1) the control group (N = 10), which was exposed to SFPs through CMC alone, and (2) the treatment group (N = 9), which was exposed to SFPs via CMC and data-driven instruction. Learners interacted with native speakers on given topics through text-based CMC over Skype. Both groups went through six 30-minute CMC sessions on a weekly basis, with a one-week interval after the first two CMC sessions and a two-week interval after the second two CMC sessions (nine weeks in total). The treatment group additionally received data-driven instruction after the first two sessions. Data analysis focused on three indices: token frequency, type frequency, and acceptability of SFP use. Token frequency was operationalized as the raw occurrence of SFPs per clause. Type frequency was the range of SFPs. Acceptability was rated by two native speakers using a rating rubric. The results showed that the treatment group made noticeable progress over time on the three indices. Their production of SFPs approximated the native-like level. In contrast, the control group only slightly improved on token frequency, and only certain SFPs (a and ya) reached native-like use. Potential explanations for the group differences were discussed in two aspects: the property of Chinese SFPs and the role of CMC and data-driven instruction. Though CMC provided the learners with opportunities to notice and observe SFP use, SFPs, as a feature with low saliency, were not easily noticed in input. Data-driven instruction in the treatment group directed the learners’ attention to these particles, which facilitated the development.
Keywords: computer-mediated communication, data-driven instruction, pragmatic development, second language Chinese, sentence final particles
Procedia PDF Downloads 415
172 Correction of Frequent English Writing Errors by Using Coded Indirect Corrective Feedback and Error Treatment: The Case of Reading and Writing English for Academic Purposes II
Authors: Chaiwat Tantarangsee
Abstract:
The purposes of this study are 1) to study the frequent English writing errors of students registered in the course Reading and Writing English for Academic Purposes II, and 2) to find out the results of writing error correction using coded indirect corrective feedback and writing error treatments. The sample comprises 28 second-year English major students from the Faculty of Education, Suan Sunandha Rajabhat University. The tool for the experimental study is the lesson plan of the course Reading and Writing English for Academic Purposes II, and the tool for data collection consists of 4 writing tests of short texts. The research findings disclose that the frequent English writing errors found in this course comprise 7 types of grammatical errors, namely sentence fragments, subject-verb agreement, wrong form of verb tense, singular or plural noun endings, run-on sentences, wrong form of verb pattern, and lack of parallel structure. Moreover, writing error correction using coded indirect corrective feedback and error treatment results in an overall reduction of the frequent English writing errors and an increase in students’ achievement in the writing of short texts, significant at the .05 level.
Keywords: coded indirect corrective feedback, error correction, error treatment, English writing
Procedia PDF Downloads 303
171 Perceiving Casual Speech: A Gating Experiment with French Listeners of L2 English
Authors: Naouel Zoghlami
Abstract:
Spoken-word recognition involves the simultaneous activation of potential word candidates which compete with each other for final correct recognition. In continuous speech, the activation-competition process gets more complicated due to speech reductions existing at word boundaries. Lexical processing is more difficult in L2 than in L1 because L2 listeners often lack phonetic, lexico-semantic, syntactic, and prosodic knowledge in the target language. In this study, we investigate the on-line lexical segmentation hypotheses that French listeners of L2 English form and then revise as subsequent perceptual evidence is revealed. Our purpose is to shed further light on the processes of L2 spoken-word recognition in context and better understand L2 listening difficulties through a comparison of skilled and unskilled listeners' reactions at the point where their working hypothesis is rejected. We use a variant of the gating experiment in which subjects transcribe an English sentence presented in increments of progressively greater duration. The spoken sentence was “And this amazing athlete has just broken another world record”, chosen mainly because it included common reductions and phonetic features in English, such as elision and assimilation. Our preliminary results show that there is an important difference in the manner in which proficient and less-proficient L2 listeners handle connected speech. Less-proficient listeners delay recognition of words as they wait for lexical and syntactic evidence to appear in the gates. Further statistical analyses are currently being undertaken.
Keywords: gating paradigm, spoken word recognition, online lexical segmentation, L2 listening
Procedia PDF Downloads 462
170 The Phonology and Phonetics of Second Language Intonation in Case of “Downstep”
Authors: Tayebeh Norouzi
Abstract:
This study aims to investigate the acquisition process of intonation. It examines the intonation structure of Tokyo Japanese and its realization by Iranian learners of Japanese. Seven Iranian learners of Japanese, differing in fluency, and two Japanese speakers participated in the experiment. Two sentences were used to test the phonological and phonetic characteristics of lexical pitch-accent as well as the intonation patterns produced by the speakers. Both sentences consisted of similar words with the same number of syllables and lexical pitch-accents but different syntactic structure. Speakers were asked to read each sentence three times at normal speed, and the data were analyzed by Praat. The results show that lexical pitch-accent, Accentual Phrase (AP) and AP boundary tone realization vary depending on sentence type. For sentences of type XdeYwo, the lexical pitch-accent is realized properly. However, there is a rise in AP boundary tone regardless of speakers’ level of fluency. In contrast, in sentences of type XnoYwo, the lexical pitch-accent and AP boundary tone vary depending on the speakers’ fluency level. Advanced speakers are better at grouping words into phrases and produce more native-like intonation patterns, though they are not able to realize downstep properly. The non-native speakers tried to realize proper intonation patterns by making changes in lexical accent and boundary tone.
Keywords: intonation, Iranian learners, Japanese prosody, lexical accent, second language acquisition
Procedia PDF Downloads 169
169 Culture of Writing and Writing of Culture: Organizational Connections and Pedagogical Implications of ESL Writing in Multilingual Philippine Setting
Authors: Randy S. Magdaluyo, Lea M. Cabar, Jefferson Q. Correa
Abstract:
One recurring issue in ESL writing is the confusing differences in the writing conventions of the first language and the target language. Culture may play an intriguing role in specifying writing features and structures that ESL writers have to follow. Although writing is typically organized in a three-part structure with introduction, body, and conclusion, it is important to analyze the complex nature of ESL writing. This study investigated the organizational features and structures of argumentative essays written in English by thirty college ESL students from three linguistic backgrounds (Cebuano, Chavacano, and Tausug) in a Philippine university. The nature of word order and sentence construction in the students’ essays and the specific components of the introduction, body, and conclusion were quantitatively and qualitatively analyzed based on ESL writing models. Focus group discussions were also conducted to help clarify the possible influence of students’ first language on the ways their essays were conceptualized and organized. Results indicate that while there was no significant difference in the overall introduction, body, and conclusion across the essays, sentence length differed for each linguistic group of ESL students, and the word order was notably inconsistent with the S-V-O pattern of the target language. The first language was also revealed to have a facilitative role in the cognitive translation process of these ESL students. As such, implications for a multicultural writing pedagogy were discussed and recommended, considering both the students’ native resources in their first language and the ESL writing models in their target language.
Keywords: community funds of knowledge, contrastive rhetoric, ESL writing, multicultural writing pedagogy
Procedia PDF Downloads 134