Search results for: corpus of spoken Lithuanian
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 634

484 Phonology and Syntax of Article Incorporation in Mauritian Creole: Evidence from Bantou Languages

Authors: Emmanuel Nikiema

Abstract:

This paper examines article incorporation in Mauritian Creole, a French Lexifier Creole which exhibits three forms of article incorporation as illustrated in (1-3). While various analyses of article incorporation have been proposed in the literature, fewer studies have explored the motivation of this widespread phenomenon in Mauritian Creole (MC) as opposed to other French Lexifier Creoles spoken in the Caribbean. For example, Mauritian Creole exhibits 4 times more CV incorporation than Haitian Creole, and 40 times more than Reunion Creole. (1) Consonantal type (C): loraz ‘thunder storm’, lete ‘summer’, zwazo ‘bird’, nide ‘idea’. (2) Syllabic type (CV): lapo ‘skin’, liku ‘neck’, ledo ‘back’, leker ‘heart’, diber ‘butter’. (3) Bi-consonantal (CVC): delo ‘water’, dizef ‘egg’, lizye ‘eye’, dilwil ‘oil’. The goal of this study is twofold: 1) uncover the rules governing the three types of article incorporation in MC, and 2) account for its remarkable occurrence in MC as opposed to its quasi-absence in Reunion Creole. We have collected a corpus of over 700 cases and organized it into three categories (C; CV and CVC). For example, there are 471 examples of CV incorporation in MC against 112 in Haitian Creole and only 12 in Reunion Creole. Two questions can be raised: 1) what is the motivation and distribution of the three types of incorporation in MC, and 2) how can one account for the high volume of incorporation in MC as opposed to its quasi-absence in Reunion Creole? We suggest that article incorporation in MC is related to the structure of nouns in Bantou languages. While previous authors have largely used population settlement data in the colonies during the Creole formation period to justify their analyses, we propose an account based on the syntactic structure of Bantou nouns. 
This analysis will shed light on the contribution of African languages to the formation of MC and on why MC exhibits more cases of article incorporation than any other French Lexifier Creole.
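
The "4 times" and "40 times" comparisons in the abstract can be checked against the CV counts it reports (471, 112 and 12). The short sketch below only redoes that arithmetic; the dictionary keys and the helper name `ratio_to` are illustrative and not taken from the paper.

```python
# CV-type article incorporation counts as reported in the abstract.
cv_counts = {"Mauritian": 471, "Haitian": 112, "Reunion": 12}


def ratio_to(reference: str, other: str, counts: dict) -> float:
    """Return how many times larger the reference count is than the other count."""
    return counts[reference] / counts[other]


mc_vs_haitian = ratio_to("Mauritian", "Haitian", cv_counts)
mc_vs_reunion = ratio_to("Mauritian", "Reunion", cv_counts)

# Print the two ratios to two decimal places.
print(f"MC vs Haitian Creole: {mc_vs_haitian:.2f}x")
print(f"MC vs Reunion Creole: {mc_vs_reunion:.2f}x")
```

The ratios come out at roughly 4.2 and 39.25, which is consistent with the rounded figures quoted in the abstract.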

Keywords: article incorporation, creole languages, description, phonology

Procedia PDF Downloads 87
483 Named Entity Recognition System for Tigrinya Language

Authors: Sham Kidane, Fitsum Gaim, Ibrahim Abdella, Sirak Asmerom, Yoel Ghebrihiwot, Simon Mulugeta, Natnael Ambassager

Abstract:

The lack of annotated datasets is a bottleneck to the progress of NLP in low-resourced languages. The work presented here consists of large-scale annotated datasets and models for the named entity recognition (NER) system for the Tigrinya language. Our manually constructed corpus comprises over 340K words tagged for NER, with over 118K of the tokens also having parts-of-speech (POS) tags, annotated with 12 distinct classes of entities, represented using several types of tagging schemes. We conducted extensive experiments covering convolutional neural networks and transformer models; the highest performance achieved is 88.8% weighted F1-score. These results are especially noteworthy given the unique challenges posed by Tigrinya’s distinct grammatical structure and complex word morphologies. The system can be an essential building block for the advancement of NLP systems in Tigrinya and other related low-resourced languages and serve as a bridge for cross-referencing against higher-resourced languages.

Keywords: Tigrinya NER corpus, TiBERT, TiRoBERTa, BiLSTM-CRF

Procedia PDF Downloads 73
482 Adjectives in Academic Discourse: A Comparative Study of Research Articles

Authors: Beata Grymska

Abstract:

Research on academic discourse generally focuses on lexical bundles, epistemic modality markers, or interactions between writers and readers. Following the research into the written forms of the academic community, this study concentrates on adjectives in research articles. The study investigates the distribution of adjectives in research articles in two academic disciplines: linguistics and medicine. It is corpus-based in design and draws on 100 linguistic and 100 medical research articles, all written in English. The aim of the study is to compare the distribution of adjectives between the two corpora and across the four main parts of articles: IMRD (Introduction, Methods, Results, and Discussion). The second aim is to see whether the two corpora share common core adjectives, e.g., different, important, specific, and whether there are discipline-specific adjectives. A further part of the paper elaborates on adjective use in the corpora together with examples. The results indicate that the two corpora do not differ greatly in the distribution of adjectives. The occurrences of the most frequently used adjectives depend on the academic discipline of the research articles. The concluding part reflects upon the role of adjectives in academic discourse and also presents how corpora can be helpful in composing academic texts.

Keywords: academic discourse, academic texts, adjectives, corpus analysis, research articles

Procedia PDF Downloads 161
481 Concepts of Creation and Destruction as Cognitive Instruments in World View Study

Authors: Perizat Balkhimbekova

Abstract:

Evolutionary changes in the cognitive world view that have taken place in recent decades are followed by changes in the perception of key concepts related to a given lingua-cultural sphere. Such concepts also reflect a person’s attitude to essential processes in the sphere of concepts, e.g., opposite operations such as creation and destruction. These changes in people’s lives and thinking are displayed in a language world view. In order to reveal the content of mental structures and concepts, we should use language means as observable results of people’s cognitive activity. The semantics of words, free phrases and idioms should be considered an authoritative source of information concerning concepts. The regularized set of concepts in people’s consciousness forms the sphere of concepts. Cognitive linguistics widely discusses the sphere of concepts as its crucial category, defining it as the field of knowledge which is made of concepts. It is considered that a sphere of concepts comprises various types of association and forms conceptual fields. As material for the given research, data from the Russian National Corpus and the British National Corpus were used. It is necessary to point out that data provided by computational studies are intrinsic and verifiable, so we have used them in order to get reliable results. The study procedure was based on such techniques as extracting contexts containing the concepts of creation/destruction from the Russian National Corpus (RNC) and the British National Corpus (BNC); analyzing and interpreting those contexts on the basis of a cognitive approach; and finding correspondences between the given concepts in the Russian and English world views. The key problem of our study is to find the correspondence between the elements of world view represented by opposite concepts such as creation and destruction. Findings: The concept of "destruction" indicates a process which leads to full or partial destruction of an object. 
In other words, it is a loss of the object's primary essence: its structures, properties, distinctive signs and initial integrity. The concept of "creation", on the contrary, comprises positive characteristics and represents activity aimed at the improvement of a certain object and at the creation of ideal models of the world. On the other hand, destruction is represented much more widely in the RNC than creation (1254 cases of the first concept compared to 192 cases of the second). Our hypothesis consists in the antinomy represented by the aforementioned concepts. Being opposite in respect of semantics and pragmatics, and from the point of view of axiology, they are at the same time complementary and interrelated concepts.

Keywords: creation, destruction, concept, world view

Procedia PDF Downloads 322
480 'Wandering Uterus': An Analogy of Perception of Women in Hippocratic Corpus and Post-Modern Times

Authors: Ankita Sharma

Abstract:

The study proposes to review the perception of women in the Classical Age (500-336 BC), when Greek philosophy was in bloom. It was observed that women had very few rights and were under the control of men. One of the possible reasons for this exclusion was woman’s biology, which had a huge influence on her being seen as inferior to men. The ‘Hippocratic Corpus’ focuses on the biological construct of the female body in classical Greek science, which perpetuated the idea of women as second-class citizens who were considered inherently weaker than men. The research highlights the significance of the text, which was used to encourage women of that time to marry and bear children, and how the perception remains the same today. The Greek belief in the need to confine and control the 'wandering uterus' has led to men being regarded as superior. The emphasis of this research is on women and their bodies, depicted in a misogynistic way in the writings through which Hippocratic writers influenced society’s attitude towards women. It is intended to draw attention to the prevailing cultural assumptions and preconceived notions about female anatomy that had a pervasive influence in the following centuries, with their roots in ancient science.

Keywords: classical Greek theory, women, wandering womb, modern ideology

Procedia PDF Downloads 169
479 A Corpus-Based Contrastive Analysis of Directive Speech Act Verbs in English and Chinese Legal Texts

Authors: Wujian Han

Abstract:

In the process of human interaction and communication, speech act verbs are considered to be the most active component and the main means of information transmission, and are also taken as an indication of the structure of linguistic behavior. The theoretical value and practical significance of such everyday built-in metalanguage have long been recognized. This paper, which is part of a larger study, aims to provide useful insights for a more precise and systematic approach to the translation of speech act verbs between English and Chinese, especially with regard to the degree to which generic integrity is maintained in the translation of legal documents. In this study, the corpus, i.e. Chinese legal texts and their English translations, English legal texts, ordinary Chinese texts, and ordinary English texts, serves as a testing ground for examining contrastively the usage of English and Chinese directive speech act verbs in the legal genre. The scope of this paper is relatively wide and essentially covers all directive speech act verbs used in ordinary English and Chinese, such as order, command, request, prohibit, threaten, advise, warn and permit. The researcher, by combining the corpus methodology with a contrastive perspective, explored a range of characteristics of English and Chinese directive speech act verbs, including their semantic, syntactic and pragmatic features, and then contrasted them in a structured way. It has been found that there are similarities between English and Chinese directive speech act verbs in the legal genre, such as similar semantic components between English speech act verbs and their translation equivalents in Chinese, and formal and accurate usage of English and Chinese directive speech act verbs in legal contexts. But notable differences have been identified between their usage in the original Chinese and English legal texts, such as in valency patterns and frequency of occurrence. 
For example, the subjects of some directive speech act verbs are very frequently omitted in Chinese legal texts, but this is not the case in English legal texts. One of the practicable methods to achieve adequacy and conciseness in speech act verb translation from Chinese into English in legal genre is to repeat the subjects or the message with discrepancy, and vice versa. In addition, translation effects such as overuse and underuse of certain directive speech act verbs are also found in the translated English texts compared to the original English texts. Legal texts constitute a particularly valuable material for speech act verb study. Building up such a contrastive picture of the Chinese and English speech act verbs in legal language would yield results of value and interest to legal translators and students of language for legal purposes and have practical application to legal translation between English and Chinese.

Keywords: contrastive analysis, corpus-based, directive speech act verbs, legal texts, translation between English and Chinese

Procedia PDF Downloads 455
478 English Language Teaching Graduate Students' Use of Discussion Moves in Research Articles

Authors: Gamzegul Koca, Evrim Eveyik-Aydin

Abstract:

Genre and discipline-specific knowledge of academic discourse in writing has long been acknowledged as a core skill for achieving the formidable tasks expected of graduate students in academic settings. Genre analysis approaches can be adopted to unveil the challenges encountered in these tasks so that instructional actions can address the aspects of graduate writing that need improvement. In an attempt to find the genre-specific academic writing needs of Turkish students enrolled in a graduate program in ELT, this study examines the rhetorical structure of discussion sections of research articles written during the course load stage of their graduate studies. The 35,437-word specialized corpus of graduate papers compiled for the purpose of the study includes discussions of 58 unpublished reports of empirical studies, 31 written in MA courses and 27 in Ph.D. courses by a total of 44 graduate students. The study conducts a sentence-based move structure analysis using the framework developed by Eveyik-Aydın, Karabacak and Akyel in a corpus-based study that analyzed the discussion moves of expert writers in published articles in ELT journals indexed by Social Sciences Citation. The coding of 1577 sentences by three graders using this framework revealed that while the graduate papers included the same moves used in published articles, the rhetorical structure of MA and Ph.D. papers showed considerable differences in terms of the frequency of occurrence of main discussion moves, including interpretation of the results and drawing implications. The implications of these findings will be discussed with respect to the needs of graduate writers and the expectations of the discourse community.

Keywords: discussion moves, genre-specific rhetorical structure, move analysis, research articles, the specialized corpus of graduate papers

Procedia PDF Downloads 148
477 Agenesis of the Corpus Callosum: The Role of Neuropsychological Assessment with Implications to Psychosocial Rehabilitation

Authors: Ron Dick, P. S. D. V. Prasadarao, Glenn Coltman

Abstract:

Agenesis of the corpus callosum (ACC) is a failure to develop the corpus callosum, the large bundle of fibers that connects the two cerebral hemispheres of the brain. It can occur as a partial or complete absence of the corpus callosum. In the general population, its estimated prevalence is 1 in 4000, and a wide range of genetic, infectious, vascular, and toxic causes have been attributed to this heterogeneous condition. The diagnosis of ACC is often achieved by neuroimaging procedures. Though persons with ACC can perform normally on intelligence tests, they generally present with a range of neuropsychological and social deficits. The deficit profile is characterized by poor coordination of motor movements, slow reaction time and processing speed, and poor memory. Socially, they present with deficits in communication, language processing, theory of mind, and interpersonal relationships. The present paper illustrates the role of neuropsychological assessment, with implications for psychosocial management, in a case of agenesis of the corpus callosum. Method: A 27-year-old left-handed Caucasian male with a history of ACC was self-referred for a neuropsychological assessment to assist him in his employment options. His parents noted significant difficulties with coordination and balance at the early age of 2-3 years, and he was diagnosed with dyspraxia at the age of 14 years. History also indicated visual impairment, hypotonia, poor muscle coordination, and delayed development of motor milestones. An MRI scan indicated agenesis of the corpus callosum with ventricular morphology, widely spaced parallel lateral ventricles and mild dilatation of the posterior horns; it also showed colpocephaly, a disproportionate enlargement of the occipital horns of the lateral ventricles, which might be affecting his motor abilities and visual defects. The MRI scan ruled out other structural abnormalities or neonatal brain injury. 
At the time of assessment, the subject presented with such problems as poor coordination, slowed processing speed, poor organizational skills and time management, and difficulty with social cues and facial expressions. A comprehensive neuropsychological assessment was planned and conducted to assist in identifying the current neuropsychological profile to facilitate the formulation of a psychosocial and occupational rehabilitation programme. Results: General intellectual functioning was within the average range and his performance on memory-related tasks was adequate. Significant visuospatial and visuoconstructional deficits were evident across tests; constructional difficulties were seen in tasks such as copying a complex figure, building a tower and manipulating blocks. Poor visual scanning ability and visual motor speed were evident. Socially, the subject reported heightened social anxiety, difficulty in responding to cues in the social environment, and difficulty in developing intimate relationships. Conclusion: Persons with ACC are known to present with specific cognitive deficits and problems in social situations. Findings from the current neuropsychological assessment indicated significant visuospatial difficulties, poor visual scanning and problems in social interactions. His general intellectual functioning was within the average range. Based on the findings from the comprehensive neuropsychological assessment, a structured psychosocial rehabilitation programme was developed and recommended.

Keywords: agenesis, callosum, corpus, neuropsychology, psychosocial, rehabilitation

Procedia PDF Downloads 260
476 Changes in Chromatographically Assessed Fatty Acid Profile during Technology of Dairy Products

Authors: Lina Lauciene, Vaida Andruleviciute, Ingrida Sinkeviciene, Mindaugas Malakauskas, Loreta Serniene

Abstract:

Dairy product manufacturers are constantly looking for new markets for their products. In most cases, the problem of product compliance with the composition requirements of foreign products is highlighted. This is especially true of the composition of milk fat in dairy products. It is well known that many factors, such as feeding ratio, season, cow breed, and stage of lactation, affect the fatty acid composition of milk. However, there is less evidence on the impact of the technological process on the composition of fatty acids in raw milk and products made from it. In this study, the influence of the technological process on fat composition in 82% fat butter, 15% fat curd, 3.6% fat yogurt and 2.5% fat UHT milk was determined. The samples were collected at each stage of production, starting with raw milk and ending with the final product, in a Lithuanian milk-processing company. Fatty acid methyl esters were quantified using a GC (Clarus 680, Perkin Elmer) equipped with a flame ionization detector (FID) and an SP-2560 capillary column, 100 m x 0.25 mm id x 0.20 µm. Fatty acid peaks were identified using Supelco® 37 Component FAME Mix. The concentration of each fatty acid was expressed as a percentage of the total fatty acid amount. In the case of UHT milk production, raw milk, cream, milk mixture, and UHT milk were compared, but no significant differences were found between these stages. In the analysis of the stages of yogurt production (raw milk, pasteurized milk, milk with a starter culture, and yogurt), no significant changes were detected between stages either. A slight difference was observed for C4:0: the percentage of this fatty acid was lower (p=0.053) in the final stage than in milk with the starter culture. During butter production, the composition of fatty acids in raw cream, buttermilk, and butter did not change significantly. Only C14:0 decreased in the butter when compared to buttermilk. 
The curd fatty acid analysis showed an increase of C6:0, C8:0, C10:0, C11:0, C12:0, C14:0 and C17:0 at the final stage when compared to raw milk, cream, milk mixture, and whey. Meanwhile, an increase of C18:1n9c (in comparison with milk mixture and curd) and C18:2n6c (in comparison with raw milk, milk mixture, and curd) was found in cream. The results of this study suggest that the technological process did not affect the composition of fatty acids in UHT milk, yogurt, butter, and curd but had an impact on the concentration of individual fatty acids. In general, all of the fatty acids from the raw milk were carried over into the final product; only some of them changed slightly in concentration. Therefore, in order to ensure an appropriate composition of certain fatty acids in the final product, producers must carefully choose the raw milk. Acknowledgment: This research was funded by the Lithuanian Ministry of Agriculture (No. MT-17-13).
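
The abstract states that each fatty acid is expressed as a percentage of the total fatty acid amount. A minimal sketch of that normalization follows, assuming the input is one integrated peak area per fatty acid; the abstract does not say how the authors computed it, and the function name `percent_of_total` and the example areas are invented for illustration only.

```python
def percent_of_total(peak_areas: dict) -> dict:
    """Express each fatty acid as a percentage of the summed peak areas."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical peak areas (arbitrary units); NOT data from the study.
example_areas = {"C4:0": 3.0, "C14:0": 11.0, "C16:0": 30.0, "C18:1n9c": 56.0}

profile = percent_of_total(example_areas)
for name, pct in profile.items():
    print(f"{name}: {pct:.1f}% of total fatty acids")
```

Because every value is divided by the same total, the resulting percentages always sum to 100, which is what makes profiles from different production stages directly comparable.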

Keywords: dairy products, fat composition, fatty acids, technological process

Procedia PDF Downloads 145
475 The Effects of Culture and Language on Social Impression Formation from Voice Pleasantness: A Study with French and Iranian People

Authors: L. Bruckert, A. Mansourzadeh

Abstract:

The voice has a major influence on interpersonal communication in everyday life via the perception of pleasantness. The evolutionary perspective postulates that the mechanisms underlying pleasantness judgments are universal adaptations that have evolved in the service of choosing a mate (through the process of sexual selection). From this point of view, the favorite voices would be those with more marked sexually dimorphic characteristics, for example, lower-pitched voices in men, with pitch as the main criterion. On the other hand, one can postulate that the mechanisms involved are gradually established from childhood through exposure to the environment, and thus the prosodic elements could take precedence in everyday communication, as they convey information about the speaker's attitude (willingness to communicate, interest toward the interlocutors). Our study focuses on voice pleasantness and its relationship with social impression formation, exploring both the spectral aspects (pitch, timbre) and the prosodic ones. In our study, we recorded the voices of 25 French males speaking French and 25 Iranian males speaking Farsi, using two vocal corpora (five vowels and a reading text). French listeners (40 male/40 female) listened to the French voices and made a judgment either on the voice's pleasantness or on the speaker (judgment about his intelligence, honesty, sociability). The regression analyses from our acoustic measures showed that the prosodic elements (for example, the intonation and the speech rate) are the most important criteria concerning pleasantness, whatever the corpus or the listener's gender. Moreover, the correlation analyses showed that the speakers with the voices judged as the most pleasant are considered the most intelligent, sociable, and honest. 
The voices in Farsi were judged by 80 other French listeners (40 male/40 female), and we found the same effect of intonation on the judgment of pleasantness with the corpus «vowel», whereas with the corpus «text» pitch is more important than prosody. This may suggest that voice perception contains some elements invariant across culture/language, whereas others are influenced by the cultural/linguistic background of the listener. In the near future, Iranian listeners will be asked to listen either to the French voices (half of them) or to the Farsi voices (the other half) and to produce the same judgments as the French listeners. This experimental design could potentially make it possible to distinguish what is linked to culture and what is linked to language in the case of differences in voice perception.

Keywords: cross-cultural psychology, impression formation, pleasantness, voice perception

Procedia PDF Downloads 45
474 Linguistic Cyberbullying, a Legislative Approach

Authors: Simona Maria Ignat

Abstract:

Online bullying has been an increasingly studied topic in recent years. Different approaches, psychological, linguistic, or computational, have been applied. To the best of our knowledge, an internationally agreed definition and set of characteristics of the phenomenon, serving as a common framework, are still lacking. Thus, the objectives of this paper are the identification of bullying utterances on Twitter and of their algorithms. This research paper is focused on the identification of words or groups of words, categorized as “utterances”, with bullying effect, from the Twitter platform, extracted on a set of legislative criteria. This set is the result of analysis followed by synthesis of law documents on (online) bullying from the United States of America, the European Union, and Ireland. The outcome is a linguistic corpus with approximately 10,000 entries. The methods applied to the first objective were the following. Discourse analysis was applied in the identification of keywords with bullying effect in texts from the Google search engine, Images link. Transcription and anonymization were applied to texts grouped in CL1 (Corpus linguistics 1). The keyword search method and the legislative criteria were used for identifying bullying utterances from Twitter. The texts with at least 30 representations on Twitter were grouped. They form the second linguistic corpus, Bullying utterances from Twitter (CL2). The entries were identified by using the legislative criteria on the BoW method principle. The BoW is a method of extracting words or groups of words with the same meaning in any context. The method applied for reaching the second objective is the conversion of parts of speech to alphabetical and numerical symbols and the writing of the bullying utterances as algorithms. The converted form of parts of speech was chosen on the criterion of relevance within the bullying message. 
The inductive reasoning approach has been applied in sampling and identifying the algorithms. The results are groups with interchangeable elements. The outcomes convey two aspects of bullying: the form and the content or meaning. The form conveys the intentional intimidation against somebody, expressed at the level of texts by grammatical and lexical marks. This outcome has applicability in forensic linguistics for establishing the intentionality of an action. Another outcome of form is a complex of graphemic variations essential in detecting harmful texts online. This research enriches the lexicon already known on the topic. The second aspect, the content, revealed topics like threat, harassment, assault, or suicide. They are subcategories of a broader harmful content which is a constant concern for task forces and legislators at national and international levels. These topics, outcomes of the dataset, are a valuable source for detection. The analysis of content revealed algorithms and lexicons which could be applied to other harmful content. A third outcome of content is the stylistic conveyances, which are a rich source for discourse analysis of social media platforms. In conclusion, this linguistic corpus is structured on legislative criteria and could be used in various fields.
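
The keyword search with a threshold of at least 30 representations could be sketched as below. This is one reading of the abstract, not the author's implementation: it assumes "representations" means the number of texts containing a candidate phrase, and the function name `frequent_utterances`, the `min_count` parameter and the placeholder phrases are all hypothetical.

```python
from collections import Counter


def frequent_utterances(texts, candidates, min_count=30):
    """Keep candidate phrases that occur in at least min_count texts.

    Matching is a simple case-insensitive substring test, which is an
    assumption of this sketch rather than the paper's stated method.
    """
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        for phrase in candidates:
            if phrase in lowered:
                counts[phrase] += 1
    return {phrase: n for phrase, n in counts.items() if n >= min_count}


# Neutral placeholder data; no real tweets or corpus entries are used.
texts = ["Example phrase A here"] * 30 + ["example phrase b"] * 29
kept = frequent_utterances(texts, ["example phrase a", "example phrase b"])
print(kept)
```

With the placeholder data, only the first phrase reaches the threshold, so it alone would enter the second corpus under this reading.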

Keywords: corpus linguistics, cyberbullying, legislation, natural language processing, twitter

Procedia PDF Downloads 57
473 Identification of Information War in Lithuania

Authors: Vitalijus Leibenka

Abstract:

Since Russia’s annexation of Crimea in 2014, the world has seen a hybrid war that has helped Russia achieve its goals. The world and NATO nations have pointed out that hybrid action can help achieve not only military but also economic and political goals. Information warfare tools are among the weapons of hybrid warfare, and their use helps to carry out actions in the context of hybrid warfare as a whole. In addition, information war tools can be used on their own, over time and for long-term purposes. Although forms of information war, such as propaganda and disinformation, have been used in the past, in old conflicts and wars, new forms of information war have emerged as a result of technological development, making the dissemination of information faster and more efficient. The world understands that information is becoming a weapon, but not everyone understands that information war and information warfare differ in their essence and full content. In addition, the damage and impact of the use of information war, which may have worse consequences than a brief military conflict, are underestimated. Lithuania is also facing various interpretations of the information war. Some believe that an information attack is an information war, and their understanding of the information war is limited to a false message in the press. Others, however, go deeper and explain the essence of the information war. Society has formed in such a way that not all people are able to assess the threats of information war or to separate information war from information attack. Recently, the Lithuanian government has been taking measures in the context of the information war, making decisions that allow the development of the activities of the state and state institutions in order to create defense mechanisms in the information war. However, this is happening rather slowly and incompletely. 
Every military conflict related to Lithuania in one way or another forces Lithuanian politicians to take up the theme of information warfare again. As a result, a national cyber security center is being set up, and Russian channels spreading lies are banned. However, there is no consistent development and continuous improvement of action against information threats. Although a sufficiently influential part of society (not a political part) helps to stop the spread of obscure information by creating social projects such as “Demaskuok” and “Laikykis ten su Andriumi Tapinu”, it goes without saying that this will not become a key tool in the fight against information threats. Therefore, in order to achieve clean dissemination of information in Lithuania, full-fledged and substantial political decisions are necessary, the adoption of which would change the public perception of the information war, its damage and impact, and of the actions that would make it possible to combat its spread. Political decisions should cover the educational, military, economic and political areas, which are among the main and most important in the state; this would make it possible to fundamentally change the situation against the background of information war.

Keywords: information war, information warfare, hybrid war, hybrid warfare, NATO, Lithuania, Russia

Procedia PDF Downloads 39
472 Grammatical and Lexical Explorations on ‘Outer Circle’ Englishes and ‘Expanding Circle’ Englishes: A Corpus-Based Comparative Analysis

Authors: Orlyn Joyce D. Esquivel

Abstract:

This study analyzed 50 selected research papers from professional language and linguistic academic journals to portray the differences between Kachru’s (1994) outer circle and expanding circle Englishes. The selected outer circle Englishes include those of Bangladesh, Malaysia, the Philippines, India, and Singapore; and the selected expanding circle Englishes are those of China, Indonesia, Japan, Korea, and Thailand. The researcher built ten corpora (five research papers for each corpus) to represent each variety of Englishes. The corpora were examined under grammatical and lexical features using Modified English TreeTagger in Sketch Engine. Results revealed the distinct grammatical and lexical features through the table and textual analyses, illustrated from the most to least dominant linguistic elements. In addition, comparative analyses were done to distinguish the features of each of the selected Englishes. The Language Change Theory was used as a basis in the discussion. Hence, the findings suggest that the ‘outer circle’ Englishes and ‘expanding circle’ Englishes will continue to drift from International English.

Keywords: applied linguistics, English as a global language, expanding circle Englishes, global Englishes, outer circle Englishes

Procedia PDF Downloads 129
471 Reflection of Landscape Agrogenization in the Soil Cover Structure and Profile Morphology: Example of Lithuania Agroecosystem

Authors: Jonas Volungevicius, Kristina Amaleviciute, Rimantas Vaisvalavicius, Alvyra Slepetiene, Darijus Veteikis

Abstract:

Lithuanian territory is characterized by a landscape with prevailing moraine hills and clayey lowlands. The largest part of it has endured agrogenization of various degrees, which has caused changes in the structure of both landscape and soil cover, transformations of the soil profile, and degradation of natural background soils. These changes negatively influence the geoecological potential of landscape and soil and contribute to the weakening of the sustainability of agroecosystems. Research objective: to reveal the landscape agrogenization induced alterations of catenae and their appendant soil profiles in Lithuanian moraine hills and clayey lowlands. Methods: Soil cover analysis and catenae charting were conducted using landscape profiling; soil morphology was described and soil type identified following WRB 2014. Granulometric composition of soil profiles was obtained by the laser diffraction method (laser diffractometer Mastersizer 2000). pH was measured in H2O extraction using potentiometric titration; SOC was determined by the Tyurin method modified by Nikitin, measuring with a Cary 50 spectrometer (VARIAN) at 590 nm wavelength using glucose standards. Results: analysis showed that the decrease of forest vegetation and the other natural landscape components following the agrogenization of the research area influenced differently but significantly the structural alterations in soil cover and vertical soil profile. The research detected that due to landscape agrogenization, the suppression of zone-specific processes and the intensification of inter-zone processes determined by agrogenic factors take place in Lithuanian agroecosystems. In forested hills, the historically prevailing territorial complex of Retisols and Histosols is transforming into the territorial complex of Regosols, Deluvial soils and drained Histosols. 
The processes taking place are simplification of the vertical profile structure, intensive rejuvenation of the profile, and disappearance of the features of zone-specific soil-forming processes (podzolization, lessivage, gley formation). Erosion and deluvial processes manifest more intensively, and weakly accumulating organic material spreads more intensively through the vertical soil profile. The territorial soil complex of Gleyic Luvisols and Gleysols dominating in forested clayey lowlands subjected to agrogenization is transformed into the catena of drained Luvisols and pseudo Cambisols. The best expressed are their changes in moisture regime (morphological features of gley and stagnic properties are on the decline) together with alterations of pH and of the distribution and intensity of accumulation of organic matter in the profile. A specific horizon, anthraquic, uncharacteristic of natural soil formation, is appearing. It is important to note that due to deep ploughing and other agrotechnical measures, the natural vertical differentiation of clay particles in a soil profile is destroyed, which not only leads to alterations of the physical qualities of soil but also hampers the identification of Luvisols, creating the conditions to misidentify them as Cambisols. The latter have never developed in these ecosystems under the present climatic conditions. Acknowledgements: This work was supported by the National Science Program: The effect of long-term, different-intensity management of resources on the soils of different genesis and on other components of the agro-ecosystems [grant number SIT-9/2015] funded by the Research Council of Lithuania.

Keywords: agroecosystems, landscape agrogenization, luvisols, retisols, transformation of soil profile

Procedia PDF Downloads 234
470 Affects Associations Analysis in Emergency Situations

Authors: Joanna Grzybowska, Magdalena Igras, Mariusz Ziółko

Abstract:

Association rule learning is an approach for discovering interesting relationships in large databases. The analysis of relations, invisible at first glance, is a source of new knowledge which can be subsequently used for prediction. We used this data mining technique (which is an automatic and objective method) to learn about interesting affects associations in a corpus of emergency phone calls. We also made an attempt to match revealed rules with their possible situational context. The corpus was collected and subjectively annotated by two researchers. Each of 3306 recordings contains information on emotion: (1) type (sadness, weariness, anxiety, surprise, stress, anger, frustration, calm, relief, compassion, contentment, amusement, joy) (2) valence (negative, neutral, or positive) (3) intensity (low, typical, alternating, high). Also, additional information, that is a clue to speaker’s emotional state, was annotated: speech rate (slow, normal, fast), characteristic vocabulary (filled pauses, repeated words) and conversation style (normal, chaotic). Exponentially many rules can be extracted from a set of items (an item is a previously annotated single information). To generate the rules in the form of an implication X → Y (where X and Y are frequent k-itemsets) the Apriori algorithm was used - it avoids performing needless computations. Then, two basic measures (Support and Confidence) and several additional symmetric and asymmetric objective measures (e.g. Laplace, Conviction, Interest Factor, Cosine, correlation coefficient) were calculated for each rule. Each applied interestingness measure revealed different rules - we selected some top rules for each measure. Owing to the specificity of the corpus (emergency situations), most of the strong rules contain only negative emotions. There are though strong rules including neutral or even positive emotions. 
Three examples of the strongest rules are: {sadness} → {anxiety}; {sadness, weariness, stress, frustration} → {anger}; {compassion} → {sadness}. Association rule learning revealed the strongest configurations of affects (as well as configurations of affects with affect-related information) in our emergency phone calls corpus. The acquired knowledge can be used for prediction, to complete the emotional profile of a new caller. Furthermore, an analysis of the possible context related to a rule may offer a clue to the situation a caller is in.
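
The support and confidence measures named in this abstract, together with the level-wise search that Apriori performs, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy annotations in `calls`, and the `min_support` threshold are all hypothetical.

```python
from itertools import combinations


def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base if base else 0.0


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search for frequent itemsets.

    A (k+1)-candidate is kept only if all of its k-subsets are already
    frequent, which is what spares the needless support computations.
    """
    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items
               if support(transactions, [i]) >= min_support]
    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(transactions, s)
        # Join step: merge frequent k-itemsets into (k+1)-candidates.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k + 1}
        # Prune step, then count support only for surviving candidates.
        current = [c for c in candidates
                   if all(frozenset(sub) in frequent
                          for sub in combinations(c, k))
                   and support(transactions, c) >= min_support]
        k += 1
    return frequent


# Hypothetical annotations: one set of affect labels per recording.
calls = [
    frozenset({"sadness", "anxiety"}),
    frozenset({"sadness", "anxiety", "stress"}),
    frozenset({"anger", "stress"}),
    frozenset({"sadness", "anxiety"}),
    frozenset({"calm"}),
]

if __name__ == "__main__":
    print(confidence(calls, {"sadness"}, {"anxiety"}))
    print(frequent_itemsets(calls, min_support=0.4))
```

The other interestingness measures listed in the abstract (Laplace, Conviction, Cosine and so on) are all functions of the same support counts, so they could be added as further small functions over `support`.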

Keywords: data mining, emergency phone calls, emotional profiles, rules

Procedia PDF Downloads 387
469 The Contribution of Corpora to the Investigation of Cross-Linguistic Equivalence in Phraseology: A Contrastive Analysis of Russian and Italian Idioms

Authors: Federica Floridi

Abstract:

The long tradition of contrastive idiom research has essentially focused on three domains: the comparison of structural types of idioms (e.g. verbal idioms, idioms with noun-phrase structure, etc.), the description of idioms belonging to the same thematic groups (Sachgruppen), and the identification of different types of cross-linguistic equivalents (i.e. full equivalents, partial equivalents, phraseological parallels, non-equivalents). The diastratic, diachronic and diatopic aspects of the compared idioms, as well as their syntactic, pragmatic and semantic properties, have been largely ignored. Corpora (both monolingual and parallel) give the opportunity to investigate the actual use of correlating idioms in authentic texts of L1 and L2. Adopting the corpus-based approach, it is possible to draw attention to the frequency of occurrence of idioms, their syntactic embedding, their potential syntactic transformations (e.g., nominalization, passivization, relativization, etc.), their combinatorial possibilities, the variations of their lexical structure, and their connotations in terms of stylistic markedness or register. This paper aims to present the results of a contrastive analysis of Russian and Italian idioms referring to the concepts of ‘beginning’ and ‘end’, which was carried out using the Russian National Corpus and the ‘La Repubblica’ corpus. Beyond the digital corpora, bilingual dictionaries, like Skvorcova - Majzel’, Dobrovol’skaja, Kovalev, Čerdanceva, as well as monolingual resources, have been consulted. The study has shown that many of the idioms that have been traditionally indicated as cross-linguistic equivalents in bilingual dictionaries cannot be considered correspondents. The findings demonstrate that even those idioms that are formally identical in Russian and Italian and are presumably derived from the same source (e.g., conceptual metaphor, Bible, classical mythology, World literature) exhibit differences regarding usage. 
The ultimate purpose of this article is to highlight that it is necessary to review and improve the existing bilingual dictionaries considering the empirical data collected in corpora. The materials gathered in this research can contribute to this end.

Keywords: corpora, cross-linguistic equivalence, idioms, Italian, Russian

Procedia PDF Downloads 119
468 Compensatory Articulation of Pressure Consonants in Telugu Cleft Palate Speech: A Spectrographic Analysis

Authors: Indira Kothalanka

Abstract:

For individuals born with a cleft palate (CP), there is no separation between the nasal cavity and the oral cavity, due to which they cannot build up enough air pressure in the mouth for speech. Therefore, it is common for them to have speech problems. Common cleft type speech errors include abnormal articulation (compensatory or obligatory) and abnormal resonance (hyper, hypo and mixed nasality). These are generally resolved after palate repair. However, in some individuals, articulation problems persist even after the palate repair. Such individuals develop variant articulations in an attempt to compensate for the inability to produce the target phonemes. A spectrographic analysis is used to investigate the compensatory articulatory behaviours of pressure consonants in the speech of 10 Telugu speaking individuals aged 7 to 17 years with a history of cleft palate. Telugu is a Dravidian language which is spoken in Andhra Pradesh and Telangana states in India. It is the language with the third largest number of native speakers in India and the most spoken Dravidian language. The speech of the informants is analysed using a single-word list, sentences, a passage and conversation. Spectrographic analysis is carried out using PRAAT, a speech analysis software package. The place and manner of articulation of consonant sounds are studied through spectrograms with the help of various acoustic cues. The types of compensatory articulation identified are glottal stops, palatal stops, uvular, velar stops and nasal fricatives which are non-native in Telugu.

Keywords: cleft palate, compensatory articulation, spectrographic analysis, PRAAT

Procedia PDF Downloads 421
467 Multilingualism and the Creation of New Languages: The Case of Camfranglais Spoken in Italy and Germany

Authors: Jocelyne Kenne Kenne

Abstract:

Previous works in the field of sociolinguistics have explored the various outcomes of linguistic pluralism. One of these outcomes is the creation of new languages. The presentation will focus on one such language, Camfranglais, a hybrid language spoken by Cameroonians. It appeared in the 1970s in the francophone area in Cameroon and developed as a result of interactions between French, English, Cameroonian Pidgin English and local Cameroonian languages, all languages spoken in Cameroon. With the migration of Cameroonians to Europe, research has been conducted to analyze the sociolinguistic profile of Cameroonians in their new environment. The emphasis of this presentation will be on two recent studies that have been conducted to analyze the peculiarity of Camfranglais in two European countries: Germany and Italy. The research involved 59 Cameroonians living in Italy and 49 Cameroonians residing in Germany. The respondents were composed of participants from different linguistic backgrounds, students and workers, married and single. A combination of quantitative and qualitative research methods was employed. The field study was divided into three parts. The first part was focused on observing the Cameroonians interact in different places such as in canteens, in the university halls of residence, lecture theatres, at home, and at various Cameroonian meetings. Those observations were accompanied by audio-recordings of the various interactions. The aim was to study communication between Cameroonians to see whether they use Camfranglais or not; if yes, in which domains and what the speakers’ linguistic profiles were. Additionally, questionnaires of different lengths were used to collect biographical information concerning the participants and their sociolinguistic profile and finally, in-depth interviews with Cameroonians were conducted to inquire about the use, the functions and the importance of this language in the migratory context. 
The results of the research demonstrate how the widespread use of Camfranglais by Cameroonians in Germany and Italy reveals a longing for home on the one hand and a sign of belonging on the other. It also shows the differences that exist between the profiles of Camfranglais speakers in Europe and the speakers in Cameroon, notably in terms of age and social class. Finally, it points out some differences in the use, the structure and the functions of this hybrid language in the migratory setting. This study is a contribution to existing research in the field of contact languages and can serve as a comparison for other situations of multilingualism and the creation of mixed languages. Furthermore, with globalization, the study of migrant languages and the contact of these languages with new languages are topics that might be productive for further research in the field of sociolinguistics.

Keywords: interaction, migrants language, multilingualism, mixed languages

Procedia PDF Downloads 190
466 Retrospective Insight on the Changing Status of the Romanian Language Spoken in the Republic of Moldova

Authors: Gina Aurora Necula

Abstract:

From its transformation into a taboo and its hiding under the so-called “Moldovan language” or under the euphemistic expression “state language” to its regained recognition as an official language, the Romanian language spoken in the Republic of Moldova has undergone impressive reforms in the last 60 years. Meant to erase the awareness of citizens’ ethnic identity and turn a majority language into a minority one, all the laws and regulations issued in the field succeeded in setting numerous barriers for speakers of Romanian. Either manifested as social constraints or materialized into assumed rejection of mother tongue usage, all these laws have demonstrated their usefulness and major impact on the Romanian-speaking population. This article is the result of our research carried out over 10 years with the support of students, and Moldovan citizens, from the master's degree program "Romanian language - identity and cultural awareness." We present here a retrospective insight into the reforms, laws, and regulations that contributed to the shift in the status of the Romanian language from the official language, seen as the language of common use both in the public and private spheres, to a minority language that surrendered its privileged place to the Russian language, firstly in the public sphere, and then, slowly but surely, in the private sphere. Our main goal here is to identify and make speakers understand what the barriers to learning the Romanian language are nowadays, when the social pressure to use Russian no longer exists.

Keywords: linguistic barriers, lingua franca, private sphere, public sphere, reformation

Procedia PDF Downloads 88
465 Investigating Translations of Websites of Pakistani Public Offices

Authors: Sufia Maroof

Abstract:

This empirical study investigated the web-translations of five Pakistani public offices (FPSC, FIA, HEC, USB, and Ministry of Finance) offering an Urdu tab as an option to access information on their official websites. Triangulation of quantitative and qualitative research design informed the researcher of the semantic, lexical and syntactic caveats in these translations. The study hypothesized that the majority of the Pakistani population is oblivious of the Supreme Court’s amendments in language policy concerning national and official language; hence, Urdu web-translations of the public departments have not been accessed effectively. Firstly, the researcher conducted an online survey comprising two sections: close-ended and short-answer questions. Secondly, the researcher compiled a corpus of the five selected websites in a tabular form to compare the data. Thirdly, the administrators of the departments were contacted regarding the methods of translation and the expertise of the personnel involved. The corpus was assessed for TQA after examining the lexical, semantic, syntactical and technical alignment inaccuracies and imperfections. The study suggests that the public offices invest in their Urdu websites by either hiring expert translators or engaging the expertise of a translation agency for this project to offer quality translation to the public.

Keywords: machine translations, public offices, Urdu translations, websites

Procedia PDF Downloads 100
464 Diagnosis of Alzheimer Diseases in Early Step Using Support Vector Machine (SVM)

Authors: Amira Ben Rabeh, Faouzi Benzarti, Hamid Amiri, Mouna Bouaziz

Abstract:

Alzheimer’s is a disease that affects the brain. It causes degeneration of nerve cells (neurons), in particular cells involved in memory and intellectual functions. Early diagnosis of Alzheimer’s disease (AD) raises ethical questions, since there is, at present, no cure to offer to patients, and medicines from therapeutic trials appear to slow the progression of the disease only moderately, with side effects that are sometimes severe. In this context, analysis of medical images has become an essential tool for clinical applications because it provides effective assistance both at diagnosis and in therapeutic follow-up. Computer Assisted Diagnostic (CAD) systems are one of the possible solutions to efficiently manage these images. In our work, we propose an application to detect Alzheimer’s disease. To detect the disease at an early stage, we used three sections: frontal to extract the Hippocampus (H), sagittal to analyze the Corpus Callosum (CC), and axial to work with the variation features of the Cortex (C). Our method of classification is based on a Support Vector Machine (SVM). The proposed system yields a 90.66% accuracy in the early diagnosis of AD.

Keywords: Alzheimer Diseases (AD), Computer Assisted Diagnostic (CAD), hippocampus, Corpus Callosum (CC), cortex, Support Vector Machine (SVM)

Procedia PDF Downloads 348
463 Differences in Assessing Hand-Written and Typed Student Exams: A Corpus-Linguistic Study

Authors: Jutta Ransmayr

Abstract:

The digital age has long arrived at Austrian schools, so both society and educationalists demand that digital means be integrated accordingly into day-to-day school routines. Therefore, the Austrian school-leaving exam (A-levels) can now be written either by hand or by using a computer. However, the choice of writing medium (pen and paper or computer) for written examination papers, which are considered 'high-stakes' exams, raises a number of questions that had not been adequately investigated and answered until recently, such as: What effects do the different conditions of text production in the written German A-levels have on the component of normative linguistic accuracy? How do the spelling skills of German A-level papers written with a pen differ from those that the students wrote on the computer? And how is the teacher's assessment related to this? Which practical desiderata for German didactics can be derived from this? In a trilateral pilot project of the Austrian Center for Digital Humanities (ACDH) of the Austrian Academy of Sciences and the University of Vienna in cooperation with the Austrian Ministry of Education and the Council for German Orthography, these questions were investigated. A representative Austrian learner corpus, consisting of around 530 German A-level papers from all over Austria (pen and computer written), was set up in order to subject it to a quantitative (corpus-linguistic and statistical) and qualitative investigation with regard to the spelling and punctuation performance of the high school graduates and the differences between pen- and computer-written papers and their assessments. Relevant studies are currently available mainly from the Anglophone world. These have shown that writing on the computer increases the motivation to write, has positive effects on the length of the text, and, in some cases, also on the quality of the text. 
Depending on the writing situation and other technical aids, better results in terms of spelling and punctuation could also be found in the computer-written texts as compared to the handwritten ones. Studies also point towards a tendency among teachers to rate handwritten texts better than computer-written texts. In this paper, the first comparable results from the German-speaking area are to be presented. Research results have shown that, on the one hand, there are significant differences between handwritten and computer-written work with regard to performance in orthography and punctuation. On the other hand, the corpus linguistic investigation and the subsequent statistical analysis made it clear that not only the teachers' assessments of the students’ spelling performance vary enormously but also the overall assessments of the exam papers – the factor of the production medium (pen and paper or computer) also seems to play a decisive role.

Keywords: exam paper assessment, pen and paper or computer, learner corpora, linguistics

Procedia PDF Downloads 141
462 A Corpus-based Study of Adjuncts in Colombian English as a Second Language (ESL) Argumentative Essays

Authors: E. Velasco

Abstract:

Meeting high standards of writing in a Second Language (L2) is extremely important for many students who wish to undertake studies at universities in both English and non-English speaking countries. University lecturers in English speaking countries continue to express dissatisfaction with the apparent poor quality of essay writing skills displayed by English as a Second Language (ESL) students, whose essays are often criticised for their lack of cohesion and coherence. These critiques have extended to contexts such as Colombia, where many ESL students are criticised for their inability to write high-quality academic texts in L2-English, particularly at the tertiary level. If Colombian ESL students are expected to meet high standards of writing when studying locally and abroad, it makes sense to carry out specific research that can perhaps lead to recommendations to support their quest for improving argumentative strategies. Employing Corpus Linguistics methods within a Learner Corpus Research framework, and a combination of Log-Likelihood and Bayes Factor measures, this paper investigated argumentative essays written by Colombian ESL students. The study specifically aimed to analyse conjunctive adjuncts in argumentative essays to find out how Colombian ESL students connect their ideas in discourse. 
Results suggest that a) Colombian ESL learners need explicit instruction on specific areas of conjunctive adjuncts to counteract overuse, underuse and misuse; b) underuse of endophoric and evidential adjuncts highlights gaps between IELTS-like essays and good quality tertiary-level essays and published papers, and these gaps are linked to prior knowledge brought into the writing task, rhetorical functions in writing, and research processes before writing takes place; c) both Colombian ESL learners and L1-English writers (in a reference corpus) overuse some adjuncts and underuse endophoric and evidential adjuncts, when compared to skilled L1-English and L2-English writers, so differences in frequencies of adjuncts have little to do with the writers’ L1, and differences are rather linked to the types of essays writers produce (e.g. ESL vs. university essays). The pedagogical recommendations deriving from the study are that: a) Colombian ESL learners need to be shown that overuse is not the only way of giving cohesion to argumentative essays and there are other alternatives to cohesion (e.g., implicit adjuncts, lexical chains and collocations); b) syllabi and classroom input need to raise awareness of gaps in writing skills between IELTS-like and tertiary-level argumentative essays, and of how endophoric and evidential adjuncts are used to refer to anaphoric and cataphoric sections of essays, and to other people’s work or ideas; c) syllabi and classroom input need to include essay-writing tasks based on previous research/reading which learners need to incorporate into their arguments, and tasks that raise awareness of referencing systems (e.g., APA); d) classroom input needs to include explicit instruction on use of punctuation, functions and/or syntax with specific conjunctive adjuncts such as for example, for that reason, although, despite and nevertheless.
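
The abstract names Log-Likelihood as one of its measures but does not give a formula. The sketch below assumes the two-corpus log-likelihood (G2) statistic commonly used in corpus linguistics to compare the frequency of an item in a learner corpus against a reference corpus. The function names and all counts are hypothetical, and the Bayes Factor step mentioned in the abstract is not covered.

```python
import math


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Two-corpus log-likelihood (G2) for one item.

    freq_a, freq_b: occurrences of the item in corpus A and corpus B.
    size_a, size_b: total token counts of corpus A and corpus B.
    """
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    # Expected counts if the item were equally frequent in both corpora.
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


def direction(freq_a, size_a, freq_b, size_b):
    """Label corpus A's usage relative to corpus B by relative frequency."""
    rel_a, rel_b = freq_a / size_a, freq_b / size_b
    if rel_a > rel_b:
        return "overuse"
    if rel_a < rel_b:
        return "underuse"
    return "equal"


if __name__ == "__main__":
    # Hypothetical counts for one adjunct: learner corpus vs. reference corpus.
    g2 = log_likelihood(120, 50_000, 60, 50_000)
    # With 1 degree of freedom, G2 above 3.84 corresponds to p < 0.05.
    print(round(g2, 2), direction(120, 50_000, 60, 50_000), g2 > 3.84)
```

Because the statistic only says that two frequencies differ, the separate `direction` helper is what labels the difference as overuse or underuse.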

Keywords: argumentative essays, Colombian English as a Second Language (ESL) learners, conjunctive adjuncts, corpus linguistics

Procedia PDF Downloads 50
461 The Analysis of One Million Reddit Confessions Corpus: The Use of Emotive Verbs and First Person Singular Pronoun as Linguistic Psychotherapy Features

Authors: Natalia Wojarnik

Abstract:

The paper aims to present the analysis of a Reddit confessions corpus. The interpretation focuses on the use of emotional language, in particular emotive verbs, in the context of personal pronouns. The analysis of the linguistic properties answers the question of what the Reddit users confess about and who is the subject of confessions. The study reveals that the specific language patterns used in Reddit confessions reflect the language of depression and the language used by patients during different stages of their psychotherapy sessions. The paper concludes that Reddit users are more willing to confess about their own experiences, not rarely very private and intimate, extensively using the first person singular pronoun I. It indicates that the Reddit users use the language of depression and the language used by psychotherapy patients. The language they use is very emotionally impacted and includes many emotive verbs such as want, feel, need, hate, love. This finding in Reddit confessions correlates with the extensive use of stative affective verbs in the first stages of the psychotherapy sessions. Lastly, the paper refers to the positive and negative lexicon and helps determine how online posts can serve as a depression detector and “talking cure” for the users.

Keywords: confessions, emotional language, emotive verbs, pronouns, first person pronoun, language of depression, depression detection, psychotherapy language

Procedia PDF Downloads 95
460 Towards Kurdish Internet Linguistics: A Case Study on the Impact of Social Media on Kurdish Language

Authors: Karwan K. Abdalrahman

Abstract:

Due to the impact of the internet and social media, new words and expressions enter the Kurdish language, and a number of familiar words get new meanings. The case is especially true when the technique of transliteration is taken into consideration. Through transliteration, a number of selected words widely used on social media are entering the Kurdish media discourse. In addition, a selected number of Kurdish words get new cultural and psychological meanings. The significance of this study is to delve into the process of word formation in the Kurdish language and explore how new words and expressions are formed by social media users and gain public recognition. First, the study investigates the English words that enter the Kurdish language through different social media platforms. All of these words are transliterated and are used in spoken and written discourses. Second, there are a specific number of Kurdish words that got new meanings in social media. As for these words, there are psychological and cultural factors that make people use these expressions for specific political reasons. It can be argued that they have an indirect political message along with their new linguistic usages. This is a qualitative study analyzing video content that was published in the last two years on social media platforms, including Facebook and YouTube. The collected data was analyzed based on the themes discussed above. The findings of the research can be summarized as follows: the widely used transliterated words have entered both the spoken and written discourses. Authors in online and offline newspapers, TV presenters, literary writers, and columnists are using these new expressions in their writings. As for the Kurdish words with new meanings, they are also widely used for psychological, cultural, and political reasons.

Keywords: Kurdish language, social media, new meanings, transliteration, vocabulary

Procedia PDF Downloads 159
459 The Usage of Negative Emotive Words in Twitter

Authors: Martina Katalin Szabó, István Üveges

Abstract:

In this paper, the usage of negative emotive words is examined on the basis of a large Hungarian Twitter database via NLP methods. The data is analysed from a gender point of view, as well as for changes in language usage over time. The term negative emotive word refers to those words that, on their own, without context, have semantic content that can be associated with negative emotion, but in particular cases, they may function as intensifiers (e.g. rohadt jó ’damn good’) or as a sentiment expression with positive polarity despite their negative prior polarity (e.g. brutális, ahogy ez a férfi rajzol ’it’s awesome (lit. brutal) how this guy draws’). Based on the findings of several authors, the same phenomenon can be found in other languages, so it is probably a language-independent feature. For the recent analysis, 67783 tweets were collected: 37818 tweets (19580 tweets written by females and 18238 tweets written by males) in 2016 and 48344 (18379 tweets written by females and 29965 tweets written by males) in 2021. The goal of the research was to build two datasets that are comparable with respect to semantic changes as well as gender specificities. An exhaustive lexicon of Hungarian negative emotive intensifiers was also compiled (containing 214 words). After basic preprocessing steps, tweets were processed by ‘magyarlanc’, a toolkit written in Java for the linguistic processing of Hungarian texts. Then, the frequency and collocation features of all these words in our corpus were automatically analyzed (via the analysis of parts-of-speech and sentiment values of the co-occurring words). Finally, the results of all four subcorpora were compared. Some of the main outcomes of our analyses are as follows: There are almost four times fewer cases in the male corpus compared to the female corpus in which the negative emotive intensifier modified a negative polarity word in the tweet (e.g., damn bad). 
At the same time, male authors used these intensifiers more frequently, modifying a positive polarity or a neutral word (e.g., damn good and damn big). Results also pointed out that, in contrast to female authors, male authors used these words much more frequently as a positive polarity word as well (e.g., brutális, ahogy ez a férfi rajzol ’it’s awesome (lit. brutal) how this guy draws’). We also observed that male authors use significantly fewer types of emotive intensifiers than female authors, and the frequency proportion of the words is more balanced in the female corpus. As for changes in language usage over time, some notable differences in the frequency and collocation features of the words examined were identified: some of the words collocate with more positive words in the 2nd subcorpora than in the 1st, which points to the semantic change of these words over time.

Keywords: gender differences, negative emotive words, semantic changes over time, twitter

458 Neural Machine Translation for Low-Resource African Languages: Benchmarking State-of-the-Art Transformer for Wolof

Authors: Cheikh Bamba Dione, Alla Lo, Elhadji Mamadou Nguer, Siley O. Ba

Abstract:

In this paper, we propose two neural machine translation (NMT) systems (French-to-Wolof and Wolof-to-French) based on sequence-to-sequence with attention and transformer architectures. We trained our models on a parallel French-Wolof corpus of about 83k sentence pairs. Because of the low-resource setting, we experimented with advanced methods for handling data sparsity, including subword segmentation, back translation, and the copied corpus method. We evaluate the models using the BLEU score and find that the transformer outperforms the classic seq2seq model in all settings, in addition to being less sensitive to noise. In general, the best scores are achieved when training the models on word-level units. For subword-level models, back translation proves slightly beneficial when translating from the low-resource language (WO) into the high-resource language (FR) with the transformer (but not with the seq2seq) models. A slight improvement can also be observed when injecting copied monolingual text in the target language. Moreover, combining the copied-corpus data with back translation leads to a substantial improvement in translation quality.
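The two data augmentation techniques named in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the placeholder source strings, and `placeholder_reverse_model` (which stands in for a trained target-to-source translation model) are all hypothetical. In the copied corpus method, each monolingual target sentence is paired with itself as its own source; in back translation, the source side is produced by a reverse-direction model.

```python
def copied_corpus(parallel_pairs, target_monolingual):
    """Augment parallel data with (target, target) pairs copied from monolingual text."""
    copied = [(sentence, sentence) for sentence in target_monolingual]
    return list(parallel_pairs) + copied


def back_translated_corpus(parallel_pairs, target_monolingual, reverse_translate):
    """Augment parallel data with (synthetic source, target) pairs.

    reverse_translate is any callable mapping a target-language sentence to a
    source-language sentence; in practice this would be a trained NMT model.
    """
    synthetic = [(reverse_translate(sentence), sentence) for sentence in target_monolingual]
    return list(parallel_pairs) + synthetic


def placeholder_reverse_model(sentence):
    # Hypothetical stand-in for a French-to-Wolof model; returns a marker string.
    return f"<synthetic wolof for: {sentence}>"


# Toy Wolof-to-French setup; the source string is a placeholder, not real data.
parallel = [("<wolof sentence>", "Merci beaucoup.")]
monolingual_fr = ["Bonjour tout le monde."]

copied = copied_corpus(parallel, monolingual_fr)
back_translated = back_translated_corpus(parallel, monolingual_fr, placeholder_reverse_model)
```

Combining the two, as the abstract reports doing, would amount to concatenating the original pairs with both the copied and the synthetic pairs before training.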

Keywords: backtranslation, low-resource language, neural machine translation, sequence-to-sequence, transformer, Wolof

457 Improving Low English Oral Skills of 5 Second-Year English Major Students at Debark University

Authors: Belyihun Muchie

Abstract:

This study investigates the low English oral communication skills of five second-year English major students at Debark University. It aims to identify the key factors contributing to their weaknesses and to propose effective interventions to improve their spoken English proficiency. A mixed-methods design will be employed, using observations, questionnaires, and semi-structured interviews to gather data from the participants. To identify these factors clearly, both structured and informal observations will be employed; the former will be used to assess fluency, pronunciation, vocabulary use, and grammatical accuracy, and the latter to observe learners' natural interactions and communication patterns in the classroom setting. The questionnaires will assess the students' self-perceptions of their skills, perceived barriers to fluency, and preferred learning styles. The interviews will delve deeper into their experiences and explore specific obstacles faced in oral communication. Data analysis will involve both quantitative and qualitative approaches. The structured observation and questionnaire data will be analyzed quantitatively, whereas the informal observation and interview transcripts will be analyzed thematically. The findings will be used to identify the major causes of low oral communication skills, such as limited vocabulary, grammatical errors, pronunciation difficulties, or lack of confidence. They will also help to develop targeted solutions addressing these causes, such as intensive pronunciation practice, conversation simulations, personalized feedback, or anxiety-reduction techniques. Finally, the findings will guide the design of an intervention plan for implementation during the action research phase.
The study's outcomes are expected to provide valuable insights into the challenges faced by English major students in developing oral communication skills, contribute to the development of evidence-based interventions for improving spoken English proficiency in similar contexts, and offer practical recommendations for English language instructors and curriculum developers to enhance student learning outcomes. By addressing the specific needs of these students and implementing tailored interventions, this research aims to bridge the gap between theoretical knowledge and practical speaking ability, equipping them with the confidence and skills to flourish in English communication settings.

Keywords: oral communication skills, mixed-methods, evidence-based interventions, spoken English proficiency

456 Corpus-Based Analysis on the Translatability of Conceptual Vagueness in Traditional Chinese Medicine Classics Huang Di Nei Jing

Authors: Yan Yue

Abstract:

Huang Di Nei Jing (HDNJ) is one of the major traditional Chinese medicine (TCM) classics and lays the foundation of TCM theory and practice. It is an important work for the world to study the ancient civilization and medical history of China. The language of HDNJ is highly concise and vague, and notably challenging to translate. This paper investigates the translatability of one particular type of vagueness in HDNJ: conceptual vagueness, which carries Chinese philosophical and cultural connotations. The corpus tool Sketch Engine is used to provide potential online contexts and word behaviors. Two selected English translations of HDNJ, one by a TCM practitioner and one by a non-practitioner, are used to examine the frequency and distribution of linguistic features of the translations. It was found that the hypothesis about the universals of translated language (explicitation, normalisation) holds for one translation, but at the expense of some original contextual connotations. Transliteration is purposefully used in the second translation to retain the original flavor, which is argued to violate the principle of relevance in communication because it yields few contextual effects and demands more processing effort from the reader. The translatability of conceptual vagueness in HDNJ is constrained by the source language context and the reader’s cognitive environment.

Keywords: corpus-based translation, translatability, TCM classics, vague language

455 The Noun-Phrase Elements on the Usage of the Zero Article

Authors: Wen Zhen

Abstract:

Compared to content words, function words, especially articles, have been relatively overlooked by English learners. The article system, to a certain extent, becomes an obstacle to mastering English, owing to several factors. Three principal factors relating to the nature of the articles can be identified when considering the difficulty of the English article system. The article system is made still more complex by difficulties in the second language acquisition process: [-ART] learners have to create a new category, which causes even most proficient non-native speakers to make errors. The sequence of acquisition of the English articles shows that the zero article is acquired first but with low accuracy. The zero article is often overused in the early stages of L2 acquisition. Although learners at the intermediate level move to underusing the zero article once they realize that it does not cover every case, overproduction of the zero article occurs even among advanced L2 learners. The aim of the study is to investigate the noun-phrase factors that give rise to incorrect usage or overuse of the zero article, and thus to provide suggestions for L2 English acquisition. Moreover, it enables teachers to carry out effective instruction that activates students' conscious learning. The research question will be answered through a corpus-based, data-driven approach that analyzes noun-phrase elements in terms of semantic context and the countability of noun phrases. Based on the analysis of the International Thurber Thesis corpus, the results show that: (1) Although the [-definite, -specific] context favored the zero article, both [-definite, +specific] and [+definite, -specific] showed less influence. When we reflect on the frequency order of the zero article, prototypicality plays a vital role in it. (2) EFL learners in this study have trouble classifying abstract nouns as countable.
We find that learners overuse the zero article when they cannot make clear judgements about countability in contexts that change from [+definite] to [-definite]. Once a noun is perceived as uncountable by learners, the choice falls back on the zero article. These findings suggest that learners should be engaged in recognizing the countability of new vocabulary by having nouns explained in lexical phrases, and should explore more complex aspects such as discourse-dependent analysis.

Keywords: noun phrase, zero article, corpus, second language acquisition
