Search results for: spoken corpus
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 592

Search results for: spoken corpus

472 Query in Grammatical Forms and Corpus Error Analysis

Authors: Katerina Florou

Abstract:

Two decades after coined the term "learner corpora" as collections of texts created by foreign or second language learners across various language contexts, and some years following suggestion to incorporate "focusing on form" within a Task-Based Learning framework, this study aims to explore how learner corpora, whether annotated with errors or not, can facilitate a focus on form in an educational setting. Argues that analyzing linguistic form serves the purpose of enabling students to delve into language and gain an understanding of different facets of the foreign language. This same objective is applicable when analyzing learner corpora marked with errors or in their raw state, but in this scenario, the emphasis lies on identifying incorrect forms. Teachers should aim to address errors or gaps in the students' second language knowledge while they engage in a task. Building on this recommendation, we compared the written output of two student groups: the first group (G1) employed the focusing on form phase by studying a specific aspect of the Italian language, namely the past participle, through examples from native speakers and grammar rules; the second group (G2) focused on form by scrutinizing their own errors and comparing them with analogous examples from a native speaker corpus. In order to test our hypothesis, we created four learner corpora. The initial two were generated during the task phase, with one representing each group of students, while the remaining two were produced as a follow-up activity at the end of the lesson. The results of the first comparison indicated that students' exposure to their own errors can enhance their grasp of a grammatical element. The study is in its second stage and more results are to be announced.

Keywords: Corpus interlanguage analysis, task based learning, Italian language as F1, learner corpora

Procedia PDF Downloads 27
471 The Construction of Malaysian Airline Tragedies in Malaysian and British Online News: A Multidisciplinary Study

Authors: Theng Theng Ong

Abstract:

This study adopts a multidisciplinary method by combining the corpus-based discourse analysis study and language attitude study to explore the construction of Malaysia airline tragedies: MH370, MH17 and QZ8501 in the selected Malaysian and United Kingdom (UK) online news. The study aims to determine the ways in which Malaysian Airline tragedies MH370, MH17 and QZ8501 are linguistically defined and constructed in terms of keyword and collocation. The study also seeks to identify the types of discourse that are presented in the new articles. The differences or similarities in terms of keywords, topics or issues covered by the selected Malaysian and UK news media will also be examined. Finally, the language attitude study will be carried out to examine the Malaysia and UK university students’ attitudes toward the keywords, topics or issues covered by the selected Malaysian and UK news media pertaining to Malaysian Airline tragedies MH370, MH17 and QZ8501. The analysis is divided into two parts with the first part focusing on corpus-based discourse analysis on the media text. The second part of the study is to investigate Malaysians and UK news readers’ attitudes towards the online news being reported by the Malaysian and UK news media pertaining to the Airline tragedies. The main findings of corpus-based discourse analysis are essential in designing the questions in the questionnaires and interview and therefore led to the identification of the attitudes among Malaysian and UK news. This study adopts a multidisciplinary method by combining the corpus-based discourse analysis study and language attitude study to explore the construction of Malaysia airline tragedies: MH370, MH17 and QZ8501 in the selected Malaysian and United Kingdom (UK) online news. The study aims to determine the ways in which Malaysian Airline tragedies MH370, MH17 and QZ8501 are linguistically defined and constructed in terms of keyword and collocation. The study also seeks to identify the types of discourse that are presented in the new articles. The differences or similarities in terms of keywords, topics or issues covered by the selected Malaysian and UK news media will also be examined. Finally, the language attitude study will be carried out to examine the Malaysia and UK university students’ attitudes toward the keywords, topics or issues covered by the selected Malaysian and UK news media pertaining to Malaysian Airline tragedies MH370, MH17 and QZ8501. The analysis is divided into two parts with the first part focusing on corpus-based discourse analysis on the media text. The second part of the study is to investigate Malaysians and UK news readers’ attitudes towards the online news being reported by the Malaysian and UK news media pertaining to the Airline tragedies. The main findings of corpus-based discourse analysis are essential in designing the questions in the questionnaires and interview and therefore led to the identification of the attitudes among Malaysian and UK news.

Keywords: corpus linguistics, critical discourse analysis, news media, tragedies study

Procedia PDF Downloads 317
470 We Are the 99 percent – the Occupy-Movement in Social Media

Authors: Wolfram Karg

Abstract:

The Occupy-Movement came into in 2011 existence in the US as a reaction to one of the worst economic crisis since World War II. With cuts in benefits and social services, with people being evicted from their homes on the one hand and high bonuses granted to their managers of the very same companies, a strong feeling of injustice besieged people in the US and caused them to voice their anger peacefully in social media and on the streets. Due to the world-wide-web, users all around the world read about this movement and recognized the same injustice in their own countries, making Occupy a global movement. The vast array of topics covered by Occupy offers a unique chance to carry out a corpus-based discourse analysis based on the DIMEAN-Model. The focus on this paper is limited to two aspects of DIMEAN: intertextual references and the use of connectors in texts. Because the discourse is to a large extent carried out via posts in blogs, online-articles and comments, the paper also analyses, in how far modern (i.e. computer-based media) there is a correlation between the use of connectors in different communicative types used by the Occupy-Movement.

Keywords: discourse, new media, occupy, corpus analysis

Procedia PDF Downloads 472
469 Lexical Collocations in Medical Articles of Non-Native vs Native English-Speaking Researchers

Authors: Waleed Mandour

Abstract:

This study presents multidimensional scrutiny of Benson et al.’s seven-category taxonomy of lexical collocations used by Egyptian medical authors and their peers of native-English speakers. It investigates 212 medical papers, all published during a span of 6 years (from 2013 to 2018). The comparison is held to the medical research articles submitted by native speakers of English (25,238 articles in total with over 103 million words) as derived from the Directory of Open Access Journals (a 2.7 billion-word corpus). The non-native speakers compiled corpus was properly annotated and marked-up manually by the researcher according to the standards of Weisser. In terms of statistical comparisons, though, deployed were the conventional frequency-based analysis besides the relevant criteria, such as association measures (AMs) in which LogDice is deployed as per the recommendation of Kilgariff et al. when comparing large corpora. Despite the terminological convergence in the subject corpora, comparison results confirm the previous literature of which the non-native speakers’ compositions reveal limited ranges of lexical collocations in terms of their distribution. However, there is a ubiquitous tendency of overusing the NS-high-frequency multi-words in all lexical categories investigated. Furthermore, Egyptian authors, conversely to their English-speaking peers, tend to embrace more collocations denoting quantitative rather than qualitative analyses in their produced papers. This empirical work, per se, contributes to the English for Academic Purposes (EAP) and English as a Lingua Franca in Academic settings (ELFA). In addition, there are pedagogical implications that would promote a better quality of medical research papers published in Egyptian universities.

Keywords: corpus linguistics, EAP, ELFA, lexical collocations, medical discourse

Procedia PDF Downloads 111
468 Sentence Structure for Free Word Order Languages in Context with Anaphora Resolution: A Case Study of Hindi

Authors: Pardeep Singh, Kamlesh Dutta

Abstract:

Many languages have fixed sentence structure and others are free word order. The accuracy of anaphora resolution of syntax based algorithm depends on structure of the sentence. So, it is important to analyze the structure of any language before implementing these algorithms. In this study, we analyzed the sentence structure exploiting the case marker in Hindi as well as some special tag for subject and object. We also investigated the word order for Hindi. Word order typology refers to the study of the order of the syntactic constituents of a language. We analyzed 165 news items of Ranchi Express from EMILEE corpus of plain text. It consisted of 1745 sentences. Eight file of dialogue based from the same corpus has been analyzed which will have 1521 sentences. The percentages of subject object verb structure (SOV) and object subject verb (OSV) are 66.90 and 33.10, respectively.

Keywords: anaphora resolution, free word order languages, SOV, OSV

Procedia PDF Downloads 452
467 A Corpus Output Error Analysis of Chinese L2 Learners From America, Myanmar, and Singapore

Authors: Qiao-Yu Warren Cai

Abstract:

Due to the rise of big data, building corpora and using them to analyze ChineseL2 learners’ language output has become a trend. Various empirical research has been conducted using Chinese corpora built by different academic institutes. However, most of the research analyzed the data in the Chinese corpora usingcorpus-based qualitative content analysis with descriptive statistics. Descriptive statistics can be used to make summations about the subjects or samples that research has actually measured to describe the numerical data, but the collected data cannot be generalized to the population. Comte, a Frenchpositivist, has argued since the 19th century that human beings’ knowledge, whether the discipline is humanistic and social science or natural science, should be verified in a scientific way to construct a universal theory to explain the truth and human beings behaviors. Inferential statistics, able to make judgments of the probability of a difference observed between groups being dependable or caused by chance (Free Geography Notes, 2015)and to infer from the subjects or examples what the population might think or behave, is just the right method to support Comte’s argument in the field of TCSOL. Also, inferential statistics is a core of quantitative research, but little research has been conducted by combing corpora with inferential statistics. Little research analyzes the differences in Chinese L2 learners’ language corpus output errors by using theOne-way ANOVA so that the findings of previous research are limited to inferring the population's Chinese errors according to the given samples’ Chinese corpora. To fill this knowledge gap in the professional development of Taiwanese TCSOL, the present study aims to utilize the One-way ANOVA to analyze corpus output errors of Chinese L2 learners from America, Myanmar, and Singapore. The results show that no significant difference exists in ‘shì (是) sentence’ and word order errors, but compared with Americans and Singaporeans, it is significantly easier for Myanmar to have ‘sentence blends.’ Based on the above results, the present study provides an instructional approach and contributes to further exploration of how Chinese L2 learners can have (and use) learning strategies to lower errors.

Keywords: Chinese corpus, error analysis, one-way analysis of variance, Chinese L2 learners, Americans, myanmar, Singaporeans

Procedia PDF Downloads 85
466 Competition between Verb-Based Implicit Causality and Theme Structure's Influence on Anaphora Bias in Mandarin Chinese Sentences: Evidence from Corpus

Authors: Linnan Zhang

Abstract:

Linguists, as well as psychologists, have shown great interests in implicit causality in reference processing. However, most frequently-used approaches to this issue are psychological experiments (such as eye tracking or self-paced reading, etc.). This research is a corpus-based one and is assisted with statistical tool – software R. The main focus of the present study is about the competition between verb-based implicit causality and theme structure’s influence on anaphora bias in Mandarin Chinese sentences. In Accessibility Theory, it is believed that salience, which is also known as accessibility, and relevance are two important factors in reference processing. Theme structure, which is a special syntactic structure in Chinese, determines the salience of an antecedent on the syntactic level while verb-based implicit causality is a key factor to the relevance between antecedent and anaphora. Therefore, it is a study about anaphora, combining psychology with linguistics. With analysis of the sentences from corpus as well as the statistical analysis of Multinomial Logistic Regression, major findings of the present study are as follows: 1. When the sentence is stated in a ‘cause-effect’ structure, the theme structure will always be the antecedent no matter forward biased verbs or backward biased verbs co-occur; in non-theme structure, the anaphora bias will tend to be the opposite of the verb bias; 2. When the sentence is stated in a ‘effect-cause’ structure, theme structure will not always be the antecedent and the influence of verb-based implicit causality will outweigh that of theme structure; moreover, the anaphora bias will be the same with the bias of verbs. All the results indicate that implicit causality functions conditionally and the noun in theme structure will not be the high-salience antecedent under any circumstances.

Keywords: accessibility theory, anaphora, theme strcture, verb-based implicit causality

Procedia PDF Downloads 175
465 Phonology and Syntax of Article Incorporation in Mauritian Creole: Evidence from Bantou Languages

Authors: Emmanuel Nikiema

Abstract:

This paper examines article incorporation in Mauritian Creole, a French Lexifier Creole which exhibits three forms of article incorporation as illustrated in (1-3). While various analyses of article incorporation have been proposed in the literature, fewer studies have explored the motivation of this widespread phenomenon in Mauritian Creole (MC) as opposed to other French Lexifier Creoles spoken in the Caribbean. For example, Mauritian Creole exhibits 4 times more CV incorporation than Haitian Creole, and 40 times more than Reunion Creole. (1) Consonantal type (C): loraz ‘thunder storm’, lete ‘summer’, zwazo ‘bird’, nide ‘idea’. (2) Syllabic type (CV): lapo ‘skin’, liku ‘neck’, ledo ‘back’, leker ‘heart’, diber ‘butter’. (3) Bi-consonantal (CVC): delo ‘water’, dizef ‘egg’, lizye ‘eye’, dilwil ‘oil’. The goal of this study is twofold: 1) uncover the rules governing the three types of article incorporation in MC, and 2) account for its remarkable occurrence in MC as opposed to its quasi-absence in Reunion Creole. We have collected a corpus of over 700 cases and organized it into three categories (C; CV and CVC). For example, there are 471 examples of CV incorporation in MC against 112 in Haitian Creole and only 12 in Reunion Creole. Two questions can be raised: 1) what is the motivation and distribution of the three types of incorporation in MC, and 2) how can one account for the high volume of incorporation in MC as opposed to its quasi-absence in Reunion Creole? We suggest that article incorporation in MC is related to the structure of nouns in Bantou languages. While previous authors have largely used population settlement data in the colonies during the Creole formation period to justify their analyses, we propose an account based on the syntactic structure of Bantou nouns. This analysis will shed light on the contribution of African languages to the formation of MC, and on to why MC has exhibited more article incorporation cases than any other French Lexifier Creole.

Keywords: article incorporation, creole languages, description, phonology

Procedia PDF Downloads 92
464 Named Entity Recognition System for Tigrinya Language

Authors: Sham Kidane, Fitsum Gaim, Ibrahim Abdella, Sirak Asmerom, Yoel Ghebrihiwot, Simon Mulugeta, Natnael Ambassager

Abstract:

The lack of annotated datasets is a bottleneck to the progress of NLP in low-resourced languages. The work presented here consists of large-scale annotated datasets and models for the named entity recognition (NER) system for the Tigrinya language. Our manually constructed corpus comprises over 340K words tagged for NER, with over 118K of the tokens also having parts-of-speech (POS) tags, annotated with 12 distinct classes of entities, represented using several types of tagging schemes. We conducted extensive experiments covering convolutional neural networks and transformer models; the highest performance achieved is 88.8% weighted F1-score. These results are especially noteworthy given the unique challenges posed by Tigrinya’s distinct grammatical structure and complex word morphologies. The system can be an essential building block for the advancement of NLP systems in Tigrinya and other related low-resourced languages and serve as a bridge for cross-referencing against higher-resourced languages.

Keywords: Tigrinya NER corpus, TiBERT, TiRoBERTa, BiLSTM-CRF

Procedia PDF Downloads 84
463 Adjectives in Academic Discourse: A Comparative Study of Research Articles

Authors: Beata Grymska

Abstract:

The research studies on academic discourse focus in general on lexical bundles, epistemic modality markers, or interactions between writers and readers. Following the research into the written forms of the academic community, this study concentrates on adjectives in research articles. The study investigates the distribution of adjectives in research articles in two academic disciplines: linguistics and medicine. It is corpus-based in design and consists of 100 linguistic and 100 medical research articles all written in English. The aim of the study is to compare the distribution of adjectives between the two corpora and four main parts of articles: IMRD (Introduction, Methods, Results, and Discussion). The second aim is to see if the two corpora share common core adjectives, e.g., different, important, specific, and if there are discipline-specific adjectives. The further part of the paper elaborates on adjectives use in the corpora together with examples. The results indicate that the two corpora do not differ in the distribution of adjectives to a great extent. The occurrences of the most frequently used adjectives depend on the academic discipline of the research articles. The concluding part reflects upon the role of adjectives in academic discourse and also presents how corpora can be helpful in composing academic texts.

Keywords: academic discourse, academic texts, adjectives, corpus analysis, research articles

Procedia PDF Downloads 164
462 Concepts of Creation and Destruction as Cognitive Instruments in World View Study

Authors: Perizat Balkhimbekova

Abstract:

Evolutionary changes in cognitive world view taking place in the last decades are followed by changes in perception of the key concepts which are related to the certain lingua-cultural sphere. Also, such concepts reflect the person’s attitude to essential processes in the sphere of concepts, e.g. the opposite operations like creation and destruction. These changes in people’s life and thinking are displayed in a language world view. In order to open the maintenance of mental structures and concepts we should use language means as observable results of people’s cognitive activity. Semantics of words, free phrases and idioms should be considered as an authoritative source of information concerning concepts. The regularized set of concepts in people consciousness forms the sphere of concepts. Cognitive linguistics widely discusses the sphere of concepts as its crucial category defining it as the field of knowledge which is made of concepts. It is considered that a sphere of concepts comprises the various types of association and forms conceptual fields. As a material for the given research, the data from Russian National Corpus and British National Corpus were used. In is necessary to point out that data provided by computational studies, are intrinsic and verifiable; so that we have used them in order to get the reliable results. The procedure of study was based on such techniques as extracting of the context containing concepts of creation|destruction from the Russian National Corpus (RNC), and British National Corpus (BNC); analyzing and interpreting of those context on the basis of cognitive approach; finding of correspondence between the given concepts in the Russian and English world view. The key problem of our study is to find the correspondence between the elements of world view represented by opposite concepts such as creation and destruction. Findings: The concept of "destruction" indicates a process which leads to full or partial destruction of an object. In other words, it is a loss of the object primary essence: structures, properties, distinctive signs and its initial integrity. The concept of "creation", on the contrary, comprises positive characteristics, represents the activity aimed at improvement of the certain object, at the creation of ideal models of the world. On the other hand, destruction is represented much more widely in RNC than creation (1254 cases of the first concept by comparison to 192 cases for the second one). Our hypothesis consists in the antinomy represented by the aforementioned concepts. Being opposite both in respect of semantics and pragmatics, and from the point of view of axiology, they are at the same time complementary and interrelated concepts.

Keywords: creation, destruction, concept, world view

Procedia PDF Downloads 326
461 'Wandering Uterus': An Analogy of Perception of Women in Hippocratic Corpus and Post-Modern Times

Authors: Ankita Sharma

Abstract:

The study proposes to review the perception of women in the Classical Age (500-336 BC) when Greek Philosophy was in bloom. It was observed that women had very few rights and were still under the control of men. One of the possible reasons for this exclusion was woman’s biology that had a huge influence on her being seen as inferior to men. The text ‘Hippocratic Corpus’ focuses on the biological construct of the female body in classical Greek science that perpetuated the idea of women as second-class citizens and were considered inherently weaker than men. The research highlights the significance of the text that was used to encourage women of that time to get married and produce children and how till today the perception remains the same. The Greek belief of need for confinement and control of 'wandering uterus' has led to superior understanding of men. The pivotal emphasis of this research is to women and their bodies that are depicted in a misogynistic way which paved the way for Hippocratic writers to influence the society’s attitude towards women in their writings. It is intended to draw attention to the prevailing cultural assumptions and preconceived notions about female anatomy that had a pervasive influence in the following centuries with its roots being in ancient science.

Keywords: classical Greek theory, women, wandering womb, modern ideology

Procedia PDF Downloads 172
460 A Corpus-Based Contrastive Analysis of Directive Speech Act Verbs in English and Chinese Legal Texts

Authors: Wujian Han

Abstract:

In the process of human interaction and communication, speech act verbs are considered to be the most active component and the main means for information transmission, and are also taken as an indication of the structure of linguistic behavior. The theoretical value and practical significance of such everyday built-in metalanguage have long been recognized. This paper, which is part of a bigger study, is aimed to provide useful insights for a more precise and systematic application to speech act verbs translation between English and Chinese, especially with regard to the degree to which generic integrity is maintained in the practice of translation of legal documents. In this study, the corpus, i.e. Chinese legal texts and their English translations, English legal texts, ordinary Chinese texts, and ordinary English texts, serve as a testing ground for examining contrastively the usage of English and Chinese directive speech act verbs in legal genre. The scope of this paper is relatively wide and essentially covers all directive speech act verbs which are used in ordinary English and Chinese, such as order, command, request, prohibit, threat, advice, warn and permit. The researcher, by combining the corpus methodology with a contrastive perspective, explored a range of characteristics of English and Chinese directive speech act verbs including their semantic, syntactic and pragmatic features, and then contrasted them in a structured way. It has been found that there are similarities between English and Chinese directive speech act verbs in legal genre, such as similar semantic components between English speech act verbs and their translation equivalents in Chinese, formal and accurate usage of English and Chinese directive speech act verbs in legal contexts. But notable differences have been identified in areas of difference between their usage in the original Chinese and English legal texts such as valency patterns and frequency of occurrences. For example, the subjects of some directive speech act verbs are very frequently omitted in Chinese legal texts, but this is not the case in English legal texts. One of the practicable methods to achieve adequacy and conciseness in speech act verb translation from Chinese into English in legal genre is to repeat the subjects or the message with discrepancy, and vice versa. In addition, translation effects such as overuse and underuse of certain directive speech act verbs are also found in the translated English texts compared to the original English texts. Legal texts constitute a particularly valuable material for speech act verb study. Building up such a contrastive picture of the Chinese and English speech act verbs in legal language would yield results of value and interest to legal translators and students of language for legal purposes and have practical application to legal translation between English and Chinese.

Keywords: contrastive analysis, corpus-based, directive speech act verbs, legal texts, translation between English and Chinese

Procedia PDF Downloads 466
459 English Language Teaching Graduate Students' Use of Discussion Moves in Research Articles

Authors: Gamzegul Koca, Evrim Eveyik-Aydin

Abstract:

Genre and discipline-specific knowledge of academic discourse in writing has long been acknowledged as being a core skill to achieve formidable tasks that are expected of graduate students in academic settings. Genre analysis approaches can be adopted to unveil the challenges encountered in these tasks to be able to take instructional actions addressing the aspects of graduate writing that need improvement. In an attempt to find genre-specific academic writing needs of Turkish students enrolled in a graduate program in ELT, this study examines the rhetorical structure of discussion sections of research articles written during the course load stage of their graduate studies. The 35.437-word specialized corpus of graduate papers compiled for the purpose of the study includes discussions of 58 unpublished reports of empirical studies, 31 written in MA courses and 27 in Ph.D. courses by a total of 44 graduate students. The study does sentence-based move structure analysis using the framework developed by Eveyik-Aydın, Karabacak and Akyel in a corpus-based study that analyzed the discussion moves of expert writers in published articles in ELT journals indexed by Social Sciences Citation. The coding of 1577 sentences by three graders using this framework revealed that while the graduate papers included the same moves used in published articles, the rhetorical structure of MA and Ph.D. papers showed considerable differences in terms of the frequency of occurrence of main discussion moves, including interpretation of the results and drawing implications. The implications of these findings will be discussed with respect to the needs of graduate writers and the expectations of discourse community.

Keywords: discussion moves, genre-specific rhetorical structure, move analysis, research articles, the specialized corpus of graduate papers

Procedia PDF Downloads 148
458 Agenesis of the Corpus Callosum: The Role of Neuropsychological Assessment with Implications to Psychosocial Rehabilitation

Authors: Ron Dick, P. S. D. V. Prasadarao, Glenn Coltman

Abstract:

Agenesis of the corpus callosum (ACC) is a failure to develop corpus callosum - the large bundle of fibers of the brain that connects the two cerebral hemispheres. It can occur as a partial or complete absence of the corpus callosum. In the general population, its estimated prevalence rate is 1 in 4000 and a wide range of genetic, infectious, vascular, and toxic causes have been attributed to this heterogeneous condition. The diagnosis of ACC is often achieved by neuroimaging procedures. Though persons with ACC can perform normally on intelligence tests they generally present with a range of neuropsychological and social deficits. The deficit profile is characterized by poor coordination of motor movements, slow reaction time, processing speed and, poor memory. Socially, they present with deficits in communication, language processing, the theory of mind, and interpersonal relationships. The present paper illustrates the role of neuropsychological assessment with implications to psychosocial management in a case of agenesis of the corpus callosum. Method: A 27-year old left handed Caucasian male with a history of ACC was self-referred for a neuropsychological assessment to assist him in his employment options. Parents noted significant difficulties with coordination and balance at an early age of 2-3 years and he was diagnosed with dyspraxia at the age of 14 years. History also indicated visual impairment, hypotonia, poor muscle coordination, and delayed development of motor milestones. MRI scan indicated agenesis of the corpus callosum with ventricular morphology, widely spaced parallel lateral ventricles and mild dilatation of the posterior horns; it also showed colpocephaly—a disproportionate enlargement of the occipital horns of the lateral ventricles which might be affecting his motor abilities and visual defects. The MRI scan ruled out other structural abnormalities or neonatal brain injury. At the time of assessment, the subject presented with such problems as poor coordination, slowed processing speed, poor organizational skills and time management, and difficulty with social cues and facial expressions. A comprehensive neuropsychological assessment was planned and conducted to assist in identifying the current neuropsychological profile to facilitate the formulation of a psychosocial and occupational rehabilitation programme. Results: General intellectual functioning was within the average range and his performance on memory-related tasks was adequate. Significant visuospatial and visuoconstructional deficits were evident across tests; constructional difficulties were seen in tasks such as copying a complex figure, building a tower and manipulating blocks. Poor visual scanning ability and visual motor speed were evident. Socially, the subject reported heightened social anxiety, difficulty in responding to cues in the social environment, and difficulty in developing intimate relationships. Conclusion: Persons with ACC are known to present with specific cognitive deficits and problems in social situations. Findings from the current neuropsychological assessment indicated significant visuospatial difficulties, poor visual scanning and problems in social interactions. His general intellectual functioning was within the average range. Based on the findings from the comprehensive neuropsychological assessment, a structured psychosocial rehabilitation programme was developed and recommended.

Keywords: agenesis, callosum, corpus, neuropsychology, psychosocial, rehabilitation

Procedia PDF Downloads 260
457 The Effects of Culture and Language on Social Impression Formation from Voice Pleasantness: A Study with French and Iranian People

Authors: L. Bruckert, A. Mansourzadeh

Abstract:

The voice has a major influence on interpersonal communication in everyday life via the perception of pleasantness. The evolutionary perspective postulates that the mechanisms underlying the pleasantness judgments are universal adaptations that have evolved in the service of choosing a mate (through the process of sexual selection). From this point of view, the favorite voices would be those with more marked sexually dimorphic characteristics; for example, in men with lower voice pitch, pitch is the main criterion. On the other hand, one can postulate that the mechanisms involved are gradually established since childhood through exposure to the environment, and thus the prosodic elements could take precedence in everyday life communication as it conveys information about the speaker's attitude (willingness to communicate, interest toward the interlocutors). Our study focuses on voice pleasantness and its relationship with social impression formation, exploring both the spectral aspects (pitch, timbre) and the prosodic ones. In our study, we recorded the voices through two vocal corpus (five vowels and a reading text) of 25 French males speaking French and 25 Iranian males speaking Farsi. French listeners (40 male/40 female) listened to the French voices and made a judgment either on the voice's pleasantness or on the speaker (judgment about his intelligence, honesty, sociability). The regression analyses from our acoustic measures showed that the prosodic elements (for example, the intonation and the speech rate) are the most important criteria concerning pleasantness, whatever the corpus or the listener's gender. Moreover, the correlation analyses showed that the speakers with the voices judged as the most pleasant are considered the most intelligent, sociable, and honest. The voices in Farsi have been judged by 80 other French listeners (40 male/40 female), and we found the same effect of intonation concerning the judgment of pleasantness with the corpus «vowel» whereas with the corpus «text» the pitch is more important than the prosody. It may suggest that voice perception contains some elements invariant across culture/language, whereas others are influenced by the cultural/linguistic background of the listener. Shortly in the future, Iranian people will be asked to listen either to the French voices for half of them or to the Farsi voices for the other half and produce the same judgments as the French listeners. This experimental design could potentially make it possible to distinguish what is linked to culture and what is linked to language in the case of differences in voice perception.

Keywords: cross-cultural psychology, impression formation, pleasantness, voice perception

Procedia PDF Downloads 46
456 Linguistic Cyberbullying, a Legislative Approach

Authors: Simona Maria Ignat

Abstract:

Bullying online has been an increasing studied topic during the last years. Different approaches, psychological, linguistic, or computational, have been applied. To our best knowledge, a definition and a set of characteristics of phenomenon agreed internationally as a common framework are still waiting for answers. Thus, the objectives of this paper are the identification of bullying utterances on Twitter and their algorithms. This research paper is focused on the identification of words or groups of words, categorized as “utterances”, with bullying effect, from Twitter platform, extracted on a set of legislative criteria. This set is the result of analysis followed by synthesis of law documents on bullying(online) from United States of America, European Union, and Ireland. The outcome is a linguistic corpus with approximatively 10,000 entries. The methods applied to the first objective have been the following. The discourse analysis has been applied in identification of keywords with bullying effect in texts from Google search engine, Images link. Transcription and anonymization have been applied on texts grouped in CL1 (Corpus linguistics 1). The keywords search method and the legislative criteria have been used for identifying bullying utterances from Twitter. The texts with at least 30 representations on Twitter have been grouped. They form the second corpus linguistics, Bullying utterances from Twitter (CL2). The entries have been identified by using the legislative criteria on the the BoW method principle. The BoW is a method of extracting words or group of words with same meaning in any context. The methods applied for reaching the second objective is the conversion of parts of speech to alphabetical and numerical symbols and writing the bullying utterances as algorithms. The converted form of parts of speech has been chosen on the criterion of relevance within bullying message. The inductive reasoning approach has been applied in sampling and identifying the algorithms. The results are groups with interchangeable elements. The outcomes convey two aspects of bullying: the form and the content or meaning. The form conveys the intentional intimidation against somebody, expressed at the level of texts by grammatical and lexical marks. This outcome has applicability in the forensic linguistics for establishing the intentionality of an action. Another outcome of form is a complex of graphemic variations essential in detecting harmful texts online. This research enriches the lexicon already known on the topic. The second aspect, the content, revealed the topics like threat, harassment, assault, or suicide. They are subcategories of a broader harmful content which is a constant concern for task forces and legislators at national and international levels. These topic – outcomes of the dataset are a valuable source of detection. The analysis of content revealed algorithms and lexicons which could be applied to other harmful contents. A third outcome of content are the conveyances of Stylistics, which is a rich source of discourse analysis of social media platforms. In conclusion, this corpus linguistics is structured on legislative criteria and could be used in various fields.

Keywords: corpus linguistics, cyberbullying, legislation, natural language processing, twitter

Procedia PDF Downloads 60
455 Grammatical and Lexical Explorations on ‘Outer Circle’ Englishes and ‘Expanding Circle’ Englishes: A Corpus-Based Comparative Analysis

Authors: Orlyn Joyce D. Esquivel

Abstract:

This study analyzed 50 selected research papers from professional language and linguistic academic journals to portray the differences between Kachru’s (1994) outer circle and expanding circle Englishes. The selected outer circle Englishes include those of Bangladesh, Malaysia, the Philippines, India, and Singapore; and the selected expanding circle Englishes are those of China, Indonesia, Japan, Korea, and Thailand. The researcher built ten corpora (five research papers for each corpus) to represent each variety of Englishes. The corpora were examined under grammatical and lexical features using Modified English TreeTagger in Sketch Engine. Results revealed the distinct grammatical and lexical features through the table and textual analyses, illustrated from the most to least dominant linguistic elements. In addition, comparative analyses were done to distinguish the features of each of the selected Englishes. The Language Change Theory was used as a basis in the discussion. Hence, the findings suggest that the ‘outer circle’ Englishes and ‘expanding circle’ Englishes will continue to drift from International English.

Keywords: applied linguistics, English as a global language, expanding circle Englishes, global Englishes, outer circle Englishes

Procedia PDF Downloads 134
454 Affects Associations Analysis in Emergency Situations

Authors: Joanna Grzybowska, Magdalena Igras, Mariusz Ziółko

Abstract:

Association rule learning is an approach for discovering interesting relationships in large databases. The analysis of relations, invisible at first glance, is a source of new knowledge which can be subsequently used for prediction. We used this data mining technique (which is an automatic and objective method) to learn about interesting affects associations in a corpus of emergency phone calls. We also made an attempt to match revealed rules with their possible situational context. The corpus was collected and subjectively annotated by two researchers. Each of 3306 recordings contains information on emotion: (1) type (sadness, weariness, anxiety, surprise, stress, anger, frustration, calm, relief, compassion, contentment, amusement, joy) (2) valence (negative, neutral, or positive) (3) intensity (low, typical, alternating, high). Also, additional information, that is a clue to speaker’s emotional state, was annotated: speech rate (slow, normal, fast), characteristic vocabulary (filled pauses, repeated words) and conversation style (normal, chaotic). Exponentially many rules can be extracted from a set of items (an item is a previously annotated single information). To generate the rules in the form of an implication X → Y (where X and Y are frequent k-itemsets) the Apriori algorithm was used - it avoids performing needless computations. Then, two basic measures (Support and Confidence) and several additional symmetric and asymmetric objective measures (e.g. Laplace, Conviction, Interest Factor, Cosine, correlation coefficient) were calculated for each rule. Each applied interestingness measure revealed different rules - we selected some top rules for each measure. Owing to the specificity of the corpus (emergency situations), most of the strong rules contain only negative emotions. There are though strong rules including neutral or even positive emotions. Three examples of the strongest rules are: {sadness} → {anxiety}; {sadness, weariness, stress, frustration} → {anger}; {compassion} → {sadness}. Association rule learning revealed the strongest configurations of affects (as well as configurations of affects with affect-related information) in our emergency phone calls corpus. The acquired knowledge can be used for prediction to fulfill the emotional profile of a new caller. Furthermore, a rule-related possible context analysis may be a clue to the situation a caller is in.

Keywords: data mining, emergency phone calls, emotional profiles, rules

Procedia PDF Downloads 392
453 The Contribution of Corpora to the Investigation of Cross-Linguistic Equivalence in Phraseology: A Contrastive Analysis of Russian and Italian Idioms

Authors: Federica Floridi

Abstract:

The long tradition of contrastive idiom research has essentially been focusing on three domains: the comparison of structural types of idioms (e.g. verbal idioms, idioms with noun-phrase structure, etc.), the description of idioms belonging to the same thematic groups (Sachgruppen), the identification of different types of cross-linguistic equivalents (i.e. full equivalents, partial equivalents, phraseological parallels, non-equivalents). The diastratic, diachronic and diatopic aspects of the compared idioms, as well as their syntactic, pragmatic and semantic properties, have been rather ignored. Corpora (both monolingual and parallel) give the opportunity to investigate the actual use of correlating idioms in authentic texts of L1 and L2. Adopting the corpus-based approach, it is possible to draw attention to the frequency of occurrence of idioms, their syntactic embedding, their potential syntactic transformations (e.g., nominalization, passivization, relativization, etc.), their combinatorial possibilities, the variations of their lexical structure, their connotations in terms of stylistic markedness or register. This paper aims to present the results of a contrastive analysis of Russian and Italian idioms referring to the concepts of ‘beginning’ and ‘end’, that has been carried out by using the Russian National Corpus and the ‘La Repubblica’ corpus. Beyond the digital corpora, bilingual dictionaries, like Skvorcova - Majzel’, Dobrovol’skaja, Kovalev, Čerdanceva, as well as monolingual resources, have been consulted. The study has shown that many of the idioms that have been traditionally indicated as cross-linguistic equivalents on bilingual dictionaries cannot be considered correspondents. The findings demonstrate that even those idioms, that are formally identical in Russian and Italian and are presumably derived from the same source (e.g., conceptual metaphor, Bible, classical mythology, World literature), exhibit differences regarding usage. The ultimate purpose of this article is to highlight that it is necessary to review and improve the existing bilingual dictionaries considering the empirical data collected in corpora. The materials gathered in this research can contribute to this sense.

Keywords: corpora, cross-linguistic equivalence, idioms, Italian, Russian

Procedia PDF Downloads 119
452 Compensatory Articulation of Pressure Consonants in Telugu Cleft Palate Speech: A Spectrographic Analysis

Authors: Indira Kothalanka

Abstract:

For individuals born with a cleft palate (CP), there is no separation between the nasal cavity and the oral cavity, due to which they cannot build up enough air pressure in the mouth for speech. Therefore, it is common for them to have speech problems. Common cleft type speech errors include abnormal articulation (compensatory or obligatory) and abnormal resonance (hyper, hypo and mixed nasality). These are generally resolved after palate repair. However, in some individuals, articulation problems do persist even after the palate repair. Such individuals develop variant articulations in an attempt to compensate for the inability to produce the target phonemes. A spectrographic analysis is used to investigate the compensatory articulatory behaviours of pressure consonants in the speech of 10 Telugu speaking individuals aged between 7-17 years with a history of cleft palate. Telugu is a Dravidian language which is spoken in Andhra Pradesh and Telangana states in India. It is a language with the third largest number of native speakers in India and the most spoken Dravidian language. The speech of the informants is analysed using single word list, sentences, passage and conversation. Spectrographic analysis is carried out using PRAAT, speech analysis software. The place and manner of articulation of consonant sounds is studied through spectrograms with the help of various acoustic cues. The types of compensatory articulation identified are glottal stops, palatal stops, uvular, velar stops and nasal fricatives which are non-native in Telugu.

Keywords: cleft palate, compensatory articulation, spectrographic analysis, PRAAT

Procedia PDF Downloads 423
451 Multilingualism and the Creation of New Languages: The Case of Camfranglais Spoken in Italy and Germany

Authors: Jocelyne Kenne Kenne

Abstract:

Previous works in the field of sociolinguistics have explored the various outcomes of linguistic pluralism. One of these outcomes is the creation of new languages. The presentation will focus on one of such languages, Camfranglais, a hybrid language spoken by Cameroonians. It appeared in the 1970s in the francophone area in Cameroon and developed as a result of interactions between French, English, Cameroonian Pidgin English and local Cameroonian languages, all languages spoken in Cameroon. With the migration of Cameroonians to Europe, researches have been conducted to analyze the sociolinguistic profile of Cameroonians in their new environment. The emphasis on this presentation will be on two recent studies that have been conducted to analyze the peculiarity of Camfranglais in two European countries: Germany and Italy. The research involved 59 Cameroonians living in Italy and 49 Cameroonians residing in Germany. The respondents were composed of participants from different linguistic background, students and workers, married and single. A combination of quantitative and qualitative research methods was employed. The field study was divided into three parts. The first part was focused on observing the Cameroonians interact in different places such as in canteens, in the university halls of residence, lecture theatres, at homes, at various Cameroonian meetings. Those observations were accompanied by audio-recordings of the various interactions. The aim was to study communication between Cameroonians to see whether they use Camfranglais or not; if yes, in which domains and what were the speakers’ linguistic profiles. Additionally, questionnaires of different lengths were used to collect biographical information concerning the participants and their sociolinguistic profile and finally, in-depth interviews with Cameroonians were conducted to inquire about the use, the functions and the importance of this language in the migratory context. The results of the research demonstrate how a widespread use of Camfranglais by Cameroonians in Germany and Italy reveal a longing for home on the one hand and a sign of belonging on the other. It also shows the differences that exist between the profiles of Camfranglais speakers in Europe and the speakers in Cameroon notably in terms of age and social class. Finally, it points out some differences in the use, the structure and the functions of this hybrid language in the migratory setting. This study is a contribution to existing research in the field of contact languages and can serve as a comparison for other situations of multilingualism and the creation of mixed languages. Furthermore, with globalization, the study of migrant languages and the contact of these languages with new languages are topics that might be productive for further research in the field of sociolinguistics.

Keywords: interaction, migrants language, multilingualism, mixed languages

Procedia PDF Downloads 191
450 Retrospective Insight on the Changing Status of the Romanian Language Spoken in the Republic of Moldova

Authors: Gina Aurora Necula

Abstract:

From its transformation into a taboo and its hiding under the so-called “Moldovan language” or under the euphemistic expression “state language” to its regained status recognition as an official language, the Romanian language spoken in the Republic of Moldova has undergone impressive reforms in the last 60 years. Meant to erase the awareness of citizens’ ethnic identity and turn a majority language into a minority one, all the laws and regulations issued on the field succeeded into setting numerous barriers for speakers of Romanian. Either manifested as social constraints or materialized into assumed rejection of mother tongue usage, all these laws have demonstrated their usefulness and major impact on the Romanian-speaking population. This article is the result of our research carried out over 10 years with the support of students, and Moldovan citizens, from the master's degree program "Romanian language - identity and cultural awareness." We present here a retrospective insight of the reforms, laws, and regulations that contributed to the shifted status of the Romanian language from the official language, seen as the language of common use both in the public and private spheres, in the minority language that surrendered its privileged place to the Russian language, firstly in the public sphere, and then, slowly but surely, in the private sphere. Our main goal here is to identify and make speakers understand what the barriers to learning Romanian language are nowadays when the social pressure on using Russian no longer exists.

Keywords: linguistic barriers, lingua franca, private sphere, public sphere, reformation

Procedia PDF Downloads 92
449 Investigating Translations of Websites of Pakistani Public Offices

Authors: Sufia Maroof

Abstract:

This empirical study investigated the web-translations of five Pakistani public offices (FPSC, FIA, HEC, USB, and Ministry of Finance) offering Urdu tab as an option to access information on their official websites. Triangulation of quantitative and qualitative research design informed the researcher of the semantic, lexical and syntactic caveats in these translations. The study hypothesized that majority of the Pakistani population is oblivious of the Supreme Court’s amendments in language policy concerning national and official language; hence, Urdu web-translations of the public departments have not been accessed effectively. Firstly, the researcher conducted an online survey, comprising of two sections, close ended and short answer based questions. Secondly, the researcher compiled corpus of the five selected websites in a tabular form to compare the data. Thirdly, the administrators of the departments had been contacted regarding the methods of translation and the expertise of the personnel involved. The corpus was assessed for TQA after examining the lexical, semantic, syntactical and technical alignment inaccuracies and imperfections. The study suggests the public offices to invest in their Urdu webs by either hiring expert translators or engaging expertise of a translation agency for this project to offer quality translation to public.

Keywords: machine translations, public offices, Urdu translations, websites

Procedia PDF Downloads 102
448 Diagnosis of Alzheimer Diseases in Early Step Using Support Vector Machine (SVM)

Authors: Amira Ben Rabeh, Faouzi Benzarti, Hamid Amiri, Mouna Bouaziz

Abstract:

Alzheimer is a disease that affects the brain. It causes degeneration of nerve cells (neurons) and in particular cells involved in memory and intellectual functions. Early diagnosis of Alzheimer Diseases (AD) raises ethical questions, since there is, at present, no cure to offer to patients and medicines from therapeutic trials appear to slow the progression of the disease as moderate, accompanying side effects sometimes severe. In this context, analysis of medical images became, for clinical applications, an essential tool because it provides effective assistance both at diagnosis therapeutic follow-up. Computer Assisted Diagnostic systems (CAD) is one of the possible solutions to efficiently manage these images. In our work; we proposed an application to detect Alzheimer’s diseases. For detecting the disease in early stage we used the three sections: frontal to extract the Hippocampus (H), Sagittal to analysis the Corpus Callosum (CC) and axial to work with the variation features of the Cortex(C). Our method of classification is based on Support Vector Machine (SVM). The proposed system yields a 90.66% accuracy in the early diagnosis of the AD.

Keywords: Alzheimer Diseases (AD), Computer Assisted Diagnostic(CAD), hippocampus, Corpus Callosum (CC), cortex, Support Vector Machine (SVM)

Procedia PDF Downloads 352
447 Differences in Assessing Hand-Written and Typed Student Exams: A Corpus-Linguistic Study

Authors: Jutta Ransmayr

Abstract:

The digital age has long arrived at Austrian schools, so both society and educationalists demand that digital means should be integrated accordingly to day-to-day school routines. Therefore, the Austrian school-leaving exam (A-levels) can now be written either by hand or by using a computer. However, the choice of writing medium (pen and paper or computer) for written examination papers, which are considered 'high-stakes' exams, raises a number of questions that have not yet been adequately investigated and answered until recently, such as: What effects do the different conditions of text production in the written German A-levels have on the component of normative linguistic accuracy? How do the spelling skills of German A-level papers written with a pen differ from those that the students wrote on the computer? And how is the teacher's assessment related to this? Which practical desiderata for German didactics can be derived from this? In a trilateral pilot project of the Austrian Center for Digital Humanities (ACDH) of the Austrian Academy of Sciences and the University of Vienna in cooperation with the Austrian Ministry of Education and the Council for German Orthography, these questions were investigated. A representative Austrian learner corpus, consisting of around 530 German A-level papers from all over Austria (pen and computer written), was set up in order to subject it to a quantitative (corpus-linguistic and statistical) and qualitative investigation with regard to the spelling and punctuation performance of the high school graduates and the differences between pen- and computer-written papers and their assessments. Relevant studies are currently available mainly from the Anglophone world. These have shown that writing on the computer increases the motivation to write, has positive effects on the length of the text, and, in some cases, also on the quality of the text. Depending on the writing situation and other technical aids, better results in terms of spelling and punctuation could also be found in the computer-written texts as compared to the handwritten ones. Studies also point towards a tendency among teachers to rate handwritten texts better than computer-written texts. In this paper, the first comparable results from the German-speaking area are to be presented. Research results have shown that, on the one hand, there are significant differences between handwritten and computer-written work with regard to performance in orthography and punctuation. On the other hand, the corpus linguistic investigation and the subsequent statistical analysis made it clear that not only the teachers' assessments of the students’ spelling performance vary enormously but also the overall assessments of the exam papers – the factor of the production medium (pen and paper or computer) also seems to play a decisive role.

Keywords: exam paper assessment, pen and paper or computer, learner corpora, linguistics

Procedia PDF Downloads 148
446 A Corpus-based Study of Adjuncts in Colombian English as a Second Language (ESL) Argumentative Essays

Authors: E. Velasco

Abstract:

Meeting high standards of writing in a Second Language (L2) is extremely important for many students who wish to undertake studies at universities in both English and non-English speaking countries. University lecturers in English speaking countries continue to express dissatisfaction with the apparent poor quality of essay writing skills displayed by English as a Second Language (ESL) students, whose essays are often criticised for their lack of cohesion and coherence. These critiques have extended to contexts such as Colombia, where many ESL students are criticised for their inability to write high-quality academic texts in L2-English, particularly at the tertiary level. If Colombian ESL students are expected to meet high standards of writing when studying locally and abroad, it makes sense to carry out specific research that can perhaps lead to recommendations to support their quest for improving argumentative strategies. Employing Corpus Linguistics methods within a Learner Corpus Research framework, and a combination of Log-Likelihood and Bayes Factor measures, this paper investigated argumentative essays written by Colombian ESL students. The study specifically aimed to analyse conjunctive adjuncts in argumentative essays to find out how Colombian ESL students connect their ideas in discourse. Results suggest that a) Colombian ESL learners need explicit instruction on specific areas of conjunctive adjuncts to counteract overuse, underuse and misuse; b) underuse of endophoric and evidential adjuncts highlights gaps between IELTS-like essays and good quality tertiary-level essays and published papers, and these gaps are linked to prior knowledge brought into writing task, rhetorical functions in writing, and research processes before writing takes place; c) both Colombian ESL learners and L1-English writers (in a reference corpus) overuse some adjuncts and underuse endophoric and evidential adjuncts, when compared to skilled L1-English and L2-English writers, so differences in frequencies of adjuncts has little to do with the writers’ L1, and differences are rather linked to types of essays writers produce (e.g. ESL vs. university essays). Ender Velasco: The pedagogical recommendations deriving from the study are that: a) Colombian ESL learners need to be shown that overuse is not the only way of giving cohesion to argumentative essays and there are other alternatives to cohesion (e.g., implicit adjuncts, lexical chains and collocations); b) syllabi and classroom input need to raise awareness of gaps in writing skills between IELTS-like and tertiary-level argumentative essays, and of how endophoric and evidential adjuncts are used to refer to anaphoric and cataphoric sections of essays, and to other people’s work or ideas; c) syllabi and classroom input need to include essay-writing tasks based on previous research/reading which learners need to incorporate into their arguments, and tasks that raise awareness of referencing systems (e.g., APA); d) classroom input needs to include explicit instruction on use of punctuation, functions and/or syntax with specific conjunctive adjuncts such as for example, for that reason, although, despite and nevertheless.

Keywords: argumentative essays, colombian english as a second language (esl) learners, conjunctive adjuncts, corpus linguistics

Procedia PDF Downloads 57
445 The Analysis of One Million Reddit Confessions Corpus: The Use of Emotive Verbs and First Person Singular Pronoun as Linguistic Psychotherapy Features

Authors: Natalia Wojarnik

Abstract:

The paper aims to present the analysis of a Reddit confessions corpus. The interpretation focuses on the use of emotional language, in particular emotive verbs, in the context of personal pronouns. The analysis of the linguistic properties answers the question of what the Reddit users confess about and who is the subject of confessions. The study reveals that the specific language patterns used in Reddit confessions reflect the language of depression and the language used by patients during different stages of their psychotherapy sessions. The paper concludes that Reddit users are more willing to confess about their own experiences, not rarely very private and intimate, extensively using the first person singular pronoun I. It indicates that the Reddit users use the language of depression and the language used by psychotherapy patients. The language they use is very emotionally impacted and includes many emotive verbs such as want, feel, need, hate, love. This finding in Reddit confessions correlates with the extensive use of stative affective verbs in the first stages of the psychotherapy sessions. Lastly, the paper refers to the positive and negative lexicon and helps determine how online posts can serve as a depression detector and “talking cure” for the users.

Keywords: confessions, emotional language, emotive verbs, pronouns, first person pronoun, language of depression, depression detection, psychotherapy language

Procedia PDF Downloads 99
444 Towards Kurdish Internet Linguistics: A Case Study on the Impact of Social Media on Kurdish Language

Authors: Karwan K. Abdalrahman

Abstract:

Due to the impacts of the internet and social media, new words and expressions enter the Kurdish language, and a number of familiarized words get new meanings. The case is especially true when the technique of transliteration is taken into consideration. Through transliteration, a number of selected words widely used on social media are entering the Kurdish media discourse. In addition, a selected number of Kurdish words get new cultural and psychological meanings. The significance of this study is to delve into the process of word formation in the Kurdish language and explore how new words and expressions are formed by social media users and got public recognition. First, the study investigates the English words that enter the Kurdish language through different social media platforms. All of these words are transliterated and are used in spoken and written discourses. Second, there are a specific number of Kurdish words that got new meanings in social media. As for these words, there are psychological and cultural factors that make people use these expressions for specific political reasons. It can be argued that they have an indirect political message along with their new linguistic usages. This is a qualitative study analyzing video content that was published in the last two years on social media platforms, including Facebook and YouTube. The collected data was analyzed based on the themes discussed above. The findings of the research can be summarized as follows: the widely used transliterated words have entered both the spoken and written discourses. Authors in online and offline newspapers, TV presenters, literary writers, columnists are using these new expressions in their writings. As for the Kurdish words with new meanings, they are also widely used for psychological, cultural, and political reasons.

Keywords: Kurdish language, social media, new meanings, transliteration, vocabulary

Procedia PDF Downloads 162
443 The Usage of Negative Emotive Words in Twitter

Authors: Martina Katalin Szabó, István Üveges

Abstract:

In this paper, the usage of negative emotive words is examined on the basis of a large Hungarian twitter-database via NLP methods. The data is analysed from a gender point of view, as well as changes in language usage over time. The term negative emotive word refers to those words that, on their own, without context, have semantic content that can be associated with negative emotion, but in particular cases, they may function as intensifiers (e.g. rohadt jó ’damn good’) or a sentiment expression with positive polarity despite their negative prior polarity (e.g. brutális, ahogy ez a férfi rajzol ’it’s awesome (lit. brutal) how this guy draws’. Based on the findings of several authors, the same phenomenon can be found in other languages, so it is probably a language-independent feature. For the recent analysis, 67783 tweets were collected: 37818 tweets (19580 tweets written by females and 18238 tweets written by males) in 2016 and 48344 (18379 tweets written by females and 29965 tweets written by males) in 2021. The goal of the research was to make up two datasets comparable from the viewpoint of semantic changes, as well as from gender specificities. An exhaustive lexicon of Hungarian negative emotive intensifiers was also compiled (containing 214 words). After basic preprocessing steps, tweets were processed by ‘magyarlanc’, a toolkit is written in JAVA for the linguistic processing of Hungarian texts. Then, the frequency and collocation features of all these words in our corpus were automatically analyzed (via the analysis of parts-of-speech and sentiment values of the co-occurring words). Finally, the results of all four subcorpora were compared. Here some of the main outcomes of our analyses are provided: There are almost four times fewer cases in the male corpus compared to the female corpus when the negative emotive intensifier modified a negative polarity word in the tweet (e.g., damn bad). At the same time, male authors used these intensifiers more frequently, modifying a positive polarity or a neutral word (e.g., damn good and damn big). Results also pointed out that, in contrast to female authors, male authors used these words much more frequently as a positive polarity word as well (e.g., brutális, ahogy ez a férfi rajzol ’it’s awesome (lit. brutal) how this guy draws’). We also observed that male authors use significantly fewer types of emotive intensifiers than female authors, and the frequency proportion of the words is more balanced in the female corpus. As for changes in language usage over time, some notable differences in the frequency and collocation features of the words examined were identified: some of the words collocate with more positive words in the 2nd subcorpora than in the 1st, which points to the semantic change of these words over time.

Keywords: gender differences, negative emotive words, semantic changes over time, twitter

Procedia PDF Downloads 181