Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 3650

Search results for: cartographic language

2540 Investigating the Online Effect of Language on Gesture in Advanced Bilinguals of Two Structurally Different Languages in Comparison to L1 Native Speakers of L2 and Explores Whether Bilinguals Will Follow Target L2 Patterns in Speech and Co-speech

Authors: Armita Ghobadi, Samantha Emerson, Seyda Ozcaliskan

Abstract:

Being bilingual involves mastery of both speech and gesture patterns in a second language (L2). We know from earlier work in first language (L1) production contexts that speech and co-speech gesture form a tightly integrated system: co-speech gesture mirrors the patterns observed in speech, suggesting an online effect of language on nonverbal representation of events in gesture during the act of speaking (i.e., “thinking for speaking”). Relatively less is known about the online effect of language on gesture in bilinguals speaking structurally different languages. The few existing studies, mostly with small sample sizes, yield inconclusive findings: some show greater achievement of L2 patterns in gesture with more advanced L2 speech production, while others show preferences for L1 gesture patterns even in advanced bilinguals. In this study, we focus on advanced bilingual speakers of two structurally different languages (Spanish L1 with English L2) in comparison to L1 English speakers. We ask whether bilingual speakers will follow target L2 patterns not only in speech but also in gesture or, alternatively, follow L2 patterns in speech but resort to L1 patterns in gesture. We examined this question by studying speech and gestures produced by 23 advanced adult Spanish (L1)-English (L2) bilinguals (Mage=22; SD=7) and 23 monolingual English speakers (Mage=20; SD=2). Participants were shown 16 animated motion event scenes that included distinct manner and path components (e.g., "run over the bridge"). We recorded and transcribed all participant responses for speech and segmented them into sentence units that included at least one motion verb and its associated arguments. We also coded all gestures that accompanied each sentence unit. We focused on motion event descriptions because they show strong crosslinguistic differences in the packaging of motion elements in speech and co-speech gesture in first language production contexts. 
English speakers synthesize manner and path into a single clause or gesture (he runs over the bridge; running fingers forward), while Spanish speakers express each component separately (manner-only: el corre=he is running; circle arms next to body conveying running; path-only: el cruza el puente=he crosses the bridge; trace finger forward conveying trajectory). We tallied all responses by group and packaging type, separately for speech and co-speech gesture. Our preliminary results (n=4/group) showed that productions in English L1 and Spanish L1 differed, with greater preference for conflated packaging in L1 English and separated packaging in L1 Spanish, a pattern that was also largely evident in co-speech gesture. Bilinguals’ production in L2 English, however, followed the patterns of the target language in speech, with greater preference for conflated packaging, but not in gesture. Bilinguals used separated and conflated strategies in gesture at roughly similar rates in their L2 English, showing an effect of both L1 and L2 on co-speech gesture. Our results suggest that online production of L2 language has more limited effects on L2 gestures and that mastery of native-like patterns in L2 gesture might take longer than mastery of native-like L2 speech patterns.

Keywords: bilingualism, cross-linguistic variation, gesture, second language acquisition, thinking for speaking hypothesis

Procedia PDF Downloads 49

2539 Impact of Map Generalization in Spatial Analysis

Authors: Lin Li, P. G. R. N. I. Pussella

Abstract:

When spatial data and their attributes are represented on different types of maps, scale plays a key role in the process of map generalization. The process consists of two main operators, selection and omission. Once data have been selected, they undergo several geometric operations such as elimination, simplification, smoothing, exaggeration, displacement, aggregation and size reduction. As a result of these operations at different levels of data, geometric properties of the spatial features such as length, sinuosity, orientation, perimeter and area are altered. The effect is worst in the preparation of small-scale maps, since the cartographer does not have enough space to represent all the features on the map. When GIS users want to analyze a set of spatial data, they typically retrieve a data set and carry out the analysis without considering very important characteristics such as the scale, the purpose of the map and the degree of generalization. Further, GIS users use and compare different maps with different degrees of generalization. Sometimes, GIS users go beyond the scale of the source map using the zoom-in facility and violate the basic cartographic rule that 'it is not suitable to create a larger scale map using a smaller scale map'. The main objective of this study is to discuss the effect of map generalization on GIS analysis. Three digital maps at scales of 1:10000, 1:50000 and 1:250000, prepared by the Survey Department of Sri Lanka, the national mapping agency of Sri Lanka, were used. Features common to all three maps were selected, and an overlay analysis was repeated with different combinations of the data. Road, river and land use data sets were used for the study. A simple model, to find the best place for a wildlife park, was used to identify the effects. 
The results show remarkable effects of the different degrees of generalization. The analysis returned different locations with different geometries as outputs. The study suggests that reasonable methods are needed to overcome this effect. As a solution, it is recommended to bring all the data sets to a common scale before carrying out the analysis.
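
The effect of one generalization operator on feature geometry can be illustrated with a small sketch. The code below is not the authors' workflow: it assumes the Douglas-Peucker algorithm as the simplification operator (the abstract does not name one) and uses a hypothetical river polyline with made-up coordinates. It shows how length and sinuosity shrink as the tolerance grows.

```python
import math

def length(line):
    """Total length of a polyline given as a list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(line, line[1:]))

def sinuosity(line):
    """Path length divided by the straight-line distance between the endpoints."""
    return length(line) / math.dist(line[0], line[-1])

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (ax, ay), (bx, by), (px, py) = a, b, p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.dist(p, a)
    # Projection parameter, clamped so the foot stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.dist(p, (ax + t * dx, ay + t * dy))

def simplify(line, tolerance):
    """Douglas-Peucker simplification (assumed operator, not from the abstract)."""
    if len(line) < 3:
        return list(line)
    distances = [point_segment_distance(p, line[0], line[-1]) for p in line[1:-1]]
    worst = max(range(len(distances)), key=distances.__getitem__)
    if distances[worst] <= tolerance:
        # Every interior vertex is close enough to the chord: keep endpoints only.
        return [line[0], line[-1]]
    split = worst + 1  # index of the farthest vertex in `line`
    left = simplify(line[: split + 1], tolerance)
    right = simplify(line[split:], tolerance)
    return left[:-1] + right  # drop the duplicated split vertex

# Hypothetical river centreline in metres (illustrative values only).
river = [(0, 0), (100, 10), (200, 0), (300, 80), (400, 0), (500, 5), (600, 0)]

for tolerance in (0, 20, 100):
    generalized = simplify(river, tolerance)
    print(tolerance, len(generalized),
          round(length(generalized), 1), round(sinuosity(generalized), 3))
```

Running the same overlay on lines simplified with different tolerances would compare geometries that no longer agree, which is the kind of mismatch the abstract warns about.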

Keywords: generalization, GIS, scales, spatial analysis

Procedia PDF Downloads 309

2538 The Study of Formal and Semantic Errors of Lexis by Persian EFL Learners

Authors: Mohammad J. Rezai, Fereshteh Davarpanah

Abstract:

Producing a text in a language which is not one’s mother tongue can be a demanding task for language learners. Examining lexical errors committed by EFL learners is a challenging area of investigation which can shed light on the process of second language acquisition. Despite the considerable number of investigations into grammatical errors, few studies have tackled formal and semantic errors of lexis committed by EFL learners. The current study aimed at examining Persian learners’ formal and semantic errors of lexis in English. To this end, 60 students at three different proficiency levels were asked to write on 10 different topics in 10 separate sessions. In total, 600 essays written by Persian EFL learners were collected to serve as the corpus of the study. An error taxonomy comprising formal and semantic errors was selected to analyze the corpus. The formal category covered misselection and misformation errors, while the semantic errors were classified into lexical, collocational and lexicogrammatical categories. Each category was further classified into subcategories depending on the identified errors. The results showed that there were 2583 errors in the corpus of 9600 words, of which 2030 were formal errors and 553 were semantic errors. Formal errors were the most frequent (78.6%) and were more prevalent at the advanced level (42.4%). The semantic errors (21.4%) were more frequent at the low intermediate level (40.5%). Among formal errors of lexis, misformation errors accounted for the vast majority (98%), while misselection errors constituted 2% of the errors. Additionally, no significant differences were observed among the three semantic error subcategories, namely collocational, lexical choice and lexicogrammatical. The results of the study can shed light on the challenges faced by EFL learners in the second language acquisition process.
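
As a quick arithmetic check, the reported shares of formal and semantic errors can be recomputed from the raw counts given in the abstract. The variable and function names below are illustrative only.

```python
# Counts as reported in the abstract.
total_errors = 2583
formal_errors = 2030
semantic_errors = 553

def share(part, whole):
    """Percentage of `whole` made up by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)

formal_share = share(formal_errors, total_errors)
semantic_share = share(semantic_errors, total_errors)

print(formal_errors + semantic_errors == total_errors)
print(formal_share, semantic_share)
```

The two categories sum to the reported total, and the recomputed shares agree with the 78.6% and 21.4% stated above.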

Keywords: collocational errors, lexical errors, Persian EFL learners, semantic errors

Procedia PDF Downloads 115

2537 Equivalences and Contrasts in the Morphological Formation of Echo Words in Two Indo-Aryan Languages: Bengali and Odia

Authors: Subhanan Mandal, Bidisha Hore

Abstract:

The linguistic process whereby all or part of the base word is repeated, with or without internal change, before or after the base itself is regarded as reduplication. The reduplicated morphological construction carries with it a new grammatical category and meaning. Reduplication is a very frequent phenomenon in the eastern Indian languages of the states of West Bengal and Odisha, i.e. Bengali and Odia respectively. Bengali, an Indo-Aryan language and a part of the Indo-European language family, is one of the most widely spoken languages in India and is the national language of Bangladesh. Despite this classification, Bengali shows certain influences in vocabulary and grammar due to its geographical proximity to Tibeto-Burman and Austro-Asiatic language speaking communities. Bengali and Odia once belonged to a single linguistic branch. With time and gradual linguistic change due to various factors, Odia was the first to break away and develop as a separate, distinct language. However, apart from the script, more similarities than contrasts still exist between these languages at the linguistic level. This paper deals with the procedure of echo word formation in Bengali and Odia. The morphological research of the two languages concerning the field of reduplication reveals several linguistic processes. The findings are based on information elicited from native speakers and on the analysis of echo words found in discourse and conversational patterns. For the purpose of partial reduplication analysis, prefixed class and suffixed class word formations, which show specific rule-based changes, are taken into consideration. For example, in suffixed class categorization, both consonant and vowel alterations are found, following the rules: i) CVx → tVX, ii) CVCV → CVCi. 
Further classifications were also found in sentential studies of both languages, which revealed the complexities of complete reduplication in forming echo words where the head word loses its original meaning. Complexities based on onomatopoetic/phonetic imitation of natural phenomena, rather than on any rule-based occurrences, were also found. Taking into consideration these aspects, which are very prevalent in both languages, inferences are drawn from the study which bring out many similarities between the two languages in this area in spite of their branching away from each other several years ago.
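
The two suffixed-class rules quoted in the abstract can be sketched as simple string rewrites. This is a toy illustration under strong assumptions: inputs are plain romanized strings, any letter outside "aeiou" counts as a consonant, and the sample bases are placeholders rather than attested Bengali or Odia forms.

```python
VOWELS = "aeiou"  # assumption: a simple romanization with five vowel letters

def echo_by_onset(base, onset="t"):
    """Rule (i), CVX -> tVX: replace the initial consonant with a fixed onset."""
    i = 0
    while i < len(base) and base[i] not in VOWELS:
        i += 1
    if i == 0 or i == len(base):
        raise ValueError("rule (i) expects a consonant-initial base containing a vowel")
    return onset + base[i:]

def echo_by_final_vowel(base, vowel="i"):
    """Rule (ii), CVCV -> CVCi: replace the final vowel with a fixed vowel."""
    if not base or base[-1] not in VOWELS:
        raise ValueError("rule (ii) expects a vowel-final base")
    return base[:-1] + vowel

def echo_pair(base, rule):
    """Return the base followed by its echo, joined with a hyphen."""
    return f"{base}-{rule(base)}"

# Placeholder bases, used only to exercise the rules.
print(echo_pair("kam", echo_by_onset))
print(echo_pair("kata", echo_by_final_vowel))
```

A faithful treatment would operate on phonemes rather than letters, since a single consonant may be written with more than one character in romanization.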

Keywords: consonant alteration, onomatopoetic, partial reduplication and complete reduplication, reduplication, vowel alteration

Procedia PDF Downloads 221

2536 Learning Trajectories of Mexican Language Teachers: A Cross-Cultural Comparative Study

Authors: Alberto Mora-Vazquez, Nelly Paulina Trejo Guzmán

Abstract:

This study examines the learning trajectories of twelve language teachers who were former students of a BA in applied linguistics at a Mexican state university. In particular, the study compares the social, academic and professional trajectories of two groups of teachers, six locally raised and educated ones and six repatriated ones from the U.S. Our interest in undertaking this research lies in the wide variety of students’ backgrounds we as professors in the BA program have witnessed throughout the years it has been around. Ever since the academic program started back in 2006, the student population has been made up of students whose backgrounds are highly diverse in terms of English language proficiency level, professional orientations and degree of cross-cultural awareness. Such diversity is further evidenced by the ongoing incorporation of some transnational students who have lived and studied in the United States for a significant period of time before their enrolment in the BA program. This, however, is not an isolated event as other researchers have reported this phenomenon in other TESOL-related programs of Mexican universities in the literature. Therefore, this suggests that their social and educational experiences are quite different from those of their Mexican born and educated counterparts. In addition, an informal comparison of the participation in formal teaching activities of the two groups at the beginning of their careers also suggested that significant differences in teacher training and development needs could also be identified. This issue raised questions about the need to examine the life and learning trajectories of these two groups of student teachers so as to develop an intervention plan aimed at supporting and encouraging their academic and professional advancement based on their particular needs. To achieve this goal, the study makes use of a combination of retrospective life-history research and the analysis of academic documents. 
The first approach uses interviews for data-collection. Through the use of a narrative life-history interview protocol, teachers were asked about their childhood home context, their language learning and teaching experiences, their stories of studying applied linguistics, and self-description. For the analysis of participants’ educational outcomes, a wide range of academic records, including reports of language proficiency exams results and language teacher training certificates, were used. The analysis revealed marked differences between the two groups of teachers in terms of academic and professional orientations. The locally educated teachers tended to graduate first, to look for further educational opportunities after graduation, to enter the language teaching profession earlier, and to expand their professional development options more than their peers. It is argued that these differences can be explained by their identities, which are made up of the interplay of influences such as their home context, their previous educational experiences and their cultural background. Implications for language teacher trainers and applied linguistics academic program administrators are provided.

Keywords: beginning language teachers, life-history research, Mexican context, transnational students

Procedia PDF Downloads 401

2535 One-Shot Text Classification with Multilingual-BERT

Authors: Hsin-Yang Wang, K. M. A. Salam, Ying-Jia Lin, Daniel Tan, Tzu-Hsuan Chou, Hung-Yu Kao

Abstract:

Detecting user intent from natural language expressions has a wide variety of use cases in different natural language processing applications. Recently, few-shot training has seen a spike in usage in commercial domains. Due to the lack of significant sample features, downstream task performance has been limited or unstable across different domains. As a state-of-the-art method, the pre-trained BERT model, which gathers sentence-level information from a large text corpus, shows improvement on several NLP benchmarks. In this research, we propose a method that changes multi-class classification tasks into binary classification tasks and then uses the confidence score to rank the results. As a language model, BERT performs well on sequence data. In our experiment, we change the objective from predicting labels to finding the relations between words in sequence data. Our proposed method achieved 71.0% accuracy on the internal intent detection dataset and 63.9% accuracy on the HuffPost dataset. Acknowledgment: This work was supported by NCKU-B109-K003, which is the collaboration between National Cheng Kung University, Taiwan, and SoftBank Corp., Tokyo.
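
The reformulation described above, turning one multi-class decision into several binary decisions and ranking them by confidence, can be sketched as follows. The abstract does not specify the scoring model, so the word-overlap scorer here is only a stand-in for a fine-tuned BERT pair classifier, and the intent labels and descriptions are hypothetical.

```python
def pair_score(utterance, label_description):
    """Stand-in binary scorer: Jaccard overlap between the two token sets.

    In the setting the abstract describes, this would be the confidence of a
    BERT-based classifier deciding whether the pair matches (an assumption).
    """
    a = set(utterance.lower().split())
    b = set(label_description.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def rank_labels(utterance, label_descriptions, scorer=pair_score):
    """Score every (utterance, label) pair and sort labels by confidence."""
    scored = [(label, scorer(utterance, description))
              for label, description in label_descriptions.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical intents with one short description each (one-shot style).
labels = {
    "book_flight": "book a flight ticket",
    "check_weather": "check the weather forecast",
    "play_music": "play a song or music",
}

ranking = rank_labels("please book a flight to Tokyo", labels)
for label, confidence in ranking:
    print(label, round(confidence, 3))
```

Because each label is scored independently, new intents can be added by supplying one more description, without retraining a fixed-size output layer.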

Keywords: OSML, BERT, text classification, one shot

Procedia PDF Downloads 81

2534 The Advancements of Transformer Models in Part-of-Speech Tagging System for Low-Resource Tigrinya Language

Authors: Shamm Kidane, Ibrahim Abdella, Fitsum Gaim, Simon Mulugeta, Sirak Asmerom, Natnael Ambasager, Yoel Ghebrihiwot

Abstract:

The call for natural language processing (NLP) systems for low-resource languages has become more apparent than ever in the past few years, while arduous challenges remain in preparing such systems. This paper presents an improved version of the Nagaoka Tigrinya Corpus dataset for a Parts-of-Speech (POS) classification system for the Tigrinya language. The size of the initial Nagaoka dataset was increased, bringing the new tagged corpus to a total of 118K tokens annotated with the 12 basic POS tags used previously. The additional content was also annotated manually in a stringent manner, following rules similar to those of the former dataset, and was formatted in CoNLL format. The system made use of a novel approach in NLP tasks, namely the monolingually pre-trained TiELECTRA, TiBERT and TiRoBERTa transformer models. The highest score achieved is an impressive weighted F1-score of 94.2%, which surpassed the previous systems by a significant margin. The system will prove useful in the progress of NLP-related tasks for Tigrinya and similarly related low-resource languages, with room for cross-referencing higher-resource languages.
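
For readers unfamiliar with the metric, a weighted F1-score averages the per-tag F1 values, weighting each tag by its share of the gold tokens. The sketch below uses a made-up four-token example, not data from the corpus, and is not the authors' evaluation code.

```python
from collections import Counter

def weighted_f1(gold, predicted):
    """Per-label F1 averaged with weights equal to each label's gold support."""
    support = Counter(gold)
    total = len(gold)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = count - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        score += f1 * count / total
    return score

# Toy tag sequences (illustrative only).
gold_tags = ["N", "N", "V", "ADJ"]
predicted_tags = ["N", "V", "V", "ADJ"]

print(weighted_f1(gold_tags, predicted_tags))
```

Weighting by support means frequent tags dominate the score, which is worth keeping in mind when tag frequencies in a corpus are highly skewed.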

Keywords: Tigrinya POS corpus, TiBERT, TiRoBERTa, conditional random fields

Procedia PDF Downloads 64

2533 Against Language Disorder: A Way of Reading Dialects in Yan Lianke’s Novels

Authors: Thuy Hanh Nguyen Thi

Abstract:

Using deep reading and text analysis, this article analyzes the use and creation of dialects as a way of demonstrating Yan Lianke's creative stance. The article argues that this is the writer’s narrative strategy in a fight against aphasia, a language disorder of Chinese people and culture, which demonstrates a sense of return to folklore and marks his own linguistic style. In terms of verbal text, dialect in Yan Lianke’s novels is manifested through the use of words, sentences and dialects. Two types of dialect exist in Yan Lianke’s novels: the current dialect system and the particular dialect system of the Pa Lau world, created by the writer himself in order to enrich the vocabulary of Han Chinese.

Keywords: Yan Lianke, aphasia, dialect, Pa Lou world

Procedia PDF Downloads 101

2532 Role of Speech Articulation in English Language Learning

Authors: Khadija Rafi, Neha Jamil, Laiba Khalid, Meerub Nawaz, Mahwish Farooq

Abstract:

Speech articulation is the complex process of producing intelligible sounds through precise movements of various structures within the vocal tract. These structures are called articulators and comprise the lips, teeth, tongue, and palate. The articulators work together to produce a range of distinct phonemes, which form the basis of language. Articulation starts with the airstream from the lungs passing through the trachea and into the oral and nasal cavities. When the air passes through the mouth, the tongue and the muscles around it coordinate to create particular sounds. This can be seen when the tongue is placed in different positions, sometimes near the alveolar ridge, the soft palate, the roof of the mouth or the back of the teeth, which gives each phoneme its unique qualities. Vowels are articulated with an open vocal tract, with the height and position of the tongue differing for each vowel, while consonants are pronounced by creating obstructions in the airflow. For instance, the sound ‘b’ is a plosive and can be produced only by briefly closing the lips. Articulation disorders can not only affect communication but can also be a hurdle in speech production. To improve articulation skills for such individuals, doctors often recommend speech therapy, which involves various kinds of exercises such as jaw exercises and tongue twisters. Such disorders are more common in children who go through developmental articulation issues right after birth, whereas in adults they can be caused by injury, neurological conditions, or other speech-related disorders. In short, speech articulation is an essential aspect of productive communication, which depends on the coordination of specific articulators to produce the different intelligible sounds that are a vital part of spoken language.

Keywords: linguistics, speech articulation, speech therapy, language learning

Procedia PDF Downloads 36

2531 Examining the Effect of Online English Lessons on Nursery School Children

Authors: Hidehiro Endo, Taizo Shigemichi

Abstract:

Introduction & Objectives: In 2008, the revised course of study for elementary schools was published by MEXT, and from the beginning of the academic year of 2011-2012, foreign language activities (English lessons) became mandatory for 5th and 6th graders in Japanese elementary schools. Foreign language activities are currently offered once a week for approximately 50 minutes by elementary school teachers, assistant language teachers who are native speakers of English, and volunteers, among others, with the purpose of helping children become accustomed to functional English. However, the new policy has disclosed a myriad of issues in conducting foreign language activities, since the majority of the current elementary school teachers have neither English teaching experience nor English proficiency. Nevertheless, converting foreign language activities into English as a subject in Japanese elementary schools (for 5th and 6th graders) from 2020 is what MEXT currently envisages with the purpose of reforming English education in Japan. According to their new proposal, foreign language activities will be mandatory for 3rd and 4th graders from 2020. Consequently, gaining better access to English learning opportunities becomes one of the primary concerns even in early childhood education. Thus, in this project, we aim to explore some nursery schools’ attempts at providing toddlers with online English lessons via Skype. The main purpose of this project is to look deeply into what roles online English lessons in the nursery schools play in guiding nursery school children to enjoy learning the English language as well as to acquire English communication skills. Research Methods: Setting; The main research site is a nursery school located in the northern part of Japan. The nursery school has been offering a 20-minute online English lesson via Skype twice a week to 7 toddlers since September 2015. The teacher of the online English lessons is a man who lives in the Philippines. 
Fieldwork & Data; We have just begun collecting data by attending the Skype English lessons. Direct observations are the principal components of the fieldwork. By closely observing how the toddlers respond to what the teacher does via Skype, we examine what components stimulate the toddlers to pay attention to the English lessons. Preliminary Findings & Expected Outcomes: Although both data collection and analysis are ongoing, we found that the online English teacher remembers the first name of each toddler and calls them by their first name via Skype, a technique that is crucial in motivating the toddlers to actively participate in the lessons. In addition, when the teacher asks the toddlers the name of a plastic object such as grapes in English, the toddlers tend to respond to the teacher in Japanese. Accordingly, the effective use of Japanese in teaching English to nursery school children needs to be further examined. The anticipated results of this project are an increased recognition of the significance of creating English language learning opportunities for nursery school children and a significant contribution to the field of early childhood education.

Keywords: teaching children, English education, early childhood education, nursery school

Procedia PDF Downloads 304

2530 An Investigation of the Integration of Synchronous Online Tools into Task-Based Language Teaching: The Example of SpeakApps

Authors: Nouf Aljohani

Abstract:

The research project described in this presentation focuses on designing and evaluating oral tasks related to students’ needs and levels to foster communication and negotiation of meaning for a group of female Saudi university students. The significance of the current research project lies in its contribution to determining the usefulness of synchronous technology-mediated interactive group discussion in improving different speaking strategies. It also explores how to optimize learning outcomes, expand the evaluation of online learning tasks and engage students in evaluating synchronous interactive tools and tasks. The researcher used SpeakApps, a synchronous technology that allows students to practice oral interaction outside the classroom. Such a course of action was considered necessary due to low English proficiency among Saudi students. To the author's knowledge, the main factor that causes poor speaking skills is that students do not have sufficient time to communicate outside English language classes. Further, speaking and listening course contents are not well designed to match the Saudi learning context. The methodology included designing speaking tasks to match the educational setting; a CALL framework for designing and evaluating tasks; participant involvement in evaluating these tasks in each online session; and an investigation of the factors that led to the successful implementation of Task-based Language Teaching (TBLT) and the use of SpeakApps. The analysis and data were drawn from the technology acceptance model surveys, a group interview, teachers’ and students’ weekly reflections, and discourse analysis of students’ interactions.

Keywords: CALL evaluation, synchronous technology, speaking skill, task-based language teaching

Procedia PDF Downloads 292

2529 ExactData Smart Tool For Marketing Analysis

Authors: Aleksandra Jonas, Aleksandra Gronowska, Maciej Ścigacz, Szymon Jadczak

Abstract:

Exact Data is a smart tool which helps with meaningful marketing content creation. It helps marketers achieve this by analyzing the text of an advertisement before and after its publication on social media sites like Facebook or Instagram. In our research we focus on four areas of natural language processing (NLP): grammar correction, sentiment analysis, irony detection and advertisement interpretation. Our research has identified a considerable lack of NLP tools for the Polish language that specifically aid online marketers. In light of this, our research team has set out to create a robust and versatile NLP tool for the Polish language. The primary objective of our research is to develop a tool that can perform a range of language processing tasks in this language, such as sentiment analysis, text classification, text correction and text interpretation. Our team has been working diligently to create a tool that is accurate, reliable, and adaptable to the specific linguistic features of Polish, and that can provide valuable insights for a wide range of marketers' needs. In addition to the Polish language version, we are also developing an English version of the tool, which will enable us to expand the reach and impact of our research to a wider audience. Another area of focus in our research involves tackling the challenge of the limited availability of linguistically diverse corpora for non-English languages, which presents a significant barrier in the development of NLP applications. One approach we have been pursuing is the translation of existing English corpora, which would enable us to use the wealth of linguistic resources available in English for other languages. Furthermore, we are looking into other methods, such as gathering language samples from social media platforms. 
By analyzing the language used in social media posts, we can collect a wide range of data that reflects the unique linguistic characteristics of specific regions and communities, which can then be used to enhance the accuracy and performance of NLP algorithms for non-English languages. In doing so, we hope to broaden the scope and capabilities of NLP applications. Our research focuses on several key NLP techniques including sentiment analysis, text classification, text interpretation and text correction. To ensure that we can achieve the best possible performance for these techniques, we are evaluating and comparing different approaches and strategies for implementing them. We are exploring a range of different methods, including transformers and convolutional neural networks (CNNs), to determine which ones are most effective for different types of NLP tasks. By analyzing the strengths and weaknesses of each approach, we can identify the most effective techniques for specific use cases, and further enhance the performance of our tool. Our research aims to create a tool, which can provide a comprehensive analysis of advertising effectiveness, allowing marketers to identify areas for improvement and optimize their advertising strategies. The results of this study suggest that a smart tool for advertisement analysis can provide valuable insights for businesses seeking to create effective advertising campaigns.

Keywords: NLP, AI, IT, language, marketing, analysis

Procedia PDF Downloads 56

2528 The Meaning System of Tense: A Systemic Functional Approach

Authors: Cunyu Zhang

Abstract:

A review of the literature on tense reveals disagreements on the definition and existence of Chinese tense. Influenced by research on English which regards tense as a grammatical category based on the verbal inflections of English, some Chinese researchers claim that there is no tense in Chinese because no verbal inflections are involved. Meanwhile, other Chinese researchers hold that Chinese still has tense although its verbs are non-inflectional, based on the fact that Chinese lexical expressions can imply temporal meaning. We assume that these disagreements about Chinese tense arise because all the previous studies view language “from below”, which means that expressions of tense are the core part of these studies. However, there are about 6,000 languages with distinct expressions all over the world. Hence, if language studies concentrate only on expressions, it becomes more difficult to understand the nature of language. By contrast, the functions of languages are similar; otherwise, human beings could not communicate with each other. Therefore, we believe that it is necessary to undertake a theoretical study of Chinese tense within the framework of SFL, which holds that language is a system where meaning is the core part while form is just the realization of meaning. In addition, SFL is a general linguistic theory providing a universal framework for languages all over the world. Therefore, based on Systemic Functional Linguistics, the paper first redefines tense as a deictic semantic category for describing the speaker’s temporal location of processes and relevant temporal relations. With reference to this definition, this study explores the meaning system of tense. It is proposed that tense expresses four kinds of meaning, namely interpersonal, experiential, logical and textual meanings. 
From the interpersonal angle, tense helps to exchange temporal information between the speaker and the listener, and the temporal information refers to the anchoring of a concerned process in the past, present or future by the speaker. From the experiential angle, tense plays a role in the temporal locating of material, mental, relational, existential, behavioral and verbal processes by the speaker. From the logical angle, tense denotes the temporal relations at the two levels of clause and clause complex, and such relations fall into simultaneity, anteriority and posteriority. From the textual angle, tense refers to the temporal relations at the level of text, and the temporal relations in question concern linear serial relations and synchronous serial relations.

Keywords: Chinese, meaning system, Systemic Functional Linguistics, tense

Procedia PDF Downloads 390

2527 Neural Machine Translation for Low-Resource African Languages: Benchmarking State-of-the-Art Transformer for Wolof

Authors: Cheikh Bamba Dione, Alla Lo, Elhadji Mamadou Nguer, Siley O. Ba

Abstract:

In this paper, we propose two neural machine translation (NMT) systems (French-to-Wolof and Wolof-to-French) based on sequence-to-sequence with attention and transformer architectures. We trained our models on a parallel French-Wolof corpus of about 83k sentence pairs. Because of the low-resource setting, we experimented with advanced methods for handling data sparsity, including subword segmentation, back translation, and the copied corpus method. We evaluate the models using the BLEU score and find that transformer outperforms the classic seq2seq model in all settings, in addition to being less sensitive to noise. In general, the best scores are achieved when training the models on word-level-based units. For subword-level models, using back translation proves to be slightly beneficial in low-resource (WO) to high-resource (FR) language translation for the transformer (but not for the seq2seq) models. A slight improvement can also be observed when injecting copied monolingual text in the target language. Moreover, combining the copied method data with back translation leads to a substantial improvement of the translation quality.
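The two data-augmentation ideas named in the abstract, the copied corpus method and back translation, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the placeholder sentences, and the stand-in `fake_fr_to_wo` translator are all hypothetical, and the sketch assumes the target language is French.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def add_copied_corpus(parallel: List[Pair], mono_target: List[str]) -> List[Pair]:
    """Copied corpus method: each monolingual target sentence is paired with
    an identical copy of itself on the source side."""
    return parallel + [(sentence, sentence) for sentence in mono_target]


def add_back_translations(
    parallel: List[Pair],
    mono_target: List[str],
    reverse_translate: Callable[[str], str],
) -> List[Pair]:
    """Back translation: a target-to-source model produces a synthetic source
    sentence for each real monolingual target sentence."""
    return parallel + [(reverse_translate(s), s) for s in mono_target]


# Toy data: placeholder strings, not real Wolof or French sentences.
parallel = [("wo_sentence_1", "fr_sentence_1")]
mono_fr = ["fr_sentence_2", "fr_sentence_3"]


def fake_fr_to_wo(sentence: str) -> str:
    """Hypothetical stand-in for a trained French-to-Wolof model."""
    return sentence.replace("fr_", "wo_")


# Combine both methods, as in the abstract's best-performing setting.
augmented = add_back_translations(
    add_copied_corpus(parallel, mono_fr), mono_fr, fake_fr_to_wo
)
print(len(augmented))
```

In practice the augmented pairs would be shuffled with the genuine parallel data before training; the mixing ratio is a tuning choice the abstract does not specify.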

Keywords: backtranslation, low-resource language, neural machine translation, sequence-to-sequence, transformer, Wolof

Procedia PDF Downloads 117

2526 When the Rubber Hits the Road: The Enactment of Well-Intentioned Language Policy in Digital vs. In Situ Spaces on Washington, DC Public Transportation

Authors: Austin Vander Wel, Katherin Vargas Henao

Abstract:

Washington, DC, is a city in which Spanish, along with several other minority languages, is prevalent not only among tourists but also those living within city limits. In response to this linguistic diversity and DC’s adoption of the Language Access Act in 2004, the Washington Metropolitan Area Transit Authority (WMATA) committed to addressing the need for equal linguistic representation and established a five-step plan to provide the best multilingual information possible for public transportation users. The current study, however, strongly suggests that this de jure policy does not align with the reality of Spanish’s representation on DC public transportation–although perhaps doing so in an unexpected way. In order to investigate Spanish’s de facto representation and how it contrasts with de jure policy, this study implements a linguistic landscapes methodology that takes critical language-policy as its theoretical framework (Tollefson, 2005). Specifically concerning de facto representation, it focuses on the discrepancies between digital spaces and the actual physical spaces through which users travel. These digital vs. in situ conditions are further analyzed by separately addressing aural and visual modalities. In digital spaces, data was collected from WMATA’s website (visual) and their bilingual hotline (aural). For in situ spaces, both bus and metro areas of DC public transportation were explored, with signs comprising the visual modality and recordings, driver announcements, and interactions with metro kiosk workers comprising the aural modality. While digital spaces were considered to successfully fulfill WMATA’s commitment to representing Spanish as outlined in the de jure policy, physical spaces show a large discrepancy between what is said and what is done, particularly regarding the bus system, in addition to the aural modality overall. 
These discrepancies in in situ spaces place Spanish speakers at a clear disadvantage, demanding additional resources and knowledge from residents with limited or no English proficiency if they are to have equal access to this public good. Based on our critical language-policy analysis, while Spanish is represented as a right in the de jure policy, its implementation in situ clearly portrays Spanish as a problem, since those seeking bilingual information cannot expect it to be present when and where they need it most (Ruíz, 1984; Tollefson, 2005). This study concludes with practical, data-based steps to improve the current situation of DC’s public transportation and serves as a model for responding to inadequate enactment of de jure policy in other language policy settings.

Keywords: urban landscape, language access, critical language policy, Spanish, public transportation

Procedia PDF Downloads 50

2525 Thinking for Writing: Evidence of Language Transfer in Chinese ESL Learners’ Written Narratives

Authors: Nan Yang, Hye Pae

Abstract:

English as a second language (ESL) learners are often observed to transfer traits of their first language (L1), and their habits of using it, to their use of English (second language, L2); this phenomenon is known as language transfer. In addition to the transfer of linguistic features (e.g., grammar and vocabulary), which is relatively easy to observe and quantify, many cross-cultural theorists have emphasized a much subtler and more fundamental transfer at a higher conceptual level, referred to as conceptual transfer. Although a growing body of literature in linguistics has demonstrated evidence of L1 transfer in various discourse genres, very few studies address the underlying conceptual transfer that accompanies language transfer, especially in extended forms of spontaneous discourse such as personal narrative. To address this issue, this study situates itself in the context of Chinese ESL learners’ written narratives, examines evidence of L1 conceptual transfer in comparison with native English speakers’ narratives, and discusses the findings from the perspective of conceptual transfer. It is hypothesized that Chinese ESL learners’ English narrative strategies are heavily influenced by the strategies they use in Chinese as a result of conceptual transfer. Understanding language transfer cognitively is of great significance in the realm of second language acquisition (SLA), as it helps address challenges that ESL learners around the world are facing, allows native English speakers to develop a better understanding of how and why learners’ English is different, and sheds light on ESL pedagogy by clarifying linguistic and cultural expectations in native English-speaking countries.
To achieve these goals, 40 college students (20 Chinese ESL learners and 20 native English speakers) were recruited in the United States, and their written narratives on the prompt 'The most frightening experience' were collected for quantitative discourse analysis. Forty written narratives (20 in Chinese and 20 in English) were collected from the Chinese ESL learners, and 20 written narratives were collected from the native English speakers. All written narratives were coded according to a coding scheme developed by the authors prior to data collection. Descriptive statistical analyses were conducted, and the preliminary results revealed that native English speakers included more narrative elements, such as events and explicit evaluation, than Chinese ESL students did in either their English or their Chinese writing; the English group also used more evaluation devices (i.e., physical state expressions, indirectly reported speech, delineation) than Chinese ESL students did in either language. It was also observed that Chinese ESL students included more orientation elements (i.e., the introduction of time/place and the introduction of characters) in their Chinese and English writing than the native English-speaking participants. The findings suggest that Chinese ESL learners used a similar narrative strategy in their Chinese and English narratives, which is taken as evidence of conceptual transfer from Chinese (L1) to English (L2). The results also indicate that Chinese ESL learners and native English speakers used distinct narrative strategies as a result of cross-cultural differences.

Keywords: Chinese ESL learners, language transfer, thinking-for-speaking, written narratives

Procedia PDF Downloads 95

2524 The Role of Reading Self-Efficacy and Perception of Difficulty in English Reading among Chinese ESL Learners

Authors: Kevin Chan, Kevin K. H. Chung, Patcy P. S. Yeung, H. L. Ip, Bill T. C. Chung, Karen M. K. Chung

Abstract:

Purpose: Recent evidence shows that reading self-efficacy and students’ perceived difficulty in reading are significantly associated with word reading and reading fluency. However, little is known about these relationships among students learning to read English as a second language, particularly Chinese students. This study examined the contributions of reading self-efficacy, perception of difficulty in reading, and cognitive-linguistic skills to performance on English word reading and reading fluency in Chinese students. Method: A sample of 122 second- and third-grade students in Hong Kong, China, participated in this study. Students completed measures of reading self-efficacy and perception of difficulty in reading. They were assessed on their English cognitive-linguistic and reading skills: rapid automatized naming, nonword reading, phonological awareness, word reading, and one-minute word reading. Results: Path analysis indicated that, when students’ grades were controlled, reading self-efficacy was a significant correlate of word reading and reading fluency, whereas perception of difficulty in reading negatively predicted word reading. Conclusion: These findings underscore the importance of taking students’ reading self-efficacy, perception of difficulty in reading, and cognitive-linguistic skills into consideration when designing reading interventions and instruction for students learning English as a second language.

Keywords: self-efficacy, perception of difficulty in reading, English as a second language, word reading

Procedia PDF Downloads 168

2523 Simo-syl: A Computer-Based Tool to Identify Language Fragilities in Italian Pre-Schoolers

Authors: Marinella Majorano, Rachele Ferrari, Tamara Bastianello

Abstract:

Recent technological advances make it possible to use innovative, multimedia, screen-based assessment tools to test children's language and early literacy skills, monitor their growth over the preschool years, and test their readiness for primary school. Computer-based assessment tools offer several advantages over paper-based tools. Firstly, computer-based tools that use games, videos, and audio may be more motivating and engaging for children, especially for those with language difficulties. Secondly, computer-based assessments are generally less time-consuming than traditional paper-based assessments: this makes them less demanding for children and gives clinicians, researchers, and teachers the opportunity to test children multiple times over the same school year and, thus, to monitor their language growth more systematically. Finally, while paper-based tools require offline coding, computer-based tools can sometimes calculate scores automatically, thus producing less subjective evaluations of the assessed skills and providing immediate feedback. Nonetheless, using computer-based assessment tools to test meta-phonological and language skills in children is not yet common practice in Italy. The present contribution aims to estimate the internal consistency of a computer-based assessment (i.e., the Simo-syl assessment). Sixty-three Italian pre-schoolers aged between 4;10 and 5;9 years were tested at the beginning of the last year of preschool with paper-based standardised tools on their lexical (Peabody Picture Vocabulary Test), morpho-syntactical (Grammar Repetition Test for Children), meta-phonological (Meta-Phonological skills Evaluation test), and phono-articulatory skills (non-word repetition). The same children were tested with the Simo-syl assessment on their phonological and meta-phonological skills (e.g., recognising syllables and vowels and reading syllables and words).
The internal consistency of the computer-based tool was acceptable (Cronbach's alpha = .799). Children's scores on the paper-based assessment were correlated with their scores on each task of the computer-based assessment. Significant positive correlations emerged between all the tasks of the computer-based assessment and the scores obtained in the CMF (r = .287 - .311, p < .05) and in the correct sentences in the RCGB (r = .360 - .481, p < .01); the standardised non-word repetition test correlated significantly with the reading tasks only (r = .329 - .350, p < .05). Further tasks should be included in the current version of Simo-syl to support a comprehensive and multi-dimensional approach to assessing children. Nevertheless, such a tool gives teachers a good opportunity to identify language-related problems early, even in the school environment.
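For readers unfamiliar with the internal-consistency statistic reported above, the following is a minimal sketch of how Cronbach's alpha is commonly computed: alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items (here, tasks). The score matrix is invented toy data, not data from the study, and the function name is hypothetical.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a score matrix with one row per child and one
    column per task. Sample variances (n - 1 denominator) are used throughout."""
    n_children = len(scores)
    k = len(scores[0])
    if n_children < 2 or k < 2:
        raise ValueError("need at least two children and two tasks")
    # Variance of each task across children.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each child's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data (assumption): four children, three tasks.
toy_scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 5, 4],
]
print(round(cronbach_alpha(toy_scores), 3))
```

Values closer to 1 indicate that the tasks vary together across children; the threshold for an "acceptable" alpha is a convention rather than something this sketch decides.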

Keywords: assessment, computer-based, early identification, language-related skills

Procedia PDF Downloads 153

2522 Passive Voice in SLA: Armenian Learners’ Case Study

Authors: Emma Nemishalyan

Abstract:

It is believed that learners’ mother tongue (L1 hereafter) has a huge impact on their acquisition of a second language (L2 hereafter). This hypothesis has attracted both support and criticism. Research on a wide range of learner corpora (Chinese, Japanese, and Spanish, among others) has variously confirmed or refuted it. However, no such study has been conducted on Armenian learners. The aim of this paper is to test the hypothesis on an Armenian learner corpus with respect to the use of the passive voice. To this end, Contrastive Interlanguage Analysis (hereafter CIA) was applied to a native speaker corpus, the Louvain Corpus of Native English Essays (LOCNESS), and an Armenian learner corpus compiled by the author in compliance with International Corpus of Learner English (ICLE) guidelines. CIA compares interlanguage (the language produced by learners) with the language produced by native speakers. This method makes it possible not only to highlight the mistakes that learners make, but also to reveal underuse and overuse. The passive voice was chosen because Armenian and English are typologically very different, as they belong to different branches. Moreover, the passive voice is considered one of the most problematic grammar topics for learners of English to acquire. Based on this difference, we hypothesized that Armenian learners would either overuse or underuse some types of the passive voice. Using the Lancsbox software, we first identified the frequency of passive voice usage in LOCNESS and in the Armenian learner corpus to determine whether the learners follow the same usage pattern as the native speakers. Secondly, we identified the types of passive voice used by the Armenian learners and sought to trace the reasons to their mother tongue.
The results of the study showed that Armenian learners underused the passive voice compared with native speakers. Furthermore, the results supported the hypothesis that learners’ L1 has an impact on their L2 acquisition and production.

Keywords: corpus linguistics, applied linguistics, second language acquisition, corpus compilation

Procedia PDF Downloads 67

2521 Implementing a Database from a Requirement Specification

Authors: M. Omer, D. Wilson

Abstract:

Creating a database schema is essentially a manual process. The information contained in a requirement specification has to be analyzed and reduced to a set of tables, attributes and relationships. This is a time-consuming process that has to go through several stages before an acceptable database schema is achieved. The purpose of this paper is to implement a Natural Language Processing (NLP) based tool that produces a relational database schema from a requirement specification. Stanford CoreNLP version 3.3.1 and the Java programming language were used to implement the proposed model. The outcome of this study indicates that the first draft of a relational database schema can be extracted from a requirement specification by using NLP tools and techniques with minimum user intervention. Therefore, this method is a step forward in finding a solution that requires little or no user intervention.

Keywords: information extraction, natural language processing, relation extraction

Procedia PDF Downloads 235

2520 Unmasking Theatrical Language: Exploring Ideological Connections in American Theater

Authors: Gizem Barreto Martins

Abstract:

This paper explores the subversive potential inherent in the theatrical language employed within Arthur Miller's The Crucible. The research argues that this play intricately weaves ideological connections with its audience and the historical epoch it represents, effectively serving as a channel for ideological and cultural interaction potentially exerting subversive influences on social and political realms. Using a historical-materialist methodology that situates the play within its historical and political context, all while examining its connections with theater and literary theories, the paper raises a fundamental query: How does this dramatic work embody subversion, presenting a style unburdened by the performative conventions of daily life and prevailing codes and systems of representation? In response to this inquiry, the study asserts that theatrical language has the capacity to function as a subversive catalyst against prevailing ideologies, actively contributing to the process of social transformation. To substantiate this claim, the research conducts a detailed analysis of the selected play, employing the semiotic framework pioneered by Gilles Deleuze and Felix Guattari.

Keywords: Arthur Miller, The Crucible, Gilles Deleuze, Felix Guattari, theater and literary theories

Procedia PDF Downloads 38

2519 Teacher Training for Bilingual Education of Deaf Students in Brazil

Authors: Mara Aparecida De Castilho Lopes, Maria Eliza Mattosinho Bernardes

Abstract:

The education of deaf individuals in Brazil is grounded in the bilingual approach, which presupposes Brazilian Sign Language (Libras) as the first language for these students. In this perspective, Portuguese should be taught as a second language in its written form, ensuring that deaf students also have access to various academic subjects in sign language. Brazilian legislation (Federal Decree No. 5626 of 2005) mandates the teaching of Brazilian Sign Language in university teacher training programs, but there is no pre-established minimum workload. As a result, there is a significant disparity in the teaching and quality of teacher education across the Brazilian territory. Added to this fact is the general lack of awareness within society regarding the linguistic status of Libras, leading to a shortage of competent teachers for its use and instruction, particularly in higher education. Recently, Federal Law No. 14191 of 2021 established bilingual education for the deaf as a mode of instruction, indicating the need for adjustments in teacher training within higher education teacher preparation programs. Given this context, the objective of the present study was to analyze the teaching proposals for Brazilian Sign Language for students in teacher training programs at public universities in Brazil, presenting alternatives to overcome the current models and academic pathways of teaching and learning. In addition to analyzing Brazilian teaching models, an analysis of a continuing education model for teachers in a French institution was also conducted - considering the historical Franco-Brazilian path of deaf education in Brazil. 
The analysis of the current teacher training model for deaf education in Brazil revealed that initial exposure to sign language and its linguistic structure is not sufficient to provide future teachers with opportunities to reflect on bilingual teaching methods and practices, as seen in other definitions of bilingualism - bilingual education for proficient listeners in two oral languages. As a result, a training proposal was developed for an experimental interdisciplinary course, integrating the curriculum of an initial and continuing teacher training program alongside the Alfredo Bossi Chair at the University of São Paulo. This proposal is structured into three disciplines, which constitute consecutive moments in teacher education: Fundamental Aspects of Brazilian Sign Language, Bilingual Teaching Methodology, and Teaching Investigation Project - interdisciplinary engagement in the field of deafness. The last offered discipline represents an interdisciplinary supervised internship proposal, considering the multi-professional context that constitutes deaf education within a bilingual approach. In interdisciplinary work within the field of deafness, dialogue between teachers and other professionals who work with deaf students from different perspectives - teachers, speech therapists, and sign language interpreters - is frequently necessary. Through alternative avenues, these actions aim to direct the linguistic development of deaf students within their learning processes. Based on the innovative curriculum proposal described here, the intention is to contribute to the enhancement of teacher education in Brazil, with the goal of ensuring bilingual education for deaf students.

Keywords: bilingual education, teacher training, historical-cultural approach, interdisciplinary education, inclusive education

Procedia PDF Downloads 61

2518 Multilingualism in Medieval Romance: A French Case Study

Authors: Brindusa Grigoriu

Abstract:

Inscribing itself in the field of the history of multilingual communities, with a focus on the evolution of language didactics, our paper aims to provide a pragmatic-interactional approach to a corpus that offers scholars of the international scientific community a relevant text of early modern European literature: the first romance in French, The Conte of Floire and Blanchefleur by Robert d’Orbigny (1150). The multicultural context described by the romance is one in which an Arabic-speaking prince, Floire, and his Francophone protégée, Blanchefleur, learn Latin together at the court of Spain and become fluent enough to turn it into the language of their love. This learning process is made up of interactional patterns of affective relevance, in which the proficiency of the protagonists in the domain of emotive acts becomes a matter of linguistic and pragmatic emulation. From five to ten years old, the pupils are efficiently stimulated by their teacher of Latin, Gaidon – a Moorish scholar of the royal entourage – to cultivate their competencies of oral expression and reading comprehension (of Antiquity classics), while enjoying an ever greater freedom of written expression, including the composition of love poems in this second language of culture and emotional education. Another relevant parameter of the educational process at court is that Latin shares its prominent role as a language of culture with French, whose exemplary learner is the (Moorish) queen herself. Indeed, the adult 'First lady' strives to become a pupil benefitting from lifelong learning provided by a fortuitous slave-teacher with little training, her anonymous chambermaid and Blanchefleur’s mother, who, despite her status as a war trophy, enjoys her Majesty’s confidence as a cultural agent of change in linguistic and theological fields.
Thus, the two foreign languages taught at Spain’s court, Latin and French, as opposed to Arabic, suggest a spiritual authority allowing the mutual enrichment of intercultural pioneers of cross-linguistic communication, in the aftermath of religious wars. Durably, and significantly, if not everlastingly, the language of physical violence rooted in intra-cultural solipsism is replaced by two Romance languages which seem to embody, together and yet distinctly, the parlance of peace-making.

Keywords: multilingualism, history of European language learning, French and Latin learners, multicultural context of medieval romance

Procedia PDF Downloads 115

2517 Effect of Distance Education Students Motivation with the Turkish Language and Literature Course

Authors: Meva Apaydin, Fatih Apaydin

Abstract:

Education plays a major role in the development of society. Teaching and training are as old as history, and many methods and techniques have been applied and revised over time with the aim of raising the level of learning. In recent years, technology has been used alongside traditional teaching methods. With the introduction of the internet into education, some problems that could not previously be solved have been addressed, and it has become clear that learners can be educated with contemporary methods as well as traditional ones. As a product of technological developments, distance education is a system that allows students to be educated individually wherever and whenever they like, without the need for a physical school environment. Distance education has become prevalent because of physical inadequacies in educational institutions; as a result, disadvantages such as social complexities, individual differences and especially geographical distance disappear. Moreover, rapid feedback between teachers and learners, improved student motivation owing to the absence of time limits, low cost, and objective measurement and evaluation come to the fore. Although distance education offers teaching benefits, it also has limitations. Some of the most important are that problems which are likely to arise may not be solved in time, that the lack of eye contact between teacher and learner makes trustworthy feedback hard to obtain, and that inadequate technological infrastructure causes difficulties. Courses are conducted via distance education in many university departments in our country.
In recent years, it has become increasingly common for courses such as Turkish Language, English, and History to be offered in this way in the first year of university programs. This study examines the delivery of the Turkish Language course through an internet-based distance education system by analyzing the advantages and disadvantages of that system.

Keywords: distance education, Turkish language, motivation, benefits

Procedia PDF Downloads 412

2516 Gender Differences in Communication Styles: An Analysis of the Language of Earnings Conference Calls

Authors: Chiara De Amicis, Sonia Falconieri, Mesut Tastan

Abstract:

In this study, we analyze the language employed by Chief Executive Officers (CEOs) and Chief Financial Officers (CFOs) during earnings conference calls from a gender perspective. We find evidence that conference calls held by female CEOs and/or CFOs exhibit a higher level of optimism than conference calls held by male CEOs and/or CFOs. Moreover, female managers tend to present and discuss firm performance with less vagueness than their male colleagues. We then observe the market reaction around each earnings conference call: while manager optimism is perceived as a good signal by investors, manager vagueness significantly dampens the market reaction around the call. Whether the gender of the CEO and/or the CFO delivering the conference call affects investors’ perceptions of firm performance is still an open question. Some evidence shows that the language employed by female managers conveys more valuable information for market participants than the language employed by their male counterparts. This study contributes to a growing literature in finance and accounting that uses textual analysis to assess the informativeness of corporate disclosure. To our knowledge, this is the first paper that aims to answer the question of whether the gender of a firm’s top managers matters when assessing the informativeness of corporate spoken communication. We believe that our results will be of relevance for future research in the field. Moreover, our evidence may inform the debate on whether greater participation by women in the management of companies should be encouraged.

Keywords: conference calls, event study, gender, market reaction, textual analysis

Procedia PDF Downloads 165

2515 Ambiguity-Identification Prompting for Large Language Model to Better Understand Complex Legal Texts

Authors: Haixu Yu, Wenhui Cao

Abstract:

Tailoring Large Language Models (LLMs) to perform legal reasoning has been a popular trend in the study of AI and law. Researchers have mainly employed two methods to unlock the potential of LLMs: finetuning the LLMs to expand their knowledge of law, and restructuring the prompts (In-Context Learning) to optimize the LLMs’ understanding of legal questions. Although finetuning and revised prompting are claimed to make LLMs more competent in legal reasoning, most state-of-the-art studies show quite limited practical improvement. In this paper, drawing on the study of the complexity and low interpretability of legal texts, we propose a prompting strategy based on the Chain of Thought (CoT) method. Instead of merely instructing the LLM to reason “step by step”, the prompting strategy requires the tested LLM first to identify the ambiguity in the question and then to generate corresponding answers in line with the different understandings of the identified terms. The proposed prompting strategy attempts to encourage LLMs to "interpret" the given text from various aspects. Experiments that require general LLMs such as GPT 4 and legal LLMs such as LawGPT to answer “case analysis” questions from the bar examination show that the prompting strategy can improve LLMs’ ability to understand complex legal texts.
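The two-step structure described in the abstract (identify ambiguities first, then answer under each reading) could be expressed as a prompt template along the following lines. The wording of the template and the function name are assumptions made for illustration; the abstract does not give the authors' actual prompt, and no LLM is called here.

```python
def build_ambiguity_prompt(question: str) -> str:
    """Build a two-step prompt: ambiguity identification, then one answer
    per reading. The template wording is hypothetical."""
    return (
        "Step 1: List every term or phrase in the question below that is "
        "ambiguous, and state the possible readings of each.\n"
        "Step 2: For each reading identified in Step 1, give the answer "
        "that follows from it.\n\n"
        f"Question: {question}"
    )


# Example with an invented placeholder question.
print(build_ambiguity_prompt("Is the contract described in the case valid?"))
```

The resulting string would be sent to whichever model is under test; how the answers for different readings are then scored is outside the scope of this sketch.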

Keywords: ambiguity-identification, prompt, large language model, legal text understanding

Procedia PDF Downloads 29

2514 Language Development and Growing Spanning Trees in Children Semantic Network

Authors: Somayeh Sadat Hashemi Kamangar, Fatemeh Bakouie, Shahriar Gharibzadeh

Abstract:

In this study, we aim to exploit Maximum Spanning Trees (MSTs) of children's semantic networks to investigate their language development. To do so, we examine the graph-theoretic properties of word-embedding networks. In these networks, the nodes are words that children normatively acquire before the age of 30 months, and the links are built from the cosine similarity of the word vectors. The networks are weighted graphs, and the strength of each link is determined by the numerical similarity of the two words (nodes) it connects. Constructing MSTs offers a way to avoid converting the weighted networks into binary ones by setting a threshold. An MST is a sub-graph (unique when all link weights are distinct) that connects all the nodes in such a way that the sum of the link weights is maximized without forming cycles. As the backbone of the semantic networks, MSTs are suitable for examining developmental changes in semantic network topology in children. From these trees, several parameters were calculated to characterize the developmental change in network organization. We showed that MSTs provide an elegant method that is sensitive enough to capture subtle developmental changes in semantic network organization.
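The construction described above can be sketched as follows: link weights are cosine similarities between word vectors, and a maximum spanning tree is extracted with Kruskal's algorithm run on edges sorted from heaviest to lightest. The abstract does not say which algorithm the authors used, so Kruskal is an assumption, and the two-dimensional word vectors below are invented toy values rather than real embeddings.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[str, str, float]  # (word_a, word_b, similarity weight)


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def maximum_spanning_tree(nodes: Sequence[str], edges: Sequence[Edge]) -> List[Edge]:
    """Kruskal's algorithm on edges sorted by descending weight."""
    parent: Dict[str, str] = {n: n for n in nodes}

    def find(n: str) -> str:
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree: List[Edge] = []
    for a, b, w in sorted(edges, key=lambda e: e[2], reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # adding this edge does not create a cycle
            parent[root_a] = root_b
            tree.append((a, b, w))
    return tree


# Toy embeddings (assumption): four words with made-up 2-D vectors.
vectors = {
    "dog": [1.0, 0.1],
    "cat": [0.9, 0.2],
    "milk": [0.1, 1.0],
    "cup": [0.2, 0.9],
}
full_graph = [
    (a, b, cosine_similarity(vectors[a], vectors[b]))
    for a, b in combinations(vectors, 2)
]
backbone = maximum_spanning_tree(list(vectors), full_graph)
print(backbone)
```

For a connected graph with n nodes the result always has n − 1 links, so no similarity threshold has to be chosen; when several links share the same weight, more than one maximum spanning tree may exist and the one returned depends on sort order.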

Keywords: maximum spanning trees, word-embedding, semantic networks, language development

Procedia PDF Downloads 111

2513 Modeling Average Paths Traveled by Ferry Vessels Using AIS Data

Authors: Devin Simmons

Abstract:

At the USDOT’s Bureau of Transportation Statistics, a biannual census of ferry operators in the U.S. is conducted, with results such as route mileage used to determine federal funding levels for operators. Automatic Identification System (AIS) data makes it possible to use GIS software and geographical methods to confirm operator-reported mileage for individual ferry routes. As part of the USDOT’s work on the ferry census, an algorithm was developed that uses AIS data for ferry vessels in conjunction with known ferry terminal locations to model the average route travelled, for use both as a cartographic product and as confirmation of operator-reported mileage. AIS data from each vessel is first analyzed to determine individual journeys based on the vessel’s velocity and changes in velocity over time. These trips are then converted to geographic linestring objects. Using the terminal locations, the algorithm then determines whether each trip represents a known ferry route. Given a large enough dataset, each route will be represented by multiple trip linestrings, which are then filtered by DBSCAN spatial clustering to remove outliers. The remaining trips are then averaged into one route. The algorithm interpolates the point on each trip linestring that represents the start point. From these start points, a centroid is calculated, and the first point of the average route is determined. Each trip is interpolated again to find the point that represents one percent of the journey’s completion, and the centroid of those points is used as the next point in the average route, and so on until 100 points have been calculated. Routes created using this algorithm have shown demonstrable improvement over previous methods, which included the implementation of a LOESS model. Additionally, the algorithm greatly reduces the amount of manual digitizing needed to visualize ferry activity.
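
The averaging step described in this abstract might be sketched as follows. This is not the author's implementation: it assumes trips are lists of planar (projected) x, y coordinates rather than latitude and longitude, that all trips run in the same direction, and that outliers have already been removed. The function names and the choice to include the end point among the sampled fractions are also assumptions.

```python
import math


def interpolate(line, fraction):
    """Return the point lying at `fraction` (0 to 1) of the polyline's length."""
    segments = list(zip(line, line[1:]))
    lengths = [math.dist(p, q) for p, q in segments]
    target = fraction * sum(lengths)
    for (p, q), length in zip(segments, lengths):
        if length > 0 and target <= length:
            t = target / length
            return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))
        target -= length
    return line[-1]  # reached only through floating-point rounding at the end


def average_route(trips, n_points=100):
    """Average several trip linestrings into one route.

    For each of `n_points` evenly spaced fractions of journey completion,
    interpolate that fraction on every trip and take the centroid of the
    resulting points.
    """
    route = []
    for i in range(n_points):
        fraction = i / (n_points - 1)
        points = [interpolate(trip, fraction) for trip in trips]
        centroid_x = sum(x for x, _ in points) / len(points)
        centroid_y = sum(y for _, y in points) / len(points)
        route.append((centroid_x, centroid_y))
    return route


# Two hypothetical parallel trips; their average runs midway between them.
trips = [
    [(0.0, 0.0), (10.0, 0.0)],
    [(0.0, 2.0), (4.0, 2.0), (10.0, 2.0)],
]
route = average_route(trips, n_points=11)
```

Sampling by fraction of completed distance rather than by timestamp means trips recorded at different speeds or AIS reporting rates still contribute comparable points.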

Keywords: ferry vessels, transportation, modeling, AIS data

Procedia PDF Downloads 137

2512 Unsupervised Part-of-Speech Tagging for Amharic Using K-Means Clustering

Authors: Zelalem Fantahun

Abstract:

Part-of-speech tagging is the process of assigning a part-of-speech or other lexical class marker to each word in naturally occurring text. It is one of the most fundamental tasks in almost all natural language processing. In natural language processing, the need to provide large amounts of manually annotated data is a knowledge acquisition bottleneck. Since Amharic is an under-resourced language, the limited availability of tagged corpora is a bottleneck for natural language processing, especially for POS tagging. A promising direction for tackling this problem is to provide a system that does not require manually tagged data. In unsupervised learning, the learner is not provided with classifications. Unsupervised algorithms seek out similarity between pieces of data in order to determine whether they can be characterized as forming a group. This paper describes the development of an unsupervised part-of-speech tagger for Amharic using K-Means clustering, motivated by the large amount of data produced in day-to-day activities. The tagger is developed in four phases. First, the unlabeled data (raw text) is divided into 10 folds and tokenized; at this level, the raw text is split into sentences and then into words. The second phase is feature extraction, which covers word frequency and the syntactic and morphological features of a word. The third phase is clustering: among the different clustering algorithms, K-means is selected and implemented in this study to bring groups of similar words together. The fourth phase is mapping, in which each cluster is examined carefully and the most common tag is assigned to the group. This study identifies two features that are capable of distinguishing one part-of-speech from others, namely morphological features and positional information, and shows that it is possible to use unsupervised learning for Amharic POS tagging.
To increase the performance of the unsupervised part-of-speech tagger, other features not included in this study, such as semantic information, need to be incorporated. Finally, based on the experimental results, the system achieves a maximum accuracy of 81%.
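
The clustering and mapping phases described in this abstract could look roughly like the sketch below. Everything concrete in it is an assumption for illustration: the tokens `w1` to `w6` are placeholders rather than Amharic words, the two numeric features stand in for a morphological and a positional feature, the initial centroids are simply the first `k` points, and the mapping uses a few hand-assigned reference tags to pick the most common tag per cluster.

```python
import math
from collections import Counter


def kmeans(points, k, iterations=20):
    """Plain K-means (Lloyd's algorithm); returns a cluster index per point.

    Simplification: the first k points serve as the initial centroids.
    """
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def label_clusters(tokens, assignment, reference_tags):
    """Mapping phase: give each cluster the most common tag among its
    tokens that have a reference tag."""
    counts = {}
    for token, cluster in zip(tokens, assignment):
        if token in reference_tags:
            counts.setdefault(cluster, Counter())[reference_tags[token]] += 1
    return {cluster: c.most_common(1)[0][0] for cluster, c in counts.items()}


# Hypothetical features: (morphological cue, relative position in sentence).
features = {
    "w1": (1.0, 0.9), "w2": (0.0, 0.1), "w3": (1.0, 0.8),
    "w4": (0.0, 0.2), "w5": (0.9, 1.0), "w6": (0.1, 0.0),
}
tokens = list(features)
assignment = kmeans([features[t] for t in tokens], k=2)

# A few tags assigned by inspection (hypothetical), used only for mapping.
reference_tags = {"w1": "V", "w3": "V", "w2": "N", "w4": "N", "w6": "ADJ"}
cluster_tag = label_clusters(tokens, assignment, reference_tags)
predicted = {t: cluster_tag[a] for t, a in zip(tokens, assignment)}
```

Because every token in a cluster inherits the cluster's majority tag, minority tags inside a cluster are mislabeled, which is one reason accuracy depends on how well the chosen features separate the parts of speech.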

Keywords: POS tagging, Amharic, unsupervised learning, k-means

Procedia PDF Downloads 416

2511 From Script to Film: The Fading Voice of the Screenwriter

Authors: Ana Sofia Torres Pereira

Abstract:

On January 15th, 2015, Peter Bart, editor in chief of Variety Magazine, published an article in that magazine posing the following question: “Are screenwriters becoming obsolete in Hollywood?” Is Hollywood losing its interest in well plotted, well written scripts crafted by professionals? That screenwriters have been undervalued, forgotten and left behind since the beginning of film is a well-known fact, but are they now on the brink of extinction? If fiction films are about people and stories and so, simply put, all about the script, what does it mean to say that the screenwriter is becoming obsolete? What will be the consequences of the possible death of the screenwriter for the cinema world? All of these questions lead us to an ultimate one: What is the true importance of a screenwriter? What can a screenwriter do that a director, for instance, can’t? How should a script be written and read in order not to become obsolete? And what about those countries, like Portugal, for example, in which the figure of the screenwriter is yet to be heard and known? How can screenwriters find their voice in a world driven by the tyrannical voice of the Director? In a demanding cinema world where the Director is considered the author of a film, it’s important to know where we can find the voice of the screenwriter, the true language of the screenplay and the importance this voice and specific language might have for the future of storytelling and of film. In a paper that admittedly poses more questions than answers, I will try to unveil the importance a screenplay might have in Hollywood, in Portugal and in the cinema and communication world in general.

Keywords: cinema, communication, director, language, screenplay, screenwriting, story

Procedia PDF Downloads 292