Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 14279

Search results for: natural language understanding

14249 Detecting Paraphrases in Arabic Text

Authors: Amal Alshahrani, Allan Ramsay

Abstract:

Paraphrasing, i.e. expressing the same concept in alternative ways by using different words or phrases, is one of the important tasks in natural language processing. Paraphrases can be used in many natural language applications, such as Information Retrieval, Machine Translation, Question Answering, Text Summarization, or Information Extraction. To obtain pairs of sentences that are paraphrases, we create a system that automatically extracts paraphrases from a corpus built from different sources of news articles, since these are likely to contain paraphrases when they report the same event on the same day. There are existing simple standard approaches (e.g. TF-IDF vector space, cosine similarity) and alignment techniques (e.g. Dynamic Time Warping (DTW)) for extracting paraphrases, which have been applied to English. However, the performance of these approaches could be affected when they are applied to another language, for instance Arabic, due to the presence of phenomena which are not present in English, such as free word order, zero copula, and pro-dropping. Thus, if we can analyse how the existing algorithms for English fail for Arabic, then we can find a solution for Arabic. The results are promising.
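The TF-IDF and cosine-similarity baseline named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' system: the toy English sentences, the whitespace tokenisation, and the smoothed IDF formula are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenised document.

    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1, which is an
    assumption of this sketch rather than something stated in the abstract.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        vectors.append({
            term: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy sentences: a and b share the same words in a different order.
sent_a = "the minister visited the city today".split()
sent_b = "today the minister visited the city".split()
sent_c = "prices rose sharply this week".split()

vec_a, vec_b, vec_c = tfidf_vectors([sent_a, sent_b, sent_c])
sim_ab = cosine(vec_a, vec_b)
sim_ac = cosine(vec_a, vec_c)
```

Because the vectors are bags of words, `sim_ab` ignores word order entirely, while `sim_ac` is zero because the two sentences share no terms.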

Keywords: natural language processing, TF-IDF, cosine similarity, dynamic time warping (DTW)

Procedia PDF Downloads 355

14248 From User's Requirements to UML Class Diagram

Authors: Zeineb Ben Azzouz, Wahiba Ben Abdessalem Karaa

Abstract:

The automated extraction of UML class diagrams from natural language requirements is a highly challenging task. Many approaches, frameworks and tools have been presented in this field. Nonetheless, experiments with these tools have shown that no single approach works best all the time. In this context, we propose a new accurate approach to facilitate the automatic mapping from textual requirements to a UML class diagram. Our new approach integrates the best properties of statistical Natural Language Processing (NLP) techniques to reduce ambiguity when analysing natural language requirements text. In addition, our approach follows the best practices defined by conceptual modelling experts to determine some patterns indispensable for the extraction of the basic elements and concepts of the class diagram. Once the relevant information for the class diagram is captured, an XMI document is generated and imported with a CASE tool to build the corresponding UML class diagram.

Keywords: class diagram, user’s requirements, XMI, software engineering

Procedia PDF Downloads 444

14247 Efficiency of a Semantic Approach in Teaching Foreign Languages

Authors: Genady Shlomper

Abstract:

During the process of language teaching, each teacher faces some general and some specific problems. Some of these problems are common to all languages because they follow from the rules of cognition, consciousness, perception, understanding and memory, and from the physiological and psychological principles that apply to all humans irrespective of origin and nationality. Still, every language is a distinctive system, possessing individual properties and an obvious identity, as a result of its development in specific natural, geographical, cultural and historical conditions. The individual properties emerge in the script, phonetics, morphology and syntax. All these problems can and should be the subject of detailed research and scientific analysis, mainly from practical considerations and language teaching requirements. There are some formidable obstacles in the language acquisition process. Among the first to be mentioned is the existence of concepts and entire categories in foreign languages which are absent in the language of the students. Such phenomena reflect specific ways of thinking and a world outlook that were shaped during the language's evolution. Hindi is the national language of India and belongs to the Indo-Iranian group of the Indo-European family of languages. The lecturer has gained experience in teaching Hindi to native speakers of Uzbek, Russian and Hebrew. He will show the difficulties in the fields of phonetics, morphology and syntax which students have to deal with during the acquisition of the language. In the proposed lecture, the lecturer will share his experience in making the process of language teaching more efficient by using a non-formal semantic approach.

Keywords: applied linguistics, foreign language teaching, language teaching methodology, semantics

Procedia PDF Downloads 329

14246 Technology Enriched Classroom for Intercultural Competence Building through Films

Authors: Tamara Matevosyan

Abstract:

In this globalized world, intercultural communication is becoming essential for understanding communication among people, for developing an understanding of cultures, and for appreciating the opportunities and challenges that each culture presents to people. Moreover, it plays an important role in developing an ideal personification to understand different behaviors in different cultures. Native speakers assimilate sociolinguistic knowledge in natural conditions, while this is a great problem for language learners; in this context, feature films reveal cultural peculiarities and involve students in real communication. Nowadays the key role of language learning is the development of intercultural competence, as communicating with someone from a different cultural background can be exciting and scary, frustrating and enlightening. Intercultural competence is important in the FL learning classroom, and here feature films can serve as essential tools to develop this competence and overcome the intercultural gap that foreign students face. The current proposal attempts to reveal the correlation of the given culture and language through feature films. To ensure qualified, well-organized and practical classes on Intercultural Communication for language learners, a number of methods connected with movie watching have been implemented. All the pre-watching, while-watching and post-watching methods and techniques are aimed at developing students’ communicative competence. The application of such activities as Climax, Role-play, Interactive Language and Daily Life helps to reveal and overcome mistakes of a cultural and pragmatic character. All the above-mentioned activities are directed at the assimilation of the language vocabulary with special reference to the given culture. The study delves into the essence of culture as one of the core concepts of intercultural communication.
Sometimes culture is not a priority in the process of language learning, which leads to further misunderstandings in real-life communication. The application of various methods and techniques with feature films aims at developing students’ cultural competence and their understanding of the norms and values of individual cultures. Thus, feature film activities will enable learners to enlarge their knowledge of the particular culture and develop a fundamental insight into intercultural communication.

Keywords: climax, intercultural competence, interactive language, role-play

Procedia PDF Downloads 315

14245 Gesture in the Arabic and Malay Languages: A Comparative Study

Authors: Siti Sara binti Hj Ahmad, Adil Elshiekh Abdalla

Abstract:

The Arabic and Malay languages belong to different language families: while Arabic descends from the Semitic family, Malay belongs to the Austronesian (Malayo-Polynesian) family. Hence, the grammatical systems of the two languages differ from each other. That Arabic is a language found in the heart of the desert, while Malay is a language found in the heart of thick equatorial forests, is another source of vital cultural differences. Consequently, it is expected that this situation will create differences in how speakers of the two languages perceive the world around them and convey and understand their messages. On the other hand, as the majority of the speakers of Malay are Muslims, Arabic has found its way into this region; currently, Arabic is widely taught in schools, and some of its terms have found their way into Malay. Accordingly, the Arabic language and culture have widely penetrated the Malay language. This study is proposed with the aim of finding out the differences and similarities between the two languages in terms of nonverbal communication. The result of this study will be of high significance, as it will help in enhancing mutual understanding between the speakers of these languages. The comparative analysis approach will be utilized in this study.

Keywords: gesture, Arabic language, Malay language, comparative analysis

Procedia PDF Downloads 536

14244 Probing Language Models for Multiple Linguistic Information

Authors: Bowen Ding, Yihao Kuang

Abstract:

In recent years, large-scale pre-trained language models have achieved state-of-the-art performance on a variety of natural language processing tasks. The word vectors produced by these language models can be viewed as dense encoded representations of natural language in text form. However, it is unknown how much linguistic information is encoded and how. In this paper, we construct several corresponding probing tasks for multiple kinds of linguistic information to clarify the encoding capabilities of different language models, and we present a visualisation of the results. We first obtain word representations in vector form from different language models, including BERT, ELMo, RoBERTa and GPT. Classifiers with a small number of parameters and unsupervised tasks are then applied to these word vectors to assess their capability to encode the corresponding linguistic information. The constructed probing tasks cover both semantic and syntactic aspects. The semantic aspect includes the ability of the model to understand semantic entities such as numbers, time, and characters, and the syntactic aspect includes the ability of the language model to understand grammatical structures such as dependency relationships and reference relationships. We also compare the encoding capabilities of different layers in the same language model to infer how linguistic information is encoded in the model.
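The idea of probing frozen word vectors with a small classifier can be sketched as below. This is only an illustration: the two-dimensional vectors, the "is this token a number" labels, and the choice of a perceptron are all hypothetical stand-ins, not the embeddings or probes used in the paper.

```python
def train_probe(vectors, labels, epochs=100):
    """Train a tiny linear probe (perceptron) on frozen vectors.

    Only the probe's weights are updated; the input vectors stay fixed,
    which is what makes this a probe rather than fine-tuning.
    """
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(vectors, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            pred = 1 if score > 0 else 0
            if pred != y:
                step = y - pred  # +1 or -1
                weights = [w + step * xi for w, xi in zip(weights, x)]
                bias += step
                mistakes += 1
        if mistakes == 0:  # every example classified correctly
            break
    return weights, bias


def probe_accuracy(weights, bias, vectors, labels):
    """Fraction of vectors whose label the probe predicts correctly."""
    correct = 0
    for x, y in zip(vectors, labels):
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        correct += int((1 if score > 0 else 0) == y)
    return correct / len(labels)


# Hypothetical frozen "embeddings"; label 1 = token denotes a number.
TOY_VECTORS = [[1.0, 0.1], [0.1, 0.9], [0.9, 0.2],
               [0.0, 1.0], [0.8, 0.0], [0.2, 0.8]]
TOY_LABELS = [1, 0, 1, 0, 1, 0]

weights, bias = train_probe(TOY_VECTORS, TOY_LABELS)
accuracy = probe_accuracy(weights, bias, TOY_VECTORS, TOY_LABELS)
```

High probe accuracy with so few parameters is usually read as evidence that the property is linearly recoverable from the vectors; in practice the probe would be evaluated on held-out tokens rather than on its training data as in this toy case.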

Keywords: language models, probing task, text presentation, linguistic information

Procedia PDF Downloads 71

14243 User Guidance for Effective Query Interpretation in Natural Language Interfaces to Ontologies

Authors: Aliyu Isah Agaie, Masrah Azrifah Azmi Murad, Nurfadhlina Mohd Sharef, Aida Mustapha

Abstract:

Natural Language Interfaces typically support a restricted language and also have scopes and limitations that naïve users are unaware of, resulting in errors when the users attempt to retrieve information from ontologies. To overcome this challenge, an auto-suggest feature is introduced into the querying process, where users are guided using an interactive query construction system. Guiding users to formulate their queries, while providing them with an unconstrained (or almost unconstrained) way to query the ontology, results in better interpretation of the query and ultimately leads to an effective search. The approach described in this paper is unobtrusive and subtly guides the users, so that they have a choice of either selecting from the suggestion list or typing in full. The user is not coerced into accepting system suggestions and can express themselves using fragments or full sentences.
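A non-coercive auto-suggest step of the kind described here might look like the following sketch. The label list and the prefix-matching rule are hypothetical; the abstract does not say how suggestions are actually ranked or drawn from the ontology.

```python
def suggest(prefix, labels, limit=5):
    """Return up to `limit` labels that start with what the user has typed.

    An empty result simply means no suggestion is shown; the user can
    keep typing freely, so suggestions never block the query.
    """
    typed = prefix.strip().lower()
    if not typed:
        return []
    matches = sorted(label for label in labels
                     if label.lower().startswith(typed))
    return matches[:limit]


# Hypothetical labels standing in for ontology classes and properties.
ONTOLOGY_LABELS = ["capital of", "capital city", "country",
                   "population of", "river"]

suggestions = suggest("cap", ONTOLOGY_LABELS)
```

Here `suggestions` holds the two labels beginning with "cap", and an unmatched prefix such as "xyz" yields an empty list rather than an error.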

Keywords: auto-suggest, expressiveness, habitability, natural language interface, query interpretation, user guidance

Procedia PDF Downloads 453

14242 Water Absorption Studies on Natural Fiber Reinforced Polymer Composites

Authors: G. L. Devnani, Shishir Sinha

Abstract:

In recent years, researchers have focused on natural fiber reinforced composite materials because of their excellent properties, such as low cost, lower weight, better tensile and flexural strengths, and biodegradability. There is some concern, however, that when these materials are kept in moist conditions for a long duration, their mechanical properties degrade. Therefore, in order to take maximum advantage of these novel materials, one should have a complete understanding of their moisture or water absorption phenomena. Various fiber surface treatment methods, such as alkaline treatment and acetylation, have also been suggested for reducing the water absorption of these composites. In the present study, a detailed review is made of the water absorption behavior of natural fiber reinforced polymer composites, and experiments have also been performed on these composites while varying parameters such as fiber loading in order to understand the water absorption kinetics. Various surface treatment methods were also applied to reduce the water absorption of these materials, and an effort is made to develop a proper understanding of the water absorption mechanism, both mathematically and experimentally, so that the full potential of natural fiber reinforced polymer composite materials can be utilized.
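The abstract does not say which kinetic model was used. As an illustration only, the sketch below assumes one-dimensional Fickian diffusion, for which the diffusion coefficient is commonly estimated as D = π (h·k / (4·M∞))², where h is the specimen thickness, M∞ the saturation uptake, and k the initial slope of uptake against the square root of time. The function names and the sample numbers are hypothetical.

```python
import math


def initial_slope(times, uptakes):
    """Least-squares slope through the origin of uptake vs. sqrt(time).

    Intended for the early, approximately linear part of the curve only.
    """
    roots = [math.sqrt(t) for t in times]
    return (sum(r * m for r, m in zip(roots, uptakes))
            / sum(r * r for r in roots))


def fickian_diffusion_coefficient(thickness, saturation_uptake, slope):
    """D = pi * (h * k / (4 * M_inf))**2 under the Fickian assumption.

    Units follow the inputs (e.g. mm and hours give mm^2 per hour).
    """
    return math.pi * (thickness * slope / (4.0 * saturation_uptake)) ** 2


# Hypothetical measurements: immersion time (h) and weight gain (%).
times = [1.0, 4.0, 9.0]
uptakes = [0.5, 1.0, 1.5]

slope = initial_slope(times, uptakes)
diffusivity = fickian_diffusion_coefficient(
    thickness=4.0, saturation_uptake=2.0, slope=slope)
```

If the measured curve is not linear in the square root of time at early stages, the Fickian assumption does not hold and a different model would be needed.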

Keywords: alkaline treatment, composites, natural fiber, water absorption

Procedia PDF Downloads 245

14241 Transportation Language Register as One of Language Community

Authors: Diyah Atiek Mustikawati

Abstract:

Language register refers to a variety of a language used for a particular purpose or in a particular social setting. Language register also refers to the practice of adapting one’s use of language to conform to the standards or traditions of a given professional or social situation. This descriptive study discusses the forms of language register used in transportation, the factors behind them, and the functions they serve. Language register in transportation mostly uses short sentences in an informal register. The factors that shape the register used are the speaker, word choice, and language background. The functions of language register in transportation are to make communication between crew members easier and to maintain safety when they are in adverse conditions. The transportation language register developed naturally as one variety of language use.

Keywords: language register, language variety, communication, transportation

Procedia PDF Downloads 444

14240 Linguistic Analysis of Borderline Personality Disorder: Using Language to Predict Maladaptive Thoughts and Behaviours

Authors: Charlotte Entwistle, Ryan Boyd

Abstract:

Recent developments in information retrieval techniques and natural language processing have allowed for greater exploration of psychological and social processes. Linguistic analysis methods for understanding behaviour have provided useful insights within the field of mental health. One area within mental health that has received little attention, though, is borderline personality disorder (BPD). BPD is a common mental health disorder characterised by instability of interpersonal relationships, self-image and affect. It also manifests through maladaptive behaviours, such as impulsivity and self-harm. Examination of language patterns associated with BPD could allow for a greater understanding of the disorder and its links to maladaptive thoughts and behaviours. Language analysis methods could also be used in a predictive way, such as by identifying indicators of BPD or predicting maladaptive thoughts, emotions and behaviours. Additionally, associations that are uncovered between language and maladaptive thoughts and behaviours could then be applied at a more general level. This study explores linguistic characteristics of BPD, and their links to maladaptive thoughts and behaviours, through the analysis of social media data. Data were collected from a large corpus of posts from the publicly available social media platform Reddit, namely from the ‘r/BPD’ subreddit, where people identify as having BPD. Data were collected using the Python Reddit API Wrapper and included all users who had posted within the BPD subreddit. All posts were manually inspected to ensure that they were not posted by someone who clearly did not have BPD, such as people posting about a loved one with BPD. These users were then tracked across all other subreddits in which they had posted, and data from these subreddits were also collected. Additionally, data were collected from a random control group of Reddit users.
Disorder-relevant behaviours, such as self-harming or aggression-related behaviours, outlined within Reddit posts were coded by expert raters. All posts and comments were aggregated by user and split by subreddit. Language data were then analysed using the Linguistic Inquiry and Word Count (LIWC) 2015 software. LIWC is a text analysis program that identifies and categorises words based on linguistic and paralinguistic dimensions, psychological constructs and personal concern categories. Statistical analyses of linguistic features could then be conducted. Findings revealed distinct linguistic features associated with BPD, based on Reddit posts, which differentiated these users from a control group. Language patterns were also found to be associated with the occurrence of maladaptive thoughts and behaviours. Thus, this study demonstrates that there are indeed linguistic markers of BPD present on social media. It also implies that language could be predictive of maladaptive thoughts and behaviours associated with BPD. These findings are of importance as they suggest potential for clinical interventions to be provided based on the language of people with BPD to try to reduce the likelihood of maladaptive thoughts and behaviours occurring, for example through social media tracking or by engaging people with BPD in expressive writing therapy. Overall, this study has provided a greater understanding of the disorder and how it manifests through language and behaviour.
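Dictionary-based word counting of the kind LIWC performs can be illustrated with the sketch below. It does not use the LIWC software or its dictionaries: the two categories and their word lists are hypothetical, and the output is simply the percentage of tokens falling in each category.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary; real LIWC categories are far larger.
CATEGORIES = {
    "first_person_singular": {"i", "me", "my"},
    "negative_emotion": {"sad", "angry", "hurt"},
}


def category_rates(text, categories=CATEGORIES):
    """Return, per category, the percentage of tokens that belong to it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for name, words in categories.items():
            if token in words:
                counts[name] += 1
    total = len(tokens)
    return {name: (100.0 * counts[name] / total if total else 0.0)
            for name in categories}


rates = category_rates("I feel sad and my day hurt me")
```

Rates computed this way per user (or per user and subreddit) are the kind of linguistic features on which group comparisons and statistical analyses can then be run.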

Keywords: behaviour analysis, borderline personality disorder, natural language processing, social media data

Procedia PDF Downloads 304

14239 Learner's Difficulties Acquiring English: The Case of Native Speakers of Rio de La Plata Spanish Towards Justifying the Need for Corpora

Authors: Maria Zinnia Bardas Hoffmann

Abstract:

Contrastive Analysis (CA) is the systematic comparison between two languages. It stems from the notion that errors are caused by interference of the L1 system in the acquisition process of an L2. CA represents a useful tool to understand the nature of learning and acquisition. Also, this particular method promises a path to understand the nature of underlying cognitive processes, even when other factors such as intrinsic motivation and teaching strategies were found to best explain students’ problems in acquisition. CA study is justified not only by the need to get a deeper understanding of the nature of SLA, but also as an invaluable source of clues, at a cognitive level, for those general processes involved in rule formation and abstract thought. It is relevant for cross-disciplinary studies and the fields of Computational Thought, Natural Language Processing, Applied Linguistics, Cognitive Linguistics and Math Theory. That being said, this paper also intends to address its own set of constraints and limitations. Finally, this paper (a) aims at identifying some of the difficulties students may find in their learning process due to the nature of their specific variety of L1, Rio de la Plata Spanish (RPS), and (b) represents an attempt to discuss the necessity for specific models to approach CA.

Keywords: second language acquisition, applied linguistics, contrastive analysis, applied contrastive analysis English language department, meta-linguistic rules, cross-linguistics studies, computational thought, natural language processing

Procedia PDF Downloads 114

14238 The Impact of Content Familiarity of Receptive Skills on Language Learning

Authors: Sara Fallahi

Abstract:

This paper reviews the importance of content familiarity in receptive skills and offers solutions to the issue of content unfamiliarity in language learning materials. Presently, language learning materials for receptive skills mainly comprise global issues and target language speakers’ culture(s). This might lead learners to focus on content rather than the language. As a solution, materials on receptive skills can be developed with a focus on learners’ culture and social concerns, especially at the beginner levels of learning. Language learners often learn their target language through the receptive skills of listening and reading before language production ensues through speaking and writing. Teachers mainly concentrate on students’ journey from receptive skills to productive skills. There are barriers to language learning, such as time and energy, that can hinder learners’ understanding and their ability to build the required background knowledge of the content. These barriers arise from learners’ unfamiliarity with the content of the skill. Therefore, materials that improve content familiarity will help learners improve their language comprehension, learning, and usage. This presentation will conclude with practical solutions to help teachers and learners more authentically integrate language and culture to elevate language learning.

Keywords: language learning, listening content, reading content, content familiarity, ESL books, language learning books, cultural familiarity

Procedia PDF Downloads 80

14237 An Ethnographic Inquiry: Exploring the Saudi Students’ Motivation to Learn English Language

Authors: Musa Alghamdi

Abstract:

Although Saudi students’ motivation to learn English as a foreign language in Saudi Arabia has been investigated by a number of studies, these have relied almost exclusively on the quantitative research paradigm. There is a significant lack of research that explores Saudi students’ motivation using qualitative methods. It is essential for the investigator to be immersed in the community in order to understand the individuals under study via their actions and words, their thoughts, views and beliefs, and the meanings they attribute to activities. Thus, the study aims to explore Saudi students’ motivation to learn English as a foreign language in Saudi Arabia by employing a qualitative methodology, namely ethnography. The study will be carried out in Saudi Arabia. An ethnographic qualitative approach will be used, employing formal and informal interview instruments. Gardner’s motivation theory is used as the framework for this study to aid the understanding of the research findings. The author, an English language lecturer, will undertake participant observation for 4 months. He will work as a teaching assistant (on an unpaid basis) with EFL lecturers in different departments at a Saudi university where students study English as a minor course. The researcher will start with informal ethnographic interviews with students while he is with the informants in their natural context. He will then conduct semi-structured interviews. The informal interviews will be with 14-16 students; he will then carry out semi-structured interviews with the same informants to explore in depth, in their natural context, the extent to which Saudi university students are motivated to learn English as a foreign language, as well as the reasons behind that motivation.
The findings of this study will add new knowledge about the factors that motivate Saudi university students to learn English in Saudi Arabia. Students have been given very few chances to express themselves and to speak about their feelings in a comfortable way, which is needed to gain a clear picture of those factors. The author’s work as an EFL teacher and lecturer will give him secure access to EFL teaching and learning settings. It will help him attain richer insights into the nature of the EFL context in universities and into the reasons behind the weak EFL level among Saudi students.

Keywords: motivation, ethnography, Saudi, language

Procedia PDF Downloads 273

14236 Learning to Translate by Learning to Communicate to an Entailment Classifier

Authors: Szymon Rutkowski, Tomasz Korbak

Abstract:

We present a reinforcement-learning-based method of training neural machine translation models without parallel corpora. The standard encoder-decoder approach to machine translation suffers from two problems we aim to address. First, it needs parallel corpora, which are scarce, especially for low-resource languages. Second, its learning procedure lacks psychological plausibility: learning a foreign language is about learning to communicate useful information, not merely learning to transduce from one language’s 'encoding' to another. We instead pose the problem of learning to translate as learning a policy in a communication game between two agents: the translator and the classifier. The classifier is trained beforehand on a natural language inference task (determining the entailment relation between a premise and a hypothesis) in the target language. The translator produces a sequence of actions that correspond to generating translations of both the hypothesis and premise, which are then passed to the classifier. The translator is rewarded for the classifier’s performance on determining entailment between sentences translated by the translator to disciple’s native language. The translator’s performance thus reflects its ability to communicate useful information to the classifier. In effect, we train a machine translation model without the need for parallel corpora altogether. While similar reinforcement learning formulations for zero-shot translation have been proposed before, we introduce a number of improvements. While prior research aimed at grounding the translation task in the physical world by evaluating agents on an image captioning task, we found that using a linguistic task is more sample-efficient. Natural language inference (also known as recognizing textual entailment) captures semantic properties of sentence pairs that are poorly correlated with semantic similarity, thus enforcing basic understanding of the role played by compositionality.
It has been shown that models trained on recognizing textual entailment produce high-quality general-purpose sentence embeddings transferable to other tasks. We use the Stanford Natural Language Inference (SNLI) dataset as well as its analogous datasets for French (XNLI) and Polish (CDSCorpus). Textual entailment corpora can be obtained relatively easily for any language, which makes our approach more extensible to low-resource languages than traditional approaches based on parallel corpora. We evaluated a number of reinforcement learning algorithms (including policy gradients and actor-critic) to solve the problem of optimizing the translator’s policy and found that our attempts yield some promising improvements over previous approaches to reinforcement-learning-based zero-shot machine translation.
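The reward loop described in this abstract can be illustrated with a heavily simplified policy-gradient sketch. Everything concrete here is an assumption: the "translator" is reduced to a softmax policy over three candidate translations, and `toy_classifier_reward` is a hypothetical stand-in for the pretrained entailment classifier, returning 1.0 only when the chosen candidate lets it label the pair correctly. The update is REINFORCE with a running-average baseline, one member of the policy-gradient family the authors mention, not their actual training setup.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_policy(reward_fn, n_actions, steps=2000, lr=0.1, seed=0):
    """REINFORCE with a running-average baseline on a one-step task."""
    rng = random.Random(seed)
    logits = [0.0] * n_actions
    baseline = 0.0
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(n_actions), weights=probs)[0]
        reward = reward_fn(action)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # Gradient of log pi(action) w.r.t. logit a is 1[a == action] - pi(a).
        for a in range(n_actions):
            indicator = 1.0 if a == action else 0.0
            logits[a] += lr * advantage * (indicator - probs[a])
    return softmax(logits)


def toy_classifier_reward(action):
    """Hypothetical reward: only candidate 2 preserves the entailment."""
    return 1.0 if action == 2 else 0.0


final_probs = train_policy(toy_classifier_reward, n_actions=3)
```

The translator never sees a reference translation: it is shaped only by whether its output lets the classifier succeed, which is the sense in which no parallel corpus is required.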

Keywords: agent-based language learning, low-resource translation, natural language inference, neural machine translation, reinforcement learning

Procedia PDF Downloads 99

14235 Greek Teachers' Understandings of Typical Language Development and of Language Difficulties in Primary School Children and Their Approaches to Language Teaching

Authors: Konstantina Georgali

Abstract:

The present study explores Greek teachers’ understandings of typical language development and of language difficulties. Its core aim was to highlight that teachers need to have a thorough understanding of educational linguistics, that is, of how language figures in education. They should also be aware of how language should be taught so as to promote language development for all students while at the same time supporting the needs of children with language difficulties in an inclusive ethos. The study thus argued that language can be a dynamic learning mechanism in the minds of all children and a powerful teaching tool in the hands of teachers. It provided current research evidence to show that structural and morphological particularities of native languages, in this case of the Greek language, can be used by teachers to enhance children’s understanding of language and simultaneously improve oral language skills for children with typical language development and for those with language difficulties. The research was based on a Sequential Exploratory Mixed Methods Design deployed in three consecutive and integrative phases. The first phase involved 18 exploratory interviews with teachers. Its findings informed the second phase, involving a questionnaire survey with 119 respondents. Contradictory questionnaire results were further investigated in a third phase employing a formal testing procedure with 60 children attending Y1, Y2 and Y3 of primary school (a research group of 30 language impaired children and a comparison group of 30 children with typical language development, both identified by their class teachers). Results showed both strengths and weaknesses in teachers’ awareness of educational linguistics and of language difficulties.
They also provided a different perspective on children’s language needs and on language teaching approaches that reflected current advances and conceptualizations of language problems, and opened a new window on how best these needs can be met in an inclusive ethos. However, teachers barely used teaching approaches that could capitalize on the particularities of the Greek language to improve language skills for all students in class. Although they seemed to realize the importance of oral language skills and their knowledge base on language-related issues was adequate, their practices indicated that they did not see language as a dynamic teaching and learning mechanism that can promote children’s language development and, in tandem, improve academic attainment. Important educational implications arose, along with clear indications that the findings generalize beyond the Greek educational context.

Keywords: educational linguistics, inclusive ethos, language difficulties, typical language development

Procedia PDF Downloads 354

14234 Generating Insights from Data Using a Hybrid Approach

Authors: Allmin Susaiyah, Aki Härmä, Milan Petković

Abstract:

Automatic generation of insights from data using insight mining systems (IMS) is useful in many applications, such as personal health tracking, patient monitoring, and business process management. Existing IMS face challenges in controlling insight extraction, scaling to large databases, and generalising to unseen domains. In this work, we propose a hybrid approach consisting of rule-based and neural components for generating insights from data while overcoming the aforementioned challenges. Firstly, a rule-based data2CNL component is used to extract statistically significant insights from data and represent them in a controlled natural language (CNL). Secondly, a BERTSum-based CNL2NL component is used to convert these CNLs into natural language texts. We improve the model using task-specific and domain-specific fine-tuning. Our approach has been evaluated using statistical techniques and standard evaluation metrics. We overcame the aforementioned challenges and observed significant improvement with domain-specific fine-tuning.
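The rule-based first stage (data to controlled natural language) can be sketched roughly as below. The abstract does not specify the statistical test or the CNL grammar, so this example assumes a Welch t statistic with a hypothetical threshold of 2.0 and a single fixed sentence template; the measure name and sample values are also made up.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two samples with unequal variances."""
    standard_error = math.sqrt(variance(sample_a) / len(sample_a)
                               + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error


def insight_to_cnl(measure, context_a, values_a, context_b, values_b,
                   t_threshold=2.0):
    """Emit a templated CNL sentence if the two contexts differ enough.

    Returns None when the difference does not clear the (assumed)
    threshold, so weak patterns produce no insight at all.
    """
    t_stat = welch_t(values_a, values_b)
    if abs(t_stat) < t_threshold:
        return None
    direction = "higher" if t_stat > 0 else "lower"
    return f"{measure} is {direction} on {context_a} than on {context_b}"


# Hypothetical personal-health data: daily step counts.
weekday_steps = [8000, 8200, 7900, 8100]
weekend_steps = [5000, 5200, 4900, 5100]

sentence = insight_to_cnl("step count", "weekdays", weekday_steps,
                          "weekends", weekend_steps)
```

Keeping this stage rule-based is what gives control over which insights are extracted; a neural second stage, such as the CNL2NL component the abstract describes, would then only rephrase the templated sentence.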

Keywords: data mining, insight mining, natural language generation, pre-trained language models

Procedia PDF Downloads 77

14233 From the “Movement Language” to Communication Language

Authors: Mahmudjon Kuchkarov, Marufjon Kuchkarov

Abstract:

The origin of ‘Human Language’ is still a mystery and the most interesting subject of historical linguistics. The core element is the nature of labeling or coding things or processes with symbols and sounds. In this paper, we investigate humans’ involuntary Paired Sounds and Shape Production (PSSP) and its contribution to the development of early human communication. Working with twenty-six volunteers who performed many physical movements of varying difficulty, the research team investigated the natural, repeatable, and paired sound and shape productions during human activities. The paper claims the involvement of PSSP in the phonetic origin of some modern words and the existence of similarities between elements of PSSP and characters of the classic Latin alphabet. The results may be used not only as a supporting idea for existing theories but also to take a closer look at the fundamental nature of the origin of languages.

Keywords: body shape, body language, coding, Latin alphabet, merging method, movement language, movement sound, natural sound, origin of language, pairing, phonetics, sound and shape production, word origin, word semantic

Procedia PDF Downloads 162

14232 Language Teachers Exercising Agency Amid Educational Constraints: An Overview of the Literature

Authors: Anna Sanczyk

Abstract:

Teacher agency plays a crucial role in effective teaching, supporting diverse students, and providing an enriching learning environment; therefore, it is important to gain a deeper understanding of language teachers’ sense of agency in teaching linguistically and culturally diverse students. This paper presents an overview of qualitative research on how language teachers exercise their agency in diverse classrooms. The analysis of the literature reveals that language teachers strive to address students’ needs and challenge educational inequalities, but experience educational constraints in enacting their agency. The examination of the research on language teacher agency identifies four major areas where language teachers experience challenges in enacting their agency: (1) implementing curriculum; (2) adopting school reforms and policies; (3) engaging in professional learning; and (4) negotiating various identities as professionals. The practical contribution of this literature review is that it provides a much-needed compilation of the studies on how language teachers exercise agency amid educational constraints. The discussion of the overview points to the importance of teacher identity, learner advocacy, and continuous professional learning and the critical need to promote empowerment, activism, and transformation in language teacher education. The findings of the overview indicate that language teacher education programs should prepare teachers to be active advocates for English language learners and guide teachers to become more conscious of the complexities of teaching in constrained educational settings so that they can become agentic professionals. This literature overview illustrates agency work in English language teaching contexts and contributes to the understanding of the important link between experiencing educational constraints and the development of teacher agency.

Keywords: advocacy, educational constraints, language teacher agency, language teacher education

Procedia PDF Downloads 142

14231 Peace through Language Policy as a Solution to the Ethnic Conflict in Sri Lanka

Authors: R. M. W. Rajapakshe

Abstract:

Sri Lanka, officially the Democratic Socialist Republic of Sri Lanka, is an island nation situated near India. It is a multi-lingual, multi-religious and multi-ethnic country, where the Sinhalese form the majority and the Tamils form the largest ethnic minority. The composition of the population (ethnic basis) in Sri Lanka is as follows: Sinhalese: 74.5%, Tamil (Sri Lankan): 12.6%, Muslim: 7.5%, Tamil (Indian): 5.5%, Malay: 0.3%, Burgher: 0.3%, other: 0.2%. The Tamil people use the Tamil language as their mother tongue and the Sinhala people use the Sinhala language as their mother tongue. Very few people in either community use English as their mother tongue; however, a large number of people use English as a second language. The Sinhala language was declared the only official language in Sri Lanka in 1959. However, this was not acceptable to Tamil politicians or to the common Tamil people, and it marked the beginning of a long-standing ethnic crisis, which later became a military war in which a lot of blood was shed. As a solution to this ethnic crisis, the thirteenth amendment to the constitution of Sri Lanka was introduced in 1987; under it, both Sinhala and Tamil were declared official languages and English the link language in Sri Lanka. Thus, a new programme, namely a second language teaching programme under which Sinhala was taught to Tamil students and Tamil was taught to Sinhala students, was introduced at government schools. Language teaching includes knowledge of the culture of the target language. As all cultures are mixed and have common features, students have reduced their enmity towards the other community and learned to respect the other culture. On the other hand, as all languages are mixed, students came to understand that there are no pure languages. Thus, they learned to respect the other language. In the case of Sri Lanka, the Sinhala language is mixed with the Tamil language and vice versa. Thus, the development of second language teaching is the prominent way to solve the above ethnic problem, and this study clearly shows it. However, the programme suffers from a lack of trained second language teachers, infrastructure facilities and funds, which can be considered the main obstacles to developing the second language teaching programme. As yet, there are no satisfactory answers to those problems. The data were collected from relevant books, articles and other documents based on research, and from forty-five one-hour recordings of natural conversations covering all factions of the Sinhala community.

Keywords: ethnic crisis, official language, second language teaching, Sinhala, Tamil

Procedia PDF Downloads 322

14230 Learning Grammars for Detection of Disaster-Related Micro Events

Authors: Josef Steinberger, Vanni Zavarella, Hristo Tanev

Abstract:

Natural disasters cause tens of thousands of victims and massive material damage. We refer to all those events caused by natural disasters, such as damage to people, infrastructure, vehicles, services and resource supply, as micro-events. This paper addresses the problem of micro-event detection in online media sources. We present a natural language grammar learning algorithm and apply it to online news. The algorithm is based on distributional clustering and detection of word collocations. We also explore the extraction of micro-events from social media and describe a Twitter mining robot, which uses combinations of keywords to detect tweets that talk about the effects of disasters.
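The keyword-combination idea mentioned at the end of the abstract can be illustrated with a small sketch. It assumes that a "combination" means at least one term from each keyword group appearing in the same text; the abstract does not specify this, and the term lists and function names below are hypothetical, not taken from the paper.

```python
import re

# Hypothetical keyword groups; the paper's actual lists are not given.
DISASTER_TERMS = {"flood", "earthquake", "storm"}
EFFECT_TERMS = {"collapsed", "destroyed", "injured", "blocked"}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def matches_combination(text, groups):
    """True when the text contains at least one term from every group."""
    words = tokenize(text)
    return all(words & group for group in groups)


groups = [DISASTER_TERMS, EFFECT_TERMS]
hit = matches_combination("Bridge collapsed after the flood", groups)
miss = matches_combination("Flood of new phone releases today", groups)
```

Requiring a term from each group is one simple way to filter out texts that mention a disaster word without describing an effect.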

Keywords: online news, natural language processing, machine learning, event extraction, crisis computing, disaster effects, Twitter

Procedia PDF Downloads 456

14229 Natural Language News Generation from Big Data

Authors: Bastian Haarmann, Likas Sikorski

Abstract:

In this paper, we introduce an NLG application for the automatic creation of ready-to-publish texts from big data. The fully automatically generated stories closely resemble the style in which a human writer would draw up a news story. Topics may include soccer games, stock exchange market reports, weather forecasts and many more. The generation of the texts is modelled on human language production. Each generated text is unique. Ready-to-publish stories written by a computer application can help humans to quickly grasp the outcomes of big data analyses, save journalists time-consuming pre-formulations, and cater to rather small audiences by offering stories that would otherwise not exist.

Keywords: big data, natural language generation, publishing, robotic journalism

Procedia PDF Downloads 405

14228 Genomic Sequence Representation Learning: An Analysis of K-Mer Vector Embedding Dimensionality

Authors: James Jr. Mashiyane, Risuna Nkolele, Stephanie J. Müller, Gciniwe S. Dlamini, Rebone L. Meraba, Darlington S. Mapiye

Abstract:

When performing language tasks in natural language processing (NLP), the dimensionality of word embeddings is either chosen ad hoc or calculated by optimizing the Pairwise Inner Product (PIP) loss. The PIP loss is a metric that measures the dissimilarity between word embeddings, and it is obtained through matrix perturbation theory by utilizing the unitary invariance of word embeddings. Unlike in natural language processing, in genomics, and especially in genome sequence processing, there is no notion of a “word”; rather, there are sequence substrings of length k called k-mers. K-mer sizes matter, and they vary depending on the goal of the task at hand. The dimensionality of word embeddings in NLP has been studied using matrix perturbation theory and the PIP loss. In this paper, the sufficiency and reliability of applying word-embedding algorithms to various genomic sequence datasets are investigated to understand the relationship between the k-mer size and the embedding dimension. This is done by studying the scaling capability of three embedding algorithms, namely Latent Semantic Analysis (LSA), Word2Vec, and Global Vectors (GloVe), with respect to the k-mer size. Utilising the PIP loss as a metric to train embeddings on different datasets, we also show that Word2Vec outperforms LSA and GloVe in accurately computing embeddings as both the k-mer size and vocabulary increase. Finally, the shortcomings of natural language processing embedding algorithms in performing genomic tasks are discussed.
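As a rough illustration of the two notions used above, the sketch below splits a sequence into overlapping k-mers and computes the PIP loss between two embedding matrices as the Frobenius norm of the difference of their pairwise inner product matrices. The stride of 1, the function names and the toy vectors are assumptions made for this example, not details from the paper.

```python
def kmers(sequence, k):
    """Return all overlapping substrings of length k (stride 1 assumed)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def pip_matrix(embeddings):
    """Pairwise inner products E E^T for a list of row vectors."""
    return [[sum(a * b for a, b in zip(u, v)) for v in embeddings]
            for u in embeddings]


def pip_loss(e1, e2):
    """Frobenius norm of PIP(e1) - PIP(e2).

    Rows of e1 and e2 must describe the same vocabulary in the same
    order; the two embeddings may have different dimensionality.
    """
    p1, p2 = pip_matrix(e1), pip_matrix(e2)
    squared = sum((x - y) ** 2
                  for row1, row2 in zip(p1, p2)
                  for x, y in zip(row1, row2))
    return squared ** 0.5


tokens = kmers("ACGTAC", 3)

# Toy embeddings for a two-item vocabulary (illustrative values only).
base = [[1.0, 0.0], [0.0, 1.0]]
rotated = [[0.0, 1.0], [-1.0, 0.0]]   # same vectors under a rotation
stretched = [[2.0, 0.0], [0.0, 1.0]]  # first vector scaled by 2
```

Because the loss depends only on inner products, a rotated copy of an embedding has zero PIP loss against the original, which reflects the unitary invariance the abstract refers to.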

Keywords: word embeddings, k-mer embedding, dimensionality reduction

Procedia PDF Downloads 100

14227 Semantic Textual Similarity on Contracts: Exploring Multiple Negative Ranking Losses for Sentence Transformers

Authors: Yogendra Sisodia

Abstract:

Researchers are becoming more interested in extracting useful information from legal documents thanks to the development of large-scale language models in natural language processing (NLP), and deep learning has accelerated the creation of powerful text mining models. Legal fields like contracts benefit greatly from semantic text search since it makes it quick and easy to find related clauses. Once sentence embeddings have been computed, it is relatively simple to locate sentences with a comparable meaning throughout the entire legal corpus. The author investigated two pre-trained language models for this task, MiniLM and RoBERTa, and further fine-tuned them on legal contracts. The author used Multiple Negative Ranking Loss for the creation of sentence transformers. The fine-tuned language models and sentence transformers showed promising results.
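The loss named above is commonly formulated as a cross-entropy over in-batch similarities: each anchor should score highest against its own positive, and the other positives in the batch act as negatives. The sketch below follows that common formulation in plain Python; the scale value of 20, the function names and the toy vectors are assumptions for illustration, and the abstract does not state the author's exact configuration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i].

    All other positives in the batch serve as negatives for anchor i.
    The scale factor is an assumed value, not one from the paper.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        scores = [scale * cosine(anchor, p) for p in positives]
        top = max(scores)  # subtract the max for numerical stability
        log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
        losses.append(log_norm - scores[i])
    return sum(losses) / len(losses)


# Toy sentence embeddings (illustrative values only).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = multiple_negatives_ranking_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped = multiple_negatives_ranking_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
```

With correctly paired embeddings the loss is close to zero, while pairing each anchor with the wrong positive yields a large loss, which is the signal that fine-tuning minimises.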

Keywords: legal contracts, multiple negative ranking loss, natural language inference, sentence transformers, semantic textual similarity

Procedia PDF Downloads 70

14226 Research on the Risks of Railroad Receiving and Dispatching Trains Operators: Natural Language Processing Risk Text Mining

Authors: Yangze Lan, Ruihua Xv, Feng Zhou, Yijia Shan, Longhao Zhang, Qinghui Xv

Abstract:

Receiving and dispatching trains is an important part of railroad organization, yet the risk evaluation of operating personnel is still reflected only by scores, without further analysis of wrong answers and operating accidents. With natural language processing (NLP) technology, this study extracts the keywords and key phrases of 40 relevant risk events about receiving and dispatching trains and reclassifies the risk events into 8 categories, such as train approach and signal risks, dispatching command risks, and so on. Based on the historical risk data of personnel, the K-Means clustering method is used to classify the risk level of personnel. The results indicate that high-risk operating personnel need strengthened training in train receiving and dispatching operations for essential trains and abnormal situations.
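A minimal sketch of K-Means (Lloyd's algorithm) applied to one-dimensional risk scores shows how personnel could be grouped into risk levels. The scores, the choice of three clusters, the deterministic initialisation and the function name are all hypothetical; the study's actual features and settings are not given in the abstract.

```python
def kmeans_1d(values, k, max_iter=100):
    """Cluster scalar values with Lloyd's algorithm.

    Centroids are initialised at evenly spaced positions in the sorted
    data so the result is deterministic (an assumption of this sketch).
    """
    if not 2 <= k <= len(values):
        raise ValueError("k must be between 2 and the number of values")
    ordered = sorted(values)
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every value.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                  for v in values]
        # Update step: mean of each cluster (keep old centroid if empty).
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical per-person risk scores, not data from the study.
scores = [1, 2, 1.5, 10, 11, 9.5, 20, 22]
labels, centroids = kmeans_1d(scores, k=3)
```

Because the centroids start in ascending order, cluster 0 here corresponds to the lowest scores and cluster 2 to the highest, which could then be read as low, medium and high risk.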

Keywords: receiving and dispatching trains, natural language processing, risk evaluation, K-means clustering

Procedia PDF Downloads 44

14225 Literature, Culture, and Shakespeare's Dramatization of Linguistic Scenes

Authors: Cheang Wai Fong

Abstract:

This paper takes language and its interconnection with power as a point of departure to analyze some linguistic scenes played up by William Shakespeare. By placing language into the big picture of literature and culture, and by reexamining the etymological relations between the three terms, language, literature and culture, the paper attempts to formulate an understanding of their more expansive meanings. It compares their respective traditional notions with their modern concepts brought up by literary critics, anthropologists and sociolinguists. Then it uses these expansive meanings to reinterpret Shakespeare’s linguistic scenes featuring language contentions, and to discuss Shakespeare’s success as a signification of literature’s role within the linguistic and cultural context of Elizabethan England.

Keywords: culture, language, literature, Shakespeare

Procedia PDF Downloads 506

14224 Linguistic Landscape as a Bottom-up Approach: Investigation of Semiotic Features and Language Use in the Catering Industry in Hong Kong

Authors: Tsz Ching Jasmine Lam

Abstract:

Linguistic landscape (LL) can serve as both a top-down and a bottom-up approach to understanding language planning policy in various dimensions. It can reflect the language identities, motives and contestations perceived by stakeholders at different decision-making levels. Prior studies adopted the bottom-up approach to investigate the language practice and ideologies reflected by the design and linguistic features observed in the linguistic landscapes in ethnically and linguistically diverse areas, like Medan in Russia and Seoul in Korea. As Hong Kong is also a trilingual city with an inclusive combination of nationalities, this paper takes it as a case study to explore the de facto language ideologies reflected by LL at the micro-level. We look into the catering industry from a holistic perspective by reviewing the food menus of 66 restaurants located in diverse districts and serving different types of cuisine. This bottom-up LL research reveals that business owners and the public share language ideologies that treat English as a prestigious language, value multilingualism, and regard traditional Chinese characters as the standard script.

Keywords: bottom-up, language ideologies, language planning policy, language policy, language identities, linguistic landscape

Procedia PDF Downloads 46

14223 Deaf Inmates in Canadian Prisons: Addressing Discrimination through Staff Training Videos with Deaf Actors

Authors: Tracey Bone

Abstract:

Deaf inmates, whose first or preferred language is a Signed Language, experience barriers to accessing the necessary two-way communication with correctional staff, and the educational and social programs that will enhance their eligibility for conditional release from the federal prison system in Canada. The development of visual content to enhance the knowledge and skill development of correctional staff is a contemporary strategy intended to significantly improve the correctional experience for deaf inmates. This presentation reports on the development of two distinct training videos created to enhance staff’s understanding of the needs of deaf inmates; one a two-part simulation of an interaction with a deaf inmate, the second an interview with a deaf academic. Part one of video one demonstrates the challenges and misunderstandings inherent in communicating across languages without a qualified sign language interpreter; the second part demonstrates the ease of communication when communication needs are met. Video two incorporates the experiences of a deaf academic to provide the cultural grounding necessary to educate staff in the unique experiences associated with being a visual language user. Lack of staff understanding or awareness of deaf culture and language must not be acceptable reasons for the inadequate treatment of deaf visual language users in federal prisons. This paper demonstrates a contemporary approach to meeting the human rights and needs of this unique and often ignored inmate subpopulation. The deaf community supports this visual approach to enhancing staff understanding of the unique needs of this population. A study of its effectiveness is currently underway.

Keywords: accommodations, American Sign Language (ASL), deaf inmates, sensory deprivation

Procedia PDF Downloads 127

14222 A Clear Language Is Essential: A Qualitative Exploration of Doctor-Patient Health Interaction in Jordan

Authors: Etaf Khlaed Haroun Alkhlaifat

Abstract:

When doctors and patients do not share the same first language, language barriers may exist, which may have negative effects on the quality of communication and care provided. Doctors’ use of medical jargon and patients’ inability to fully express their illness can lead to a loss of relevant information and often create misunderstanding. This study sought to examine the extent to which a lack of “common” language represents one of the linguistic obstacles that may adversely influence the quality of healthcare services in Jordan. Communication Accommodation Theory (CAT) was used to interpret the phenomena under study. Doctors (n=9) and patients (n=18) were observed and interviewed in natural Jordanian medical settings. A thematic qualitative approach was employed to analyse the data. The preliminary findings of the study revealed that most doctors appeared to have a good sense of appropriate ways to break through communication barriers by changing medical terminology or jargon into lay terms. However, for some, there were two main challenges: 1) the use of medical jargon in explaining medication and side effects and 2) patients’ lack of the knowledge needed to give a full explanation of their illnesses. The study revealed that language barriers adversely affect health outcomes for patients with limited fluency in the English language. It argues that it is doctors’ responsibility to guarantee mutual understanding, educate patients on their condition and improve their health outcomes.

Keywords: communication accommodation theory, doctor-patient interaction, language barrier, medical jargon, misunderstanding

Procedia PDF Downloads 53

14221 Arabic Language in Modern Era: Some Challenges

Authors: Tajudeen Yusuf

Abstract:

The Arabic language and its instruction occupy a prominent status in the contemporary world, especially in academic and research institutions. Arabic, like other international languages, consolidates understanding among people of different nations and societies. It is a promising medium for sharing thoughts and feelings. As a means of communication and interaction, the language has held an outstanding status since ancient times, especially because of the relationship it maintains with Islam and its heritage. Adding to its importance is the rapid growth and advancement of science and technology in the contemporary era, which has made communication between human societies all over the world inevitable. Despite this, the Arabic language still faces many challenges, especially in areas such as irrelevant textbooks and other teaching materials, outdated teaching methods and a shortage of professionally trained teachers. These have resulted in difficulties in the teaching and learning of the language. Therefore, urgent and necessary measures need to be taken to enhance the teaching and learning of the Arabic language within and outside Arab countries.

Keywords: Arabic, language, challenges, modern era

Procedia PDF Downloads 568

14220 Developing Kazakh Language Fluency Test in Nazarbayev University

Authors: Saule Mussabekova, Samal Abzhanova

Abstract:

The Kazakh Language Fluency Test, based on the IELTS exam, was implemented in 2012 at Nazarbayev University in Astana, Kazakhstan. We would like to share our experience in developing this exam and some exam results with other language instructors. In this paper, we will cover all these peculiarities and their related issues. The Kazakh Language Fluency Test is a young exam. During its development, we faced many difficulties. One of the goals of the university and the country is to encourage fluency in the Kazakh language for all citizens of the Republic. Nazarbayev University has introduced a Kazakh language program to assist in achieving this goal. This policy is one step in ensuring that NU students have a thorough understanding of the Kazakh language through a fluency test based on the International English Language Testing System (IELTS). The Kazakh Language Fluency Test aims to determine students’ knowledge of the Kazakh language. There are three types of students at Nazarbayev University: Kazakh-speaking heritage learners, Russian-speaking students and English-speaking students. Unfortunately, we have Kazakh students who do not speak Kazakh. All students who finished school with Russian language instruction are given the Kazakh Language Fluency Test in order to determine their Kazakh level. After the test, all students can choose an appropriate Kazakh course: Basic Kazakh, Intermediate Kazakh or Upper-Intermediate Kazakh. The Kazakh Language Fluency Test consists of four parts: Listening, Reading, Writing and Speaking. They are taken on the same day in the abovementioned order.

Keywords: diagnostic test, Kazakh language, placement test, test result

Procedia PDF Downloads 376