Search results for: natural language grammar models
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 14911

14671 Towards an Indigenous Language Policy for National Integration

Authors: Odoh Dickson Akpegi

Abstract:

This paper argues that an indigenous language is needed in order to harness the nation’s human and material resources meaningfully for national integration. It then examines the knotty issue of the national language question and advocates a piecemeal approach to solving the problem. This approach allows for the development and use of local languages in minority areas, especially in Benue State, as a way of preparing them for consideration as possible replacements for English as Nigeria’s national or official language. Finally, an arrangement for preparing the languages for such competition at the national level is presented.

Keywords: indigenous language, English language, official language, National integration

Procedia PDF Downloads 550
14670 The Different Types of French Language in the Processes of Acquisition: Specifically about The Humor

Authors: Akbarnejad Neda

Abstract:

Foreign language acquisition has occurred when we can tell a joke and understand one. Most jokes are told in slang and common language. In the process of foreign language acquisition, an autonomous learner tries to learn the standard language, but there is a colossal divergence between the usage of the different types of language in society. Here, we investigate French slang and common language and examine the accurate perception of their usage. We illuminate the slang found in French literature, which provides considerably different types of language for an autonomous learner. We furthermore provide evidence from French novels that properly demonstrate the different types of language and convey their social meanings in a single sentence. For example, the famous Queneau expression « Doukipudonktant » shows the impact of slang in society. The characters in the novel convey slang and common language and their accurate usage. We show that the language of the autonomous learner depends on the language of the texts that are read. Because literature is a vehicle of culture and its expressions demonstrate their real meanings and usage in the culture, slang and common language play a crucial role in the culture, and all of them are manifested in oral language.

Keywords: common language, French, humor, slang language

Procedia PDF Downloads 227
14669 Synthesis and Performance Adsorbent from Coconut Shells Polyetheretherketone for Natural Gas Storage

Authors: Umar Hayatu Sidik

Abstract:

The natural gas vehicle represents a cost-competitive, lower-emission alternative to the gasoline-fuelled vehicle. The immediate challenge that confronts natural gas is increasing its energy density. This paper addresses the question of energy density by reviewing storage technologies for natural gas that use improved adsorbents. Technical comparisons are made between storage systems containing adsorbent and conventional compressed natural gas, based on the number of moles stored as Compressed Natural Gas (CNG) and as Adsorbed Natural Gas (ANG). We also compare gas storage in different cylinder types (1, 2, 3 and 4) based on weight factor and storage capacity. For the storage tank system, we discuss how carbon adsorbents, when used in CNG tanks, offer a means of increasing onboard fuel storage and thereby the driving range of the vehicle. The comparison confirms that the density of the stored gas in ANG is higher than that of CNG at the same pressure. The experimental data obtained were correlated, using linear regression analysis, with common adsorption kinetic models (pseudo-first-order and pseudo-second-order) and isotherm models (Sips and Toth). The pseudo-second-order kinetic model gives the best fit, with a correlation coefficient of 0.9945 at 35 bar. For adsorption isotherms, the Sips model shows the better fit, with a regression coefficient (R2) of 0.9982 and the lowest RSMD value of 0.0148. The findings reveal the potential of the adsorbent in natural gas storage applications.
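The linear-regression fit of the pseudo-second-order kinetic model mentioned in this abstract can be illustrated with a minimal sketch. It assumes the commonly used linearized form t/q_t = 1/(k2·qe²) + t/qe, where q_t is the uptake at time t, qe the equilibrium uptake, and k2 the rate constant. The function names and the data points are hypothetical and synthetic; they are not taken from the paper.

```python
def fit_pseudo_second_order(times, uptakes):
    """Least-squares fit of t/q_t = 1/(k2*qe^2) + t/qe.

    times and uptakes must be strictly positive.
    Returns (qe, k2, r_squared).
    """
    xs = list(times)
    ys = [t / q for t, q in zip(times, uptakes)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # slope = 1/qe
    intercept = mean_y - slope * mean_x  # intercept = 1/(k2*qe^2)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    qe = 1.0 / slope
    k2 = slope ** 2 / intercept
    return qe, k2, r_squared


def pso_uptake(t, qe, k2):
    """Uptake predicted by the pseudo-second-order model at time t."""
    return (k2 * qe ** 2 * t) / (1.0 + k2 * qe * t)


# Synthetic data generated from known parameters (illustration only).
times = [5.0, 10.0, 20.0, 40.0, 80.0]
uptakes = [pso_uptake(t, qe=2.0, k2=0.05) for t in times]
qe, k2, r2 = fit_pseudo_second_order(times, uptakes)
```

Because the synthetic points lie exactly on the model curve, the fit recovers the generating parameters; real measurements would scatter around the line and give a coefficient of determination below 1.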

Keywords: natural gas, adsorbent, compressed natural gas, adsorption

Procedia PDF Downloads 57
14668 The First Language of Humanity is Body Language Neither Mother or Native Language

Authors: Badriah Khaleel

Abstract:

Language acquisition is one of the most striking aspects of human development. It is a startling feat, which has engrossed the attention of linguists for generations. The present study will explore the hidden identities and attributes of nonverbal gestures. The current research will reflect the significant role of body language as not mere body gestures or facial expressions but as the first language of humanity.

Keywords: a startling feat, a new horizon for linguists to rethink, explore the hidden identities and attributes of non-verbal gestures, English as a third language, the first language of humanity

Procedia PDF Downloads 494
14667 How Unicode Glyphs Revolutionized the Way We Communicate

Authors: Levi Corallo

Abstract:

Language typed by humans on computers and cell phones differs significantly from previous modes of written exchange. While acronyms remain one of the most prominent markers of typed language, another and perhaps more recent revolution in the way humans communicate has been the use of symbols or glyphs, primarily Emojis, which were globally introduced on the iPhone keyboard by Apple in 2008. This paper seeks to analyze the use of symbols in typed communication from both a linguistic and a machine learning perspective. The Unicode system will be explored, and methods of encoding will be juxtaposed with current machine and human perception. The paper explores how typed symbols are used in conversation, as well as current research methods dealing with Emojis, such as sentiment analysis and predictive text models. This study proposes that sequential analysis is a significant feature for analyzing Unicode characters in a corpus with machine learning. Current models that try to learn or translate the meaning of Emojis should start by learning from bi- and tri-grams of Emoji, as well as by observing the relationship between combinations of different Emoji in tandem. The sociolinguistics of an entirely new vernacular, referred to here as ‘typed language’, will also be delineated in my analysis of Unicode glyphs from both a semantic and a technical perspective.
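The proposal to learn from bi- and tri-grams of Emoji can be sketched as follows. This is only an illustration under simplifying assumptions: the code-point ranges in `EMOJI_RANGES` are a rough, hypothetical approximation of what counts as an emoji, and zero-width-joiner sequences and variation selectors are not handled (they would split a run). All function names are invented for this example.

```python
from collections import Counter

# Rough approximation of emoji code points (an assumption, not a full list).
EMOJI_RANGES = [(0x1F300, 0x1FAFF), (0x2600, 0x27BF)]


def is_emoji(ch):
    """Return True if the character falls in one of the assumed ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in EMOJI_RANGES)


def emoji_runs(text):
    """Yield each maximal run of adjacent emoji characters as a list."""
    run = []
    for ch in text:
        if is_emoji(ch):
            run.append(ch)
        elif run:
            yield run
            run = []
    if run:
        yield run


def emoji_ngrams(messages, n):
    """Count n-grams of adjacent emoji across a corpus of messages."""
    counts = Counter()
    for message in messages:
        for run in emoji_runs(message):
            for i in range(len(run) - n + 1):
                counts[tuple(run[i:i + n])] += 1
    return counts


# Hypothetical mini-corpus, written with escapes to avoid encoding issues.
LAUGH, CRY = "\U0001F602", "\U0001F62D"
messages = ["so funny " + LAUGH + LAUGH + CRY, LAUGH + CRY + " ok", "no emoji here"]
bigrams = emoji_ngrams(messages, 2)
trigrams = emoji_ngrams(messages, 3)
```

Counting n-grams only within runs of adjacent emoji keeps the sequence information the abstract argues for, rather than treating each emoji as an isolated token.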

Keywords: unicode, text symbols, emojis, glyphs, communication

Procedia PDF Downloads 190
14666 Implementing a Database from a Requirement Specification

Authors: M. Omer, D. Wilson

Abstract:

Creating a database schema is essentially a manual process. The information contained in a requirement specification has to be analyzed and reduced to a set of tables, attributes and relationships. This is a time-consuming process that has to go through several stages before an acceptable database schema is achieved. The purpose of this paper is to implement a Natural Language Processing (NLP) based tool that produces a database schema from a requirement specification. Stanford CoreNLP version 3.3.1 and the Java programming language were used to implement the proposed model. The outcome of this study indicates that the first draft of a relational database schema can be extracted from a requirement specification by using NLP tools and techniques with minimum user intervention. This method is therefore a step towards a solution that requires little or no user intervention.

Keywords: information extraction, natural language processing, relation extraction

Procedia PDF Downloads 254
14665 The Role of Art and Music in Enriching Adult Learning in Maltese as a Second Language

Authors: Jacqueline Zammit

Abstract:

Currently, a considerable number of individuals from different backgrounds are being drawn to Malta due to its favourable environment for business, investment, and employment. This influx has led to a growing interest among expats in learning Maltese as a second language (ML2) to enrich their experience of working and residing in Malta. However, the intricacies of Maltese grammar, particularly challenging for second language (L2) learners unfamiliar with Arabic, can pose difficulties in the learning process. Furthermore, it's worth noting that the teaching of ML2 is an emerging field with limited existing research on effective pedagogical strategies. The realm of second language acquisition (SLA) can be notably demanding for adults, requiring well-founded interventions to facilitate learning. Among these interventions, approaches grounded in empirical evidence have incorporated artistic and musical elements to augment SLA. Both art and music have proven roles in facilitating L2 communication, aiding vocabulary retention, and improving comprehension skills. This study aims to delve into the utilization of music and art as catalysts for enhancing the progress of adult learners in mastering ML2. The research employs a qualitative methodology, employing a sample selected through convenience sampling, which encompassed 37 adult learners of ML2. These participants engaged in individual interviews. The data derived from these interviews were subjected to thorough analysis. The outcomes of the study underscore the substantial positive influence exerted by art and music on the academic advancement of adult ML2 learners. Notably, it emerged from the participants' accounts that the current ML2 curricula lack the integration of art and music. Therefore, this study advocates for the incorporation of art and music components within both traditional classroom settings and online ML2 courses. 
The intention is to bolster the academic accomplishments of adult learners in the realm of Maltese as a second language, bridging the current gap between theory and practice.

Keywords: academic accomplishment, mature learners, visual art, learning Maltese as a second language, musical involvement, acquiring a second language

Procedia PDF Downloads 71
14664 Pragmatic Competence in Pakistani English Language Learners

Authors: Ghazala Kausar

Abstract:

This study investigates Pakistani first-year university students’ perception of the role of pragmatics in their general approach to learning English. The research is triggered by the National Curriculum’s initiative to provide students with holistic opportunities for language development and to equip them with the competencies to use English in academic and social contexts (New English National Curriculum for I-XII). The traditional grammar-translation and examination-oriented method is believed to reduce learners to silent listeners (Zhang, 2008; Zhao, 2009). This leads to students’ inability to interpret discourse by relating utterances to their meaning and by understanding the intentions of the users and how language is used in specific settings (Bachman & Palmer, 1996, 2010). Pragmatic competence is a neglected area as far as teaching and learning English in Pakistan is concerned. This study focuses on the different types of pragmatic knowledge, learners’ perception of such knowledge, and the learning strategies employed by different learners to process learning in general and pragmatics in particular. The study employed three data collection tools (a questionnaire, a discourse completion task and interviews) to elicit data from first-year university students regarding their perception of pragmatic competence. Results showed that Pakistani first-year university learners have limited pragmatic knowledge. Although they acknowledged the importance of linguistic knowledge for linguistic competence, they argued that insufficient English proficiency, limited knowledge of pragmatics, and insufficient language material and tasks were the major reasons for pragmatic failure.

Keywords: pragmatic competence, Pakistani college learners, linguistic competence

Procedia PDF Downloads 730
14663 How Western Donors Allocate Official Development Assistance: New Evidence From a Natural Language Processing Approach

Authors: Daniel Benson, Yundan Gong, Hannah Kirk

Abstract:

Advances in natural language processing techniques have increased data processing speeds and reduced the need for the cumbersome, manual processing that is often required when data from multilateral organizations are prepared for specific purposes. Using named entity recognition (NER) modeling and the Organisation for Economic Co-operation and Development (OECD) Creditor Reporting System database, we present the first geotagged dataset of OECD donor Official Development Assistance (ODA) projects on a global, subnational basis. Our resulting data contains 52,086 ODA projects geocoded to subnational locations across 115 countries, worth a combined $87.9bn. This represents the first global, OECD donor ODA project database with geocoded projects. We use this new data to revisit old questions of how ‘well’ donors allocate ODA to the developing world. This understanding is imperative for policymakers seeking to improve ODA effectiveness.

Keywords: international aid, geocoding, subnational data, natural language processing, machine learning

Procedia PDF Downloads 66
14662 Play-Based Approaches to Stimulate Language

Authors: Sherri Franklin-Guy

Abstract:

The emergence of language in young children has been well-documented and play-based activities that support its continued development have been utilized in the clinic-based setting. Speech-language pathologists have long used such activities to stimulate the production of language in children with speech and language disorders via modeling and elicitation tasks. This presentation will examine the importance of play in the development of language in young children, including social and pragmatic communication. Implications for clinicians and educators will be discussed.

Keywords: language development, language stimulation, play-based activities, symbolic play

Procedia PDF Downloads 234
14661 Quantitative, Preservative Methodology for Review of Interview Transcripts Using Natural Language Processing

Authors: Rowan P. Martnishn

Abstract:

During the execution of a National Endowment for the Arts grant, approximately 55 interviews were collected from professionals across various fields. These interviews were used to create deliverables: historical connections for creations that began as art and evolved entirely into computing technology. With dozens of hours’ worth of transcripts to be analyzed by qualitative coders, a quantitative methodology was created to sift through the documents. The initial step was to clean and format all the data. First, a basic spelling and grammar check was applied, along with a Python script for normalized formatting, which used an open-source grammatical formatter to make the data as coherent as possible. Ten documents were randomly selected for manual review; words that were often transcribed incorrectly were recorded and replaced throughout all other documents. Then, to remove banter and side comments, the transcripts were split into paragraphs (separated by change in speaker), and all paragraphs with fewer than 300 characters were removed. Second, a keyword extractor, a form of natural language processing in which significant words in a document are selected, was run on each paragraph of every interview. Every proper noun was put into a data structure corresponding to its interview. From there, a Bidirectional and Auto-Regressive Transformer (B.A.R.T.) summary model was applied to each paragraph that included any of the proper nouns selected from the interview. At this stage, the information to review had been reduced from about 60 hours’ worth of data to 20. The data was further processed through light manual observation: any summaries that fit the criteria of the proposed deliverable were selected, as well as their locations within the document. This narrowed the data down to about 5 hours’ worth of processing. 
The qualitative researchers were then able to find 8 more connections in addition to our previous 4, exceeding our minimum quota of 3 to satisfy the grant. Major findings of the study and subsequent curation of this methodology raised a conceptual finding crucial to working with qualitative data of this magnitude. In the use of artificial intelligence there is a general trade off in a model between breadth of knowledge and specificity. If the model has too much knowledge, the user risks leaving out important data (too general). If the tool is too specific, it has not seen enough data to be useful. Thus, this methodology proposes a solution to this tradeoff. The data is never altered outside of grammatical and spelling checks. Instead, the important information is marked, creating an indicator of where the significant data is without compromising the purity of it. Secondly, the data is chunked into smaller paragraphs, giving specificity, and then cross-referenced with the keywords (allowing generalization over the whole document). This way, no data is harmed, and qualitative experts can go over the raw data instead of using highly manipulated results. Given the success in deliverable creation as well as the circumvention of this tradeoff, this methodology should stand as a model for synthesizing qualitative data while maintaining its original form.
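The chunk-and-mark idea described in this abstract can be sketched roughly as below. This is a minimal sketch, not the authors' script: it assumes a hypothetical transcript format of `Speaker: text` lines, takes the proper nouns as a ready-made input instead of running a keyword extractor, and omits the summarization step. Only the 300-character threshold comes from the abstract; every name is invented for illustration.

```python
import re

MIN_CHARS = 300  # threshold stated in the abstract

# Assumed line format: "Speaker Name: utterance" (hypothetical).
SPEAKER_LINE = re.compile(r"^([A-Z][\w .'-]*):\s*(.*)$")


def split_by_speaker(transcript):
    """Split a transcript into (speaker, text) paragraphs at speaker changes."""
    paragraphs = []
    for line in transcript.splitlines():
        line = line.strip()
        if not line:
            continue
        match = SPEAKER_LINE.match(line)
        if match and paragraphs and paragraphs[-1][0] == match.group(1):
            paragraphs[-1][1] += " " + match.group(2)   # same speaker continues
        elif match:
            paragraphs.append([match.group(1), match.group(2)])
        elif paragraphs:
            paragraphs[-1][1] += " " + line             # wrapped line
    return [(speaker, text) for speaker, text in paragraphs]


def mark_paragraphs(transcript, proper_nouns, min_chars=MIN_CHARS):
    """Return (index, speaker, matched nouns) for paragraphs worth reviewing.

    The transcript itself is never modified; only locations are reported.
    """
    marked = []
    for index, (speaker, text) in enumerate(split_by_speaker(transcript)):
        if len(text) < min_chars:
            continue                                    # drop banter and asides
        hits = sorted(noun for noun in proper_nouns if noun in text)
        if hits:
            marked.append((index, speaker, hits))
    return marked


# Hypothetical example transcript.
long_answer = "We built the first plotter at Bell Labs. " + "It drew slowly but precisely. " * 12
filler = "Nothing notable happened after that. " * 10
transcript = "\n".join([
    "Interviewer: Thanks for joining.",
    "Guest: " + long_answer,
    "Interviewer: Great.",
    "Guest: " + filler,
])
marked = mark_paragraphs(transcript, {"Bell Labs", "Xerox"})
```

Returning paragraph indices instead of rewritten text mirrors the abstract's point that the important information is marked while the raw data stays untouched.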

Keywords: B.A.R.T.model, keyword extractor, natural language processing, qualitative coding

Procedia PDF Downloads 21
14660 The Psycho-Linguistic Aspect of Translation Gaps in Teaching English for Specific Purposes

Authors: Elizaveta Startseva, Elena Notina, Irina Bykova, Valentina Ulyumdzhieva, Natallia Zhabo

Abstract:

Given the various existing models of intercultural communication, which contain a vast number of stages of foreign language acquisition, there is a need for conscious perception of the foreign culture. This process is associated with the emergence of linguistic conflict and with students’ consistent desire to resolve language differences along with cultural discrepancies. The aim of this study is to present modern ways and methods of removing psycholinguistic conflict through skills development in professional translation and intercultural communication. The study was conducted in groups of first- to fourth-year students of the Medical Institute and the Agro-Technological Institute of RUDN University. In the course of training, students gained knowledge in such disciplines as basic English grammar and vocabulary, phonetics, lexicology, introduction to linguistics, theory of translation, and annotating and referencing media texts and specialist texts. The students learned to present their research work and took part in university and off-site conferences with their reports and presentations. Common strategies for removing linguistic and cultural conflict include developing such abilities of a language personality as a commitment to communication and cooperation, cultural awareness and empathy towards other cultures, realistic self-esteem, emotional stability, tolerance, etc. Mastering a foreign language and the culture of the target language leads to a reduplication of linguistic identity, which in turn leads to the gradual formation of the so-called 'secondary linguistic personality.' In our study, we tried to approach the problem comprehensively, focusing on translation gaps in technical and non-technical language, for which a typology that classifies all lacunae on a single principle is still missing. 
When obtaining background knowledge, students learn to overcome the difficulties posed by the national and linguistic differences between cultures in contact, i.e., to eliminate the gaps (to fill them in and compensate for them). Compensating for a gap is a means of fixing it and is the initial phase of elimination; in some cases, but not all, it is followed by filling the semantic void (plenus). The concept of plenus applies in most cases of translation gaps, for example in transcription and transliteration (intercultural terms and exoticisms) and in replication (reproduction of the morphemic structure of words or idioms). In all the above cases, the translator’s task is to ensure an identical response from the receptors of the original and translated texts, since any statement is created with the goal of achieving a communicative effect, and hence pragmatic potential is the most important part of its content. The practical value of our work lies in improving the methodology of teaching English for specific purposes on the basis of the psycholinguistic concept of the secondary language personality.

Keywords: lacuna, language barrier, plenus, secondary language personality

Procedia PDF Downloads 281
14659 Kinaesthetic Method in Apprenticeship Training: Support for Finnish Learning in Vocational Education

Authors: Inkeri Jääskeläinen

Abstract:

The purpose of this study is to shed light on what it is like to study in apprenticeship training using Finnish as a second language. This study examines the stories and experiences of apprenticeship students learning and studying Finnish as part of their vocational studies. This pilot study also examines the effects of learning to pronounce Finnish through body motions and gestures. Many foreign students choose apprenticeships and start vocational training too early, while their language skills in Finnish are still very weak. Both duties at work and school assignments require reasonably good general language skills (B1.1) and, especially at work, language skills are also a safety issue. At work, students should be able to learn Finnish and do vocational studies simultaneously in a noisy, demanding, and stressful environment. Learning and understanding new things is very challenging under these circumstances, and sometimes students get exhausted and experience a lot of stress, which makes learning even more difficult. Students are different from each other and so are their ways of learning. Therefore, one of the most important features of apprenticeship training and second language learning is a good understanding of adult learners and their needs. Kinaesthetic methods are an effective way to support adult students’ cognitive skills and make learning more relaxing and fun. Empirical findings show that language learning can indeed be supported in physical ways, by body motions and gestures. The method used here, named TFFL (Touch and Feel Foreign Languages), was designed to support adult language learning, to correct or prevent language fossilization and to help the student manage emotions. Finnish is considered a difficult language to learn, mostly because it is so different from nearly all other languages. 
Many learners complain that they are lost or confused, and there is a need to find a way to learn the language while also handling the negative emotions that come from the Finnish language and the learning process itself. Due to the nature of the Finnish language, good pronunciation skills are needed just to understand the way the language works. Movements (body movements etc.) are a natural part of many cultures, but not Finnish culture. In Finland, students have traditionally been expected to stay still, and that does not come naturally to many foreign students. However, the kinaesthetic TFFL method proved to be a useful way to help some L2 students feel phonemes, rhythm and intonation, to improve their Finnish and, thereby, also to complete their vocational studies successfully.

Keywords: Finnish, fossilization, interference, kinaesthetic method

Procedia PDF Downloads 100
14658 Meaningful Habit for EFL Learners

Authors: Ana Maghfiroh

Abstract:

Learning a foreign language requires great effort from learners themselves if their language ability is to improve day by day. They also need support from everyone around them, including teachers and friends, as well as activities that encourage them to speak the language. When those activities are developed into a habit that is practised regularly, they help improve the students’ language competence. This qualitative study aimed to identify and describe the activities implemented in Pesantren Al Mawaddah, Ponorogo, to teach the students a foreign language. To collect the data, the researcher used interviews, questionnaires, and documentation. The study found that Pesantren Al Mawaddah had successfully built in its students the habit of speaking the target language. For more than 15 hours a day, students were required to speak a foreign language, Arabic or English, in turn. The aim was to habituate the students to staying in touch with the target language. The habit was developed through daily language activities, such as dawn vocabulary giving, dictionary handling, daily language use, speech training and intensive language courses, daily language input, and night vocabulary memorizing. That habit then developed the students’ awareness of the language learned and promoted their language mastery.

Keywords: habit, communicative competence, daily language activities, Pesantren

Procedia PDF Downloads 532
14657 Kinaesthetic Method in Apprenticeship Training: Support for Finnish Learning in Vocational Education and Training

Authors: Inkeri Jaaskelainen

Abstract:

The purpose of this study is to shed light on what it is like to study in apprenticeship training using Finnish as a second language. This study examines the stories and experiences of apprenticeship students learning and studying Finnish as part of their vocational studies. This pilot study also examines the effects of learning to pronounce Finnish through body motions and gestures. Many foreign students choose apprenticeships and start vocational training too early, while their language skills in Finnish are still very weak. Both duties at work and school assignments require reasonably good general language skills (B1.1), and, especially at work, language skills are also a safety issue. At work, students should be able to learn Finnish and do vocational studies simultaneously in a noisy, demanding, and stressful environment. Learning and understanding new things is very challenging under these circumstances, and sometimes students get exhausted and experience a lot of stress, which makes learning even more difficult. Students are different from each other and so are their ways of learning. Therefore, one of the most important features of apprenticeship training and second language learning is a good understanding of adult learners and their needs. Kinaesthetic methods are an effective way to support adult students’ cognitive skills and make learning more relaxing and fun. Empirical findings show that language learning can indeed be supported in physical ways, by body motions and gestures. The method used here, named TFFL (Touch and Feel Foreign Languages), was designed to support adult language learning, to correct or prevent language fossilization, and to help the student manage emotions. Finnish is considered a difficult language to learn, mostly because it is so different from nearly all other languages. 
Many learners complain that they are lost or confused, and there is a need to find a way to learn the language while also handling the negative emotions that come from the Finnish language and the learning process itself. Due to the nature of the Finnish language, good pronunciation skills are needed just to understand the way the language works. Movements (body movements etc.) are a natural part of many cultures, but not Finnish culture. In Finland, students have traditionally been expected to stay still, and that does not come naturally to many foreign students. However, the kinaesthetic TFFL method proved to be a useful way to help some L2 students feel phonemes, rhythm, and intonation, to improve their Finnish, and, thereby, also to complete their vocational studies successfully.

Keywords: Finnish, fossilization, interference, kinaesthetic method

Procedia PDF Downloads 130
14656 Formation of Clipped Forms in Hausa Language

Authors: Maryam Maimota Shehu

Abstract:

Words are the basic building blocks of a language. In everyday usage of a language, words are used, and new words are formed and reformed, in order to accommodate all entities, phenomena, qualities and every other aspect of life. Although many studies have been conducted on morphological processes in the Hausa language, most of them concentrate on borrowing, affixation, reduplication and derivation; clipping has been neglected to the extent that only a few scholars have cited examples of it in the language. The current study therefore investigates clipping as one of the word-formation processes found in the language. It focuses on how this process is used in the formation of words and on the occurrence of those words in Hausa sentences. To achieve these aims, the research answers two questions: 1) Is clipping used as a process of word formation in Hausa? 2) What are the words formed using this process? The study uses the Natural Morphology Theory proposed by Dressler (1985), which was adopted by Belly (2007). The data were collected from newspaper articles, novels, and written literature in the Hausa language. The study found that many kinds of words are formed in Hausa through clipping, in sentences and in discourse, which previous studies neither revealed nor explained in detail. The findings also show that clipping in Hausa applies to nouns, verbs, adjectives, reduplicated words and compounds, while the clipped forms retain their meanings and grammatical classes.

Keywords: clipping, Hausa language, morphology, word formation processes

Procedia PDF Downloads 458
14655 Variation in Complement Order in English: Implications for Interlanguage Syntax

Authors: Juliet Udoudom

Abstract:

Complement ordering principles of natural language phrases (XPs) stipulate that Head terms be consistently placed phrase-initially or phrase-finally, yielding two basic theoretical orders: Head–Complement order or Complement–Head order. This paper examines the principles which determine complement ordering in English V-bar and N-bar structures. The aim is to determine the extent to which complement linearisations in the two phrase types are consistent with the two theoretical orders outlined above, given the flexible and varied nature of natural language structures. The objective is to see whether there are variations in the complement linearisations of the XPs studied and what implications such variations hold for the interlanguage syntax of English and Ibibio. A corpus-based approach was employed in obtaining the English data. V-bar and N-bar structures containing complement structures were isolated for analysis. Data were examined from the perspective of the X-bar and Government theories of Chomsky’s (1981) Government-Binding framework. Findings from the analysis show that in V-bar structures in English, heads are consistently placed phrase-initially, yielding a Head–Complement order; however, complement linearisation in the N-bar structures studied exhibited parametric variations. Thus, in some N-bar structures in English the nominal head is ordered to the left, whereas in others the head term occurs to the right. It may therefore be concluded that the principles which determine complement ordering are both language-particular and phrase-specific, following insights provided within Phrasal Syntax.

Keywords: complement order, complement–head order, head–complement order, language–particular principles

Procedia PDF Downloads 340
14654 Programming with Grammars

Authors: Peter M. Maurer

Abstract:

DGL is a context free grammar-based tool for generating random data. Many types of simulator input data require some computation to be placed in the proper format. For example, it might be necessary to generate ordered triples in which the third element is the sum of the first two elements, or it might be necessary to generate random numbers in some sorted order. Although DGL is universal in computational power, generating these types of data is extremely difficult. To overcome this problem, we have enhanced DGL to include features that permit direct computation within the structure of a context free grammar. The features have been implemented as special types of productions, preserving the context free flavor of DGL specifications.
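The idea of mixing grammar expansion with direct computation can be illustrated with a minimal Python sketch. This is not DGL syntax and not the author's implementation: the grammar table `GRAMMAR`, the helper `expand`, and the functions `sum_triple` and `sorted_numbers` are hypothetical names introduced only for illustration. Random numbers are produced by expanding a small context-free grammar, and the computed parts (the sum, the sorting) are ordinary code layered on top of that expansion.

```python
import random

# Hypothetical toy grammar: a number is one or two digits.
GRAMMAR = {
    "number": [["digit"], ["digit", "digit"]],
    "digit": [[d] for d in "0123456789"],
}

def expand(symbol, grammar, rng):
    """Expand a symbol by choosing one of its productions at random."""
    if symbol not in grammar:
        return [symbol]  # terminal symbol: emit as-is
    production = rng.choice(grammar[symbol])
    tokens = []
    for part in production:
        tokens.extend(expand(part, grammar, rng))
    return tokens

def random_number(rng):
    """Generate an integer from the grammar's 'number' non-terminal."""
    return int("".join(expand("number", GRAMMAR, rng)))

def sum_triple(rng):
    """Ordered triple whose third element is the sum of the first two."""
    a = random_number(rng)
    b = random_number(rng)
    return (a, b, a + b)

def sorted_numbers(rng, count):
    """Random numbers emitted in sorted order."""
    return sorted(random_number(rng) for _ in range(count))
```

Calling `sum_triple(random.Random(0))` yields a triple such as `(a, b, a + b)`; the exact values depend on the seed.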

Keywords: DGL, Enhanced Context Free Grammars, Programming Constructs, Random Data Generation

Procedia PDF Downloads 141
14653 Rethinking Literary Language: A Philsophicus-Logico Approach. The Novel ‘’ Sympathizer ‘’ as a Case Study

Authors: Oublal Ali

Abstract:

The scholarly attention given to Ludwig Wittgenstein since the appearance of the Tractatus results from the revolutionary shift he made in the conception of language. His first and foremost concern was to solve a problem of language that philosophers had failed to recognize. This concern runs through not only the Tractarian approach to language, which argues that philosophers failed to understand the logic of language, but also his later conception, which is developed in the Philosophical Investigations and the remainder of his remarks. On this basis, it is claimed that Wittgenstein’s theory of language should not be confined to language within philosophical streams. With this premise, we propose to read analytically one of the literary propositions in The Sympathizer as a linguistic corpus. Our investigation of the literary proposition leads us to claim that Wittgenstein’s language games (his later philosophy) are apposite to the analysis of literary works, thanks to the shift Wittgenstein made from a demarcated use of language to the multiplicity and non-uniformity of its use.

Keywords: language, context, use, language games, literary propositions

Procedia PDF Downloads 124
14652 Predicting Personality and Psychological Distress Using Natural Language Processing

Authors: Jihee Jang, Seowon Yoon, Gaeun Son, Minjung Kang, Joon Yeon Choeh, Kee-Hong Choi

Abstract:

Background: Self-report multiple-choice questionnaires have been widely utilized to quantitatively measure one’s personality and psychological constructs. Despite several strengths (e.g., brevity and utility), self-report multiple-choice questionnaires have considerable inherent limitations. With the rise of machine learning (ML) and natural language processing (NLP), researchers in the field of psychology are widely adopting NLP to assess psychological constructs and predict human behaviors. However, there is a lack of connection between the work being performed in computer science and that in psychology due to small data sets and unvalidated modeling practices. Aims: The current article introduces the study method and procedure of phase 2, which includes the interview questions for the five-factor model (FFM) of personality developed in phase 1. This study aims to develop the interview (semi-structured) and open-ended questions for the FFM-based personality assessments, specifically designed with experts in the field of clinical and personality psychology (phase 1), and to collect the personality-related text data using the interview questions and self-report measures on personality and psychological distress (phase 2). The purpose of the study includes examining the relationship between natural language data obtained from the interview questions, measuring the FFM personality constructs, and psychological distress to demonstrate the validity of the natural language-based personality prediction. Methods: The phase 1 (pilot) study was conducted on fifty-nine native Korean adults to acquire the personality-related text data from the interview (semi-structured) and open-ended questions based on the FFM of personality. The interview questions were revised and finalized with the feedback from the external expert committee, consisting of personality and clinical psychologists.
Based on the established interview questions, a total of 425 Korean adults were recruited using a convenience sampling method via an online survey. The text data collected from interviews were analyzed using natural language processing. The results of the online survey, including demographic data, depression, anxiety, and personality inventories, were analyzed together in the model to predict individuals’ FFM of personality and the level of psychological distress (phase 2).

Keywords: personality prediction, psychological distress prediction, natural language processing, machine learning, the five-factor model of personality

Procedia PDF Downloads 76
14651 The Role of Specificity in Mastering the English Article System

Authors: Sugene Kim

Abstract:

The English articles are taught as a binary system based on nominal countability and definiteness. Despite the detailed rules of prescriptive grammar, it has been consistently reported in the literature that their correct usage is extremely difficult to master even for advanced learners of English as a second language (ESL) or a foreign language (EFL). Given that an English sentence (except for an imperative) cannot be constructed without a noun, which is always paired with one of the indefinite, definite, and zero articles, it is essential to understand specifically what causes ESL/EFL learners to misuse them. To that end, this study examined EFL learners’ article use employing a one-group pre–post-test design. Forty-three Korean college students received instruction on correct English article usage for two 75-minute classes employing the binary schema set up for the study. They also practiced in class how to apply the rules as instructed. Then, the participants were assigned a forced-choice elicitation task, which was also used as a pre-test administered three months prior to the instruction. Unlike the pre-test, on which they only chose the correct article for each of the 40 items, the post-instruction task additionally asked them to give written accounts of the decision-making procedure by which they chose each article. The participants’ performance was scored manually by checking whether the answer given was correct or incorrect, and their written comments were first categorized using thematic analysis and then ranked by frequency. The analyses of the performance on the two tasks and the written think-aloud data suggested that EFL learners exhibit fluctuation between specificity and definiteness, overgeneralizing the use of the definite article for almost all cataphoric references.
It was apparent that they have trouble distinguishing between the two concepts, possibly because the former is almost never introduced in the grammar books or classes designed for ESL/EFL learners. In particular, most participants were found to be unaware of the possibility of using nouns as [+specific, –definite]. Not surprisingly, the correct answer rates for such nouns averaged 33% and 46% on the pre- and post-tests, respectively, which barely exceed half the overall mean correct answer rates of 65% on the pre-test and 81% on the post-test. In addition, correct article use for specific indefinites was the most impermeable to instruction when compared with nouns used as [–specific, –definite] or [± specific, +definite]. Such findings underline the necessity of expanding the binary schema to a ternary form that incorporates the specificity feature, even though it is not morphologically marked in the English language.

Keywords: countability, definiteness, English articles, specificity, ternary system

Procedia PDF Downloads 118
14650 Efficiency of a Semantic Approach in Teaching Foreign Languages

Authors: Genady Shlomper

Abstract:

During the process of language teaching, each teacher faces some general and some specific problems. Some of these problems are common to all languages because they follow the rules of cognition, consciousness, perception, understanding and memory, and the physiological and psychological principles pertaining to the human race irrespective of origin and nationality. Still, every language is a distinctive system, possessing individual properties and an obvious identity, as a result of development in specific natural, geographical, cultural and historical conditions. The individual properties emerge in the script, phonetics, morphology and syntax. All these problems can and should be a subject of detailed research and scientific analysis, mainly from practical considerations and language teaching requirements. There are some formidable obstacles in the language acquisition process. Among the first to be mentioned is the existence of concepts and entire categories in foreign languages which are absent in the language of the students. Such phenomena reflect specific ways of thinking and a world outlook which were shaped during the language’s evolution. Hindi is the national language of India and belongs to the group of Indo-Iranian languages within the Indo-European family of languages. The lecturer has gained experience in teaching Hindi to native speakers of Uzbek, Russian and Hebrew. He will show the difficulties in the fields of phonetics, morphology and syntax which the students have to deal with during the acquisition of the language. In the proposed lecture, the lecturer will share his experience in making the process of language teaching more efficient by using a non-formal semantic approach.

Keywords: applied linguistics, foreign language teaching, language teaching methodology, semantics

Procedia PDF Downloads 345
14649 A Context-Centric Chatbot for Cryptocurrency Using the Bidirectional Encoder Representations from Transformers Neural Networks

Authors: Qitao Xie, Qingquan Zhang, Xiaofei Zhang, Di Tian, Ruixuan Wen, Ting Zhu, Ping Yi, Xin Li

Abstract:

Inspired by the recent movement of digital currency, we are building a question answering system concerning the subject of cryptocurrency using Bidirectional Encoder Representations from Transformers (BERT). The motivation behind this work is to properly assist digital currency investors by directing them to the corresponding knowledge bases that can offer them help and increase the querying speed. BERT, one of the newest language models in natural language processing, was investigated to improve the quality of generated responses. We studied different combinations of hyperparameters of the BERT model to obtain the best-fit responses. Further, we created an intelligent chatbot for cryptocurrency using BERT. A chatbot using BERT shows great potential for the further advancement of a cryptocurrency market tool. We show that the BERT neural networks generalize well to other tasks by applying them successfully to cryptocurrency.

Keywords: bidirectional encoder representations from transformers, BERT, chatbot, cryptocurrency, deep learning

Procedia PDF Downloads 137
14648 The Use of Corpora in Improving Modal Verb Treatment in English as Foreign Language Textbooks

Authors: Lexi Li, Vanessa H. K. Pang

Abstract:

This study aims to demonstrate how native and learner corpora can be used to enhance modal verb treatment in EFL textbooks in mainland China. It contributes to a corpus-informed and learner-centered design of grammar presentation in EFL textbooks that enhances the authenticity and appropriateness of textbook language for target learners. The linguistic focus is will, would, can, could, may, might, shall, should, must. The native corpus is the spoken component of BNC2014 (hereafter BNCS2014). The spoken part is chosen because the pedagogical purpose of the textbooks is communication-oriented. Using the standard query option of CQPweb, 5% of the occurrences of each of the nine modals were sampled from BNCS2014. The learner corpus is the POS-tagged Ten-thousand English Compositions of Chinese Learners (TECCL). All the essays under the 'secondary school' section were selected. A series of five secondary coursebooks comprises the textbook corpus. All the data in both the learner and the textbook corpora were retrieved through the concordance functions of WordSmith Tools (version 5.0). Data analysis was divided into two parts. The first part compared the patterns of modal verbs in the textbook corpus and BNCS2014 with respect to distributional features, semantic functions, and co-occurring constructions to examine whether the textbooks reflect the authentic use of English. In the second part, the learner corpus was analyzed in terms of the use (distributional features, semantic functions, and co-occurring constructions) and the misuse (syntactic errors, e.g., she can sings*.) of the nine modal verbs to uncover potential difficulties that confront learners. The analysis of distribution indicates several discrepancies between the textbook corpus and BNCS2014. The four most frequent modal verbs in BNCS2014 are can, would, will, could, while can, will, should, could are the top four in the textbooks. Most strikingly, there is an unusually high proportion of can (41.1%) in the textbooks.
The results on different meanings show that will, would and must are the most problematic. For example, for will, the textbooks contain 20% more occurrences of 'volition' and 20% fewer of 'prediction' than BNCS2014. Regarding co-occurring structures, the textbooks over-represent the structure 'modal + do' across the nine modal verbs. Another major finding is that the structure 'modal + have done', which frequently co-occurs with could, would, should, and must, is underused in textbooks. In addition, these four modal verbs are the most difficult for learners, as the error analysis shows. This study demonstrates how the synergy of native and learner corpora can be harnessed to improve EFL textbook presentation of modal verbs so that textbooks provide not only authentic language used in natural discourse but also an appropriate design tailored to the needs of target learners.

Keywords: English as Foreign Language, EFL textbooks, learner corpus, modal verbs, native corpus

Procedia PDF Downloads 137
14647 Acquisition of Overt Pronoun Constraint in L2 Turkish by Adult Korean Speakers

Authors: Oktay Cinar

Abstract:

The aim of this study is to investigate the acquisition of the Overt Pronoun Constraint (OPC) by adult Korean L2 Turkish speakers in order to find out how constraints regulating the syntax of null and overt subjects are acquired. OPC is claimed to be a universal feature of all null subject languages, restricting the co-indexation between an overt embedded pronoun and quantified or wh-question antecedents. However, there is no such restriction when the embedded subject is null or the antecedent is a referential subject. Considered as a principle of Universal Grammar (UG), OPC knowledge of L2 speakers has been widely tested with different language pairs. In the light of previous studies on OPC, it can be argued that L2 learners display early sensitivity to OPC constraints during their interlanguage grammar development. In this connection, the co-indexation between the overt embedded pronoun o (third person pronoun) and a referential matrix subject is claimed to be controversial in Turkish, which poses problems for the universality of OPC. However, the current study argues against this claim by providing evidence from advanced Korean speakers that OPC is universal to all null subject languages and that OPC knowledge can be accessed with direct access to UG. In other words, the performance of adult Korean speakers on the syntax of null and overt subjects is tested to support this claim. To test this, an OPC task is used. Fifteen advanced speakers and a control group of adult native Turkish participants are instructed to determine the co-reference relationship between the subject of the embedded clause, either overt pronominal o or null, and the subject of the matrix clause, either a quantified pronoun or wh-question antecedent or a referential antecedent. They are asked to select the interpretation of the embedded subject, either as the same person as the matrix subject or as another person.
These relations are represented with four conditions, and each condition has four questions (16 questions in total). The results show that both the control group and the Korean L2 Turkish speakers display sensitivity to all the constraints that OPC imposes, which suggests that OPC works in Turkish as well.

Keywords: adult Korean speakers, binding theory, generative second language acquisition, overt pronoun constraint

Procedia PDF Downloads 305
14646 Enhancing Word Meaning Retrieval Using FastText and Natural Language Processing Techniques

Authors: Sankalp Devanand, Prateek Agasimani, Shamith V. S., Rohith Neeraje

Abstract:

Machine translation has witnessed significant advancements in recent years, but the translation of languages with distinct linguistic characteristics, such as English and Sanskrit, remains a challenging task. This research presents the development of a dedicated English-to-Sanskrit machine translation model, aiming to bridge the linguistic and cultural gap between these two languages. Using a variety of natural language processing (NLP) approaches, including FastText embeddings, this research proposes a thorough method to improve word meaning retrieval. Data preparation, part-of-speech tagging, dictionary searches, and transliteration are all included in the methodology. The study also addresses the implementation of an interpreter pattern and uses a word similarity task to assess the quality of word embeddings. The experimental outcomes show how the suggested approach may be used to enhance word meaning retrieval tasks with greater efficacy, accuracy, and adaptability. Evaluation of the model's performance is conducted through rigorous testing, comparing its output against existing machine translation systems. The assessment includes quantitative metrics such as BLEU scores, METEOR scores, Jaccard Similarity, etc.
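Of the metrics named above, Jaccard similarity is the simplest to state: the size of the intersection of two token sets divided by the size of their union. A minimal Python sketch follows, assuming whitespace tokenization and lowercasing; the abstract does not describe the authors' tokenization, and the function name `jaccard_similarity` is introduced here only for illustration.

```python
def jaccard_similarity(candidate, reference):
    """Jaccard similarity between the token sets of two strings.

    Assumes whitespace tokenization and lowercasing (an illustrative choice).
    """
    cand_tokens = set(candidate.lower().split())
    ref_tokens = set(reference.lower().split())
    if not cand_tokens and not ref_tokens:
        return 1.0  # two empty strings are treated as identical
    return len(cand_tokens & ref_tokens) / len(cand_tokens | ref_tokens)
```

For the strings "a b c" and "b c d", the shared tokens are {b, c} and the union is {a, b, c, d}, so the score is 2/4 = 0.5.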

Keywords: machine translation, English to Sanskrit, natural language processing, word meaning retrieval, fastText embeddings

Procedia PDF Downloads 35
14645 Speaking Difficulties Encountered by EFL Learners in Secondary School in Morocco

Authors: Bellali Assia, Bellali Fatima

Abstract:

Speaking is one of the most difficult English skills for non-English learners. This study investigated English-speaking difficulties encountered by non-English secondary school students in a private school in Casablanca, Morocco. The subjects were 63 students (male and female) from second-year classes. The study also aimed to investigate the degree of the main speaking difficulties and the factors affecting non-English students' ability to speak English. This research used a descriptive qualitative and quantitative approach with a questionnaire and an interview to collect the data. Among linguistically related difficulties, there were four, namely vocabulary, grammar, conversation and pronunciation. The results revealed that 40.32% of students agreed that they do not have sufficient grammar knowledge, 45.16% agreed that they do not have enough vocabulary, 45.90% agreed that they have difficulty in conversation, and 39.34% agreed that they have poor pronunciation. The results also indicated that 63.33% of students agreed that they have problems with self-confidence. The factors causing the problem of speaking English in this study were lack of general knowledge, lack of speaking practice, fear of mistakes and grammar practice, low participation, shyness, nervousness, fear of criticism, and unfamiliar word pronunciation. Furthermore, recommendations and suggestions were presented to solve the problem and eliminate difficulties for teachers and students.

Keywords: English speaking, difficulties, factors, non-English students

Procedia PDF Downloads 10
14644 Time Series Forecasting (TSF) Using Various Deep Learning Models

Authors: Jimeng Shi, Mahek Jain, Giri Narasimhan

Abstract:

Time Series Forecasting (TSF) is used to predict the target variables at a future time point based on learning from previous time points. To keep the problem tractable, learning methods use data from a fixed-length window in the past as an explicit input. In this paper, we study how the performance of predictive models changes as a function of different look-back window sizes and different amounts of time to predict into the future. We also consider the performance of the recent attention-based Transformer models, which have had good success in the image processing and natural language processing domains. In all, we compare four different deep learning methods (RNN, LSTM, GRU, and Transformer) along with a baseline method. The dataset (hourly) we used is the Beijing Air Quality Dataset from the UCI website, which includes a multivariate time series of many factors measured on an hourly basis for a period of 5 years (2010-14). For each model, we also report on the relationship between the performance and the look-back window sizes and the number of predicted time points into the future. Our experiments suggest that Transformer models have the best performance, with the lowest Mean Absolute Errors (MAE = 14.599, 23.273) and Root Mean Square Errors (RMSE = 23.573, 38.131) for most of our single-step and multi-step predictions. The best size for the look-back window to predict 1 hour into the future appears to be one day, while 2 or 4 days perform the best to predict 3 hours into the future.
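The look-back window setup and the two error measures can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the series is univariate for simplicity (the dataset in the abstract is multivariate), and the persistence forecast is only an assumed stand-in for the unspecified baseline method.

```python
import math

def make_windows(series, look_back, horizon):
    """Pair each fixed-length look-back window with the value `horizon` steps after it."""
    inputs, targets = [], []
    for start in range(len(series) - look_back - horizon + 1):
        end = start + look_back
        inputs.append(series[start:end])
        targets.append(series[end + horizon - 1])
    return inputs, targets

def persistence_forecast(inputs):
    """Assumed naive baseline: predict the last observed value of each window."""
    return [window[-1] for window in inputs]

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
```

With hourly data, a one-day look-back window corresponds to `look_back=24`, and predicting 3 hours ahead corresponds to `horizon=3`.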

Keywords: air quality prediction, deep learning algorithms, time series forecasting, look-back window

Procedia PDF Downloads 147
14643 Large Language Model Powered Chatbots Need End-to-End Benchmarks

Authors: Debarag Banerjee, Pooja Singh, Arjun Avadhanam, Saksham Srivastava

Abstract:

Autonomous conversational agents, i.e., chatbots, are becoming an increasingly common mechanism for enterprises to provide support to customers and partners. In order to rate chatbots, especially ones powered by Generative AI tools like Large Language Models (LLMs), we need to be able to accurately assess their performance. This is where chatbot benchmarking becomes important. In this paper, the authors propose the use of a benchmark that they call the E2E (End to End) benchmark and show how the E2E benchmark can be used to evaluate the accuracy and usefulness of the answers provided by chatbots, especially ones powered by LLMs. The authors evaluate an example chatbot at different levels of sophistication based on both the E2E benchmark and other available metrics commonly used in the state of the art, and observe that the proposed benchmark shows better results compared to the others. In addition, while some metrics proved to be unpredictable, the metric associated with the E2E benchmark, which uses cosine similarity, performed well in evaluating chatbots. The performance of the best models shows that there are several benefits of using the cosine similarity score as a metric in the E2E benchmark.
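A cosine similarity score between a chatbot answer and a reference answer can be sketched as follows. The abstract does not say how answers are turned into vectors, so this sketch assumes simple bag-of-words counts purely for illustration; a real benchmark would more plausibly use sentence embeddings. The names `bag_of_words`, `cosine_similarity`, and `score_answer` are hypothetical.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Token counts from lowercased, whitespace-split text (an illustrative choice)."""
    return Counter(text.lower().split())

def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(vec_a[token] * vec_b[token] for token in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty answer has no direction to compare
    return dot / (norm_a * norm_b)

def score_answer(answer, reference):
    """Score a chatbot answer against a reference answer."""
    return cosine_similarity(bag_of_words(answer), bag_of_words(reference))
```

Under this scheme, an answer with the same word counts as the reference scores close to 1, and an answer sharing no words with it scores 0.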

Keywords: chatbot benchmarking, end-to-end (E2E) benchmarking, large language model, user centric evaluation

Procedia PDF Downloads 59
14642 Contextual Toxicity Detection with Data Augmentation

Authors: Julia Ive, Lucia Specia

Abstract:

Understanding and detecting toxicity is an important problem to support safer human interactions online. Our work focuses on the important problem of contextual toxicity detection, where automated classifiers are tasked with determining whether a short textual segment (usually a sentence) is toxic within its conversational context. We use “toxicity” as an umbrella term to denote a number of variants commonly named in the literature, including hate, abuse, offence, among others. Detecting toxicity in context is a non-trivial problem and has been addressed by very few previous studies. These previous studies have analysed the influence of conversational context in human perception of toxicity in controlled experiments and concluded that humans rarely change their judgements in the presence of context. They have also evaluated contextual detection models based on state-of-the-art Deep Learning and Natural Language Processing (NLP) techniques. Counterintuitively, they reached the general conclusion that computational models tend to suffer performance degradation in the presence of context. We challenge these empirical observations by devising better contextual predictive models that also rely on NLP data augmentation techniques to create larger and better data. In our study, we start by further analysing the human perception of toxicity in conversational data (i.e., tweets), in the absence versus presence of context, in this case, previous tweets in the same conversational thread. We observed that the conclusions of previous work on human perception are mainly due to data issues: The contextual data available does not provide sufficient evidence that context is indeed important (even for humans). The data problem is common in current toxicity datasets: cases labelled as toxic are either obviously toxic (i.e., overt toxicity with swear, racist, etc. 
words), and thus context is not needed for a decision, or are ambiguous, vague or unclear even in the presence of context; in addition, the data contains labeling inconsistencies. To address this problem, we propose to automatically generate contextual samples where toxicity is not obvious (i.e., covert cases) without context or where different contexts can lead to different toxicity judgements for the same tweet. We generate toxic and non-toxic utterances conditioned on the context or on target tweets using a range of techniques for controlled text generation (e.g., Generative Adversarial Networks and steering techniques). On the contextual detection models, we posit that their poor performance is due to limitations of both the data they are trained on (the same problems stated above) and the architectures they use, which are not able to leverage context in effective ways. To improve on that, we propose text classification architectures that take the hierarchy of conversational utterances into account. In experiments benchmarking ours against previous models on existing and automatically generated data, we show that both data and architectural choices are very important. Our model achieves substantial performance improvements as compared to the baselines that are non-contextual or contextual but agnostic of the conversation structure.

Keywords: contextual toxicity detection, data augmentation, hierarchical text classification models, natural language processing

Procedia PDF Downloads 166