Search results for: Chinese natural language processing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 12408

12318 An Integrated Approach to Syllabus Design for Business Chinese

Authors: Dongshuo Wang, Minjie Xing

Abstract:

International businesses prefer to hire people who speak more than one language. With the booming of China’s market, industries and trade, business leaders are looking for people who can speak Chinese and operate successfully in a Chinese cultural context, and therefore an increasing number of tertiary students choose a Business Chinese (BC) course. As a result, BC syllabus design is urgently needed. What business knowledge should be included in China’s context? What aspects of BC culture should be included? How much Chinese language should be introduced to conduct business in China? With these research questions, this research explores a syllabus design that integrates the three aspects of subject knowledge of business in communication, business practice including the procedure of and strategies for communicating business in practice and language skills including the disciplinary and professional contexts in which linguistic choices are made. After literature review and consultancy with China-related business professionals, senior staff from business schools and representatives of students, the authors of this paper, together with language tutors drafted a syllabus based on the integrated approach to include subject knowledge, business practice and language skills. Due to the nature of this research which requires trial/test and detailed description for each correction, qualitative methods are adopted. Two in-depth focus group interviews (with 2 staff and 4 students in each group), and 18 individual interviews (8 staff and 10 students) were conducted. QDA was used for systematizing, organizing, and analysing qualitative data. 
It was discovered that the business knowledge related to a Chinese cultural context, including face value, networking skills, strategic plans for signing a contract, marketing, sales, and after-sale service, should be introduced through lectures and seminars; business practice could be implemented by students setting up their own companies, virtual or real; and language skills would be trained via writing business messages and presenting their companies in fairs and exhibitions. After a longitudinal study of trials and amendments for three years from 2013 to 2016, the syllabus was approved by staff and students and the university. Students appreciated the syllabus, as they could apply the subject knowledge into practice by using it in their own companies and Chinese language was used throughout the process. The syllabus is now ready to be used in universities offering BC, and the designing process can be applied to other new courses as well.

Keywords: business Chinese, syllabus design, business knowledge, language skills

Procedia PDF Downloads 320
12317 Digitalisation of the Railway Industry: Recent Advances in the Field of Dialogue Systems: Systematic Review

Authors: Andrei Nosov

Abstract:

This paper discusses the development directions of dialogue systems within the digitalisation of the railway industry, where technologies based on conversational AI are already potentially applied or will be applied. Conversational AI is one of the popular natural language processing (NLP) tasks, as it has great prospects for real-world applications today. At the same time, it is a challenging task as it involves many areas of NLP based on complex computations and deep insights from linguistics and psychology. In this review, we focus on dialogue systems and their implementation in the railway domain. We comprehensively review the state-of-the-art research results on dialogue systems and analyse them from three perspectives: type of problem to be solved, type of model, and type of system. In particular, from the perspective of the type of tasks to be solved, we discuss characteristics and applications. This will help to understand how to prioritise tasks. In terms of the type of models, we give an overview that will allow researchers to become familiar with how to apply them in dialogue systems. By analysing the types of dialogue systems, we propose an unconventional approach in contrast to colleagues who traditionally contrast goal-oriented dialogue systems with open-domain systems. Our view focuses on considering retrieval and generative approaches. Furthermore, the work comprehensively presents evaluation methods and datasets for dialogue systems in the railway domain to pave the way for future research. Finally, some possible directions for future research are identified based on recent research results.

Keywords: digitalisation, railway, dialogue systems, conversational AI, natural language processing, natural language understanding, natural language generation

Procedia PDF Downloads 33
12316 Motivation and Quality Teaching of Chinese Language: Analysis of Secondary School Studies

Authors: Robyn Moloney, HuiLing Xu

Abstract:

Many countries wish to produce Asia-literate citizens, through language education. International contexts of Chinese language education are seeking pedagogical innovation to meet local contextual factors frequently holding back learner success. In multicultural Australia, innovative pedagogy is urgently needed to support motivation in sustained study, with greater strategic integration of technology. This research took a qualitative approach to identify need and solutions. The paper analyses strategies that three secondary school teachers are adopting to meet specific challenges in the Australian context. The data include teacher interviews, classroom observations and student interviews. We highlight the use of task-based learning and differentiated teaching for multilevel classes, and the role which digital technologies play in facilitating both areas. The strategy examples are analysed in reference both to a research-based framework for describing quality teaching, and to current understandings of motivation in language learning. The analysis of data identifies learning featuring deep knowledge, higher-order thinking, engagement, social support, utilisation of background knowledge, and connectedness, all of which work towards the learners having a sense of autonomy and an imagination of becoming an adult Chinese language user.

Keywords: Chinese pedagogy, digital technologies, motivation, secondary school

Procedia PDF Downloads 231
12315 Controlling Drone Flight Missions through Natural Language Processors Using Artificial Intelligence

Authors: Sylvester Akpah, Selasi Vondee

Abstract:

Drones, also known as Unmanned Aerial Vehicles (UAVs), have attracted increasing attention in recent years due to their ubiquitous nature and boundless applications in communication, surveying, aerial photography, weather forecasting, medical delivery, and surveillance, amongst others. Operated remotely in real time or pre-programmed, drones can fly autonomously or on pre-defined routes. The application of these aerial vehicles has spread worldwide thanks to technological evolution, and many more businesses are now utilizing their capabilities. Unfortunately, while drones offer the benefits stated above, they are riddled with some problems, mainly attributed to the complexities of learning how to master drone flights, collision avoidance, and enterprise security. Additional challenges remain: the analysis of flight data recorded by sensors attached to the drone may take time and require expert help to understand. This paper presents an autonomous drone control system using a chatbot. The system allows for easy control of drones through conversation with the aid of Natural Language Processing, thereby reducing the workload needed to set up, deploy, control, and monitor drone flight missions. The results obtained at the end of the study revealed that the drone connected to the chatbot was able to initiate flight missions with just text and voice commands, enable conversation, and give real-time feedback from data and requests made to the chatbot. The results further revealed that the system was able to process natural language and produce human-like conversational abilities using Artificial Intelligence (Natural Language Understanding). It is recommended that radio signal adapters be used instead of wireless connections to increase the range of communication with the aerial vehicle.

Keywords: artificial intelligence, chatbot, natural language processing, unmanned aerial vehicle

Procedia PDF Downloads 117
12314 A Study from Language and Culture Perspective of Human Needs in Chinese and Vietnamese Euphemism Languages

Authors: Quoc Hung Le Pham

Abstract:

Human beings are motivated to satisfy physiological and psychological needs. Among the fundamental needs, bodily excretion is the most basic one; physiological excretion refers to the final products discharged from the body. This physiological process is a common human phenomenon. Bodily secretion is entirely natural, yet people of various nationalities have, through the ages, avoided naming it directly. Terms like ‘shit’ are often regarded as dirty, smelly and vulgar, and they lead people to negative thinking. In fact, it is part of human psychology to avoid such unsightly terms, especially in social situations where one has to take care of one's image and yet has to relieve oneself. The best way to solve this is to use euphemism: people prefer to say ‘answering nature's call’ or ‘to pass a motion’ instead. Both Chinese and Vietnamese speakers prefer to use euphemisms for bodily secretions, so this research takes this phenomenon as its object and aims to explore the similarities and dissimilarities between the euphemisms of the two languages. The focus of this paper is the human physiological phenomenon of excretion. As the preliminary results show, the factors that most deeply affect the expression of bodily secretions are language and culture. In terms of language factors, both languages use assonance to replace terms for bodily discharge, whilst the dissimilarities are metonymy, loan words and personification. In terms of cultural factors, the convergences are metonymy and the application of semantically contrary word euphemisms, whilst the difference is that Chinese euphemism uses allusion but Vietnamese euphemism does not.

Keywords: cultural factors, euphemism, human needs, language factors

Procedia PDF Downloads 263
12313 Corpus-Based Neural Machine Translation: Empirical Study Multilingual Corpus for Machine Translation of Opaque Idioms - Cloud AutoML Platform

Authors: Khadija Refouh

Abstract:

Culture-bound expressions have been a bottleneck for Natural Language Processing (NLP) and comprehension, especially in the case of machine translation (MT). In the last decade, the field of machine translation has greatly advanced. Neural machine translation (NMT) has recently achieved considerable gains in translation quality, outperforming previous traditional translation systems in many language pairs. NMT applies Artificial Intelligence (AI) and deep neural networks to language processing. Despite this development, serious challenges remain for NMT when translating culture-bound expressions, especially for low-resource language pairs such as Arabic-English and Arabic-French, which is not the case with well-established language pairs such as English-French. Machine translation of opaque idioms from English into French is likely to be more accurate than translation from English into Arabic. For example, the Google Translate application translated the sentence “What a bad weather! It rains cats and dogs.” into Arabic as “يا له من طقس سيء! تمطر القطط والكلاب”, which is an inaccurate literal translation. The translation of the same sentence into French was “Quel mauvais temps! Il pleut des cordes.”, where the Google Translate application used the accurate corresponding French idiom. This paper aims to perform NMT experiments towards better translation of opaque idioms using a high-quality, clean multilingual corpus. This corpus will be collected analytically from human-generated idiom translations. AutoML Translation, a Google neural machine translation platform, is used as a custom translation model to improve the translation of opaque idioms. The automatic evaluation of the custom model will be compared to Google NMT using the Bilingual Evaluation Understudy (BLEU) score. 
BLEU is an algorithm for evaluating the quality of text which has been machine-translated from one natural language to another. Human evaluation is integrated to test the reliability of the BLEU score. The researcher will examine syntactical, lexical, and semantic features using Halliday's functional theory.
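
To make the BLEU comparison concrete, the following is a minimal sketch of a sentence-level BLEU score. It is not the evaluation code used in the study: it assumes a single reference translation, whitespace-tokenized input, and n-grams up to length 2 (the `max_n` default is an assumption for short idioms; corpus-level BLEU is normally computed with up to 4-grams over many sentences).

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram is credited
    at most as many times as it occurs in the reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def sentence_bleu(candidate, reference, max_n=2):
    """Geometric mean of the clipped precisions times a brevity penalty."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    # Candidates shorter than the reference are penalized.
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


reference = "il pleut des cordes".split()
print(sentence_bleu("il pleut des cordes".split(), reference))
print(sentence_bleu("il pleut des chats".split(), reference))
```

A literal translation that misses the idiom shares fewer n-grams with the human reference and therefore scores lower, which is why the abstract pairs BLEU with human evaluation.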

Keywords: multilingual corpora, natural language processing (NLP), neural machine translation (NMT), opaque idioms

Procedia PDF Downloads 107
12312 Investigating the Potential of VR in Language Education: A Study of Cybersickness and Presence Metrics

Authors: Sakib Hasn, Shahid Anwar

Abstract:

This study highlights the vital importance of assessing Simulator Sickness Questionnaire (SSQ) and presence measures as the incorporation of virtual reality (VR) into language teaching gains popularity. To address user discomfort, which prevents efficient learning in VR environments, the measurement of SSQ becomes crucial. Additionally, evaluating presence metrics is essential to determine the level of engagement and immersion, both crucial for rich language learning experiences. This paper designs a VR-based Chinese language application and proposes a thorough test technique aimed at systematically analyzing SSQ and presence measures. Subjective tests and data analysis were carried out to highlight the significance of addressing user discomfort in VR language education. The results of this study shed light on the difficulties posed by user discomfort in VR language learning and offer insightful advice on how to improve VR language learning applications. Furthermore, the research explores ‘VR-based language education’, ‘inclusive language learning platforms’, and ‘cross-cultural communication’, highlighting the potential for VR to facilitate language learning across diverse cultural backgrounds. Overall, the analysis results contribute to the enrichment of language learning experiences in the virtual realm and underscore the need for continued exploration and improvement in this field.

Keywords: virtual reality (VR), language education, simulator sickness questionnaire, presence metrics, VR-based Chinese language education

Procedia PDF Downloads 30
12311 Predicting Personality and Psychological Distress Using Natural Language Processing

Authors: Jihee Jang, Seowon Yoon, Gaeun Son, Minjung Kang, Joon Yeon Choeh, Kee-Hong Choi

Abstract:

Background: Self-report multiple-choice questionnaires have been widely utilized to quantitatively measure one’s personality and psychological constructs. Despite several strengths (e.g., brevity and utility), self-report multiple-choice questionnaires have considerable inherent limitations. With the rise of machine learning (ML) and natural language processing (NLP), researchers in the field of psychology are widely adopting NLP to assess psychological constructs and predict human behaviors. However, there is a lack of connection between the work being performed in computer science and that in psychology due to small data sets and unvalidated modeling practices. Aims: The current article introduces the study method and procedure of phase II, which includes the interview questions for the five-factor model (FFM) of personality developed in phase I. This study aims to develop the semi-structured interview and open-ended questions for the FFM-based personality assessments, specifically designed with experts in the field of clinical and personality psychology (phase I), and to collect the personality-related text data using the interview questions and self-report measures on personality and psychological distress (phase II). The purpose of the study includes examining the relationship between natural language data obtained from the interview questions, measuring the FFM personality constructs, and psychological distress to demonstrate the validity of the natural language-based personality prediction. Methods: The phase I (pilot) study was conducted on fifty-nine native Korean adults to acquire the personality-related text data from the semi-structured interview and open-ended questions based on the FFM of personality. The interview questions were revised and finalized with the feedback from the external expert committee, consisting of personality and clinical psychologists. 
Based on the established interview questions, a total of 425 Korean adults were recruited using a convenience sampling method via an online survey. The text data collected from interviews were analyzed using natural language processing. The results of the online survey, including demographic data, depression, anxiety, and personality inventories, were analyzed together in the model to predict individuals’ FFM of personality and the level of psychological distress (phase II).

Keywords: personality prediction, psychological distress prediction, natural language processing, machine learning, the five-factor model of personality

Procedia PDF Downloads 54
12310 Probing Language Models for Multiple Linguistic Information

Authors: Bowen Ding, Yihao Kuang

Abstract:

In recent years, large-scale pre-trained language models have achieved state-of-the-art performance on a variety of natural language processing tasks. The word vectors produced by these language models can be viewed as dense encoded representations of natural language text. However, it is unknown how much linguistic information is encoded, and how. In this paper, we construct several probing tasks for multiple kinds of linguistic information to clarify the encoding capabilities of different language models, and we visualize the results. We first obtain word representations in vector form from different language models, including BERT, ELMo, RoBERTa and GPT. Classifiers with a small number of parameters and unsupervised tasks are then applied to these word vectors to assess their capability to encode the corresponding linguistic information. The constructed probing tasks cover both semantic and syntactic aspects. The semantic aspect includes the ability of the model to understand semantic entities such as numbers, time, and characters, and the syntactic aspect includes the ability of the language model to understand grammatical structures such as dependency relationships and reference relationships. We also compare the encoding capabilities of different layers in the same language model to infer how linguistic information is encoded in the model.
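
The probing idea described above can be illustrated with a small sketch: the word vectors stay frozen and only a tiny classifier is trained to predict a linguistic property from them. Everything below is an assumption made for illustration, not the authors' setup: the three-dimensional vectors are hand-made stand-ins for real model outputs, the label ("is the token a number?") is a hypothetical probing task, and the probe is a plain logistic regression trained by stochastic gradient descent.

```python
import math


def train_probe(vectors, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression probe on frozen vectors with SGD."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            pred = 1.0 / (1.0 + math.exp(-z))
            error = pred - y  # gradient of the log loss w.r.t. z
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def probe_accuracy(weights, bias, vectors, labels):
    """Fraction of vectors whose predicted label matches the gold label."""
    correct = 0
    for x, y in zip(vectors, labels):
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        correct += int((z > 0) == bool(y))
    return correct / len(labels)


# Hypothetical frozen word vectors; label 1 = token is a number.
toy_vectors = [[0.9, 0.1, 0.3], [0.8, 0.7, 0.2], [0.1, 0.2, 0.9], [0.2, 0.8, 0.4]]
toy_labels = [1, 1, 0, 0]

weights, bias = train_probe(toy_vectors, toy_labels)
print(probe_accuracy(weights, bias, toy_vectors, toy_labels))
```

The sketch scores the probe on its own training data only to stay short; a real probing study would report accuracy on held-out examples and compare it across models and layers.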

Keywords: language models, probing task, text presentation, linguistic information

Procedia PDF Downloads 69
12309 (Re)Calibrating Language Capital among Malay Youths in Singapore

Authors: Mukhlis Abu Bakar

Abstract:

Certain languages are held in higher regard than others given their respective socio-economic and political value, perceived or real. The different positioning of languages manifests in a state’s language-in-education policy, such as Singapore’s which places a premium on English in relation to the mother tongue (MT) languages (Mandarin Chinese, Malay, and Tamil). Among the latter, Mandarin Chinese, as the language of the majority ethnic group, has a more privileged status. The relative positioning of the four official languages shapes Singaporeans’ attitude towards their bilingualism. This paper offers an overview of the attitudes towards English-Malay (EM) bilingualism among Malay youths in Singapore, those who are in school and those already working. It examines how 200 respondents perceive the benefits of their EM bilingualism and their EM bilingual identity. The sample is stratified along gender, socio-economic status, dominant home language and self-rated language proficiency. The online survey comprises questions on the cognitive, communicative, pragmatic and religious benefits of bilingualism, and on language identity. The paper highlights significant trends relating to respondents' positive attitudes towards their EM bilingualism and their bilingual identity. Positive ratings are lowest among young working adults. EM bilinguals also perceive their bilingualism as less useful than English-Chinese bilingualism. These findings are framed within Bourdieu’s metaphor of field and habitus in order to understand why Malay youths make their language choices and why they recalibrate their linguistic capital upon entering the workforce, and in so doing understand the impact a state’s language-in-education policy has on its citizens’ attitude towards their respective English-MT bilingualism.

Keywords: English-Malay bilingualism, language attitude, language identity, recalibrating capital

Procedia PDF Downloads 122
12308 Theorising Chinese as a Foreign Language Curriculum Justice in the Australian School Context

Authors: Wen Xu

Abstract:

The expansion of Confucius institutes and Chinese as a Foreign Language (CFL) education is often considered as cultural invasion and part of much bigger, if not ambitious, Chinese central government agenda among Western public opinion. The CFL knowledge and teaching practice inherent in textbooks are also harshly critiqued as failing to align with Western educational principles. This paper takes up these concerns and attempts to articulate that Confucius’s idea of ‘education without discrimination’ appears to have become synonymous with social justice touted in contemporary Australian education and policy discourses. To do so, it capitalises on Bernstein's conceptualization of classification and pedagogic rights to articulate CFL curriculum's potential of drawing in and drawing out curriculum boundaries to achieve educational justice. In this way, the potential useful knowledge of CFL constitutes a worthwhile tool to engage in a peripheral Western country’s education issues, as well as to include disenfranchised students in the multicultural Australian society. It opens spaces for critically theorising CFL curricular justice in Australian educational contexts, and makes an original contribution to scholarly argumentation that CFL curriculum has the potential of including socially and economically disenfranchised students in schooling.

Keywords: curriculum justice, Chinese as a Foreign Language curriculum, Bernstein, equity

Procedia PDF Downloads 113
12307 Valence and Arousal-Based Sentiment Analysis: A Comparative Study

Authors: Usama Shahid, Muhammad Zunnurain Hussain

Abstract:

This research paper presents a comprehensive analysis of a sentiment analysis approach that employs valence and arousal as its foundational pillars, in comparison to traditional techniques. Sentiment analysis is an indispensable task in natural language processing that involves the extraction of opinions and emotions from textual data. The valence and arousal dimensions, representing the positivity/negativity and intensity of emotions, respectively, enable the creation of four quadrants, each representing a specific emotional state. The study seeks to determine the impact of utilizing these quadrants to identify distinct emotional states on the accuracy and efficiency of sentiment analysis, in comparison to traditional techniques. The results reveal that the valence and arousal-based approach outperforms other approaches, particularly in identifying nuanced emotions that may be missed by conventional methods. The study's findings are crucial for applications such as social media monitoring and market research, where the accurate classification of emotions and opinions is paramount. Overall, this research highlights the potential of using valence and arousal as a framework for sentiment analysis and offers invaluable insights into the benefits of incorporating specific types of emotions into the analysis. These findings have significant implications for researchers and practitioners in the field of natural language processing, as they provide a basis for the development of more accurate and effective sentiment analysis tools.
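
The four quadrants formed by the two dimensions can be sketched as a simple mapping. This is only an illustration under stated assumptions: scores are taken to lie in [-1, 1], zero is treated as the positive or high side, and the quadrant names and example emotions are hypothetical labels rather than the ones used in the study.

```python
def emotion_quadrant(valence, arousal):
    """Map a (valence, arousal) pair to one of four quadrants.

    valence: negative (-1) to positive (+1)
    arousal: calm (-1) to intense (+1)
    """
    if valence >= 0 and arousal >= 0:
        return "positive / high arousal"   # e.g. excited
    if valence >= 0:
        return "positive / low arousal"    # e.g. content
    if arousal >= 0:
        return "negative / high arousal"   # e.g. angry
    return "negative / low arousal"        # e.g. sad


print(emotion_quadrant(0.8, 0.7))
print(emotion_quadrant(-0.6, -0.4))
```

A polarity-only classifier would give the same label to an angry and a sad text; the second dimension is what separates them.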

Keywords: sentiment analysis, valence and arousal, emotional states, natural language processing, machine learning, text analysis, sentiment classification, opinion mining

Procedia PDF Downloads 55
12306 Networking Approach for Historic Urban Landscape: Case Study of the Porcelain Capital of China

Authors: Ding He, Ping Hu

Abstract:

This article presents a “networking approach” as an alternative to the “layering model” in the issue of the historic urban landscape (HUL), based on research conducted in the historic city of Jingdezhen, the center of the porcelain industry in China. This study points out that the existing HUL concept, which can be traced back to the fundamental conceptual divisions set forth by western science, tends to analyze the various elements of urban heritage (composed of hybrid natural-cultural elements) by layers and ignore the nuanced connections and interweaving structure of various elements. Instead, the networking analysis approach can respond to the challenges of complex heritage networks and to the difficulties that are often faced when modern schemes of looking and thinking of landscape in the Eurocentric heritage model encounter local knowledge of Chinese settlement. The fieldwork in this paper examines the local language regarding place names and everyday uses of urban spaces, thereby highlighting heritage systems grounded in local life and indigenous knowledge. In the context of Chinese “Fengshui”, this paper demonstrates the local knowledge of nature and local intelligence of settlement location and design. This paper suggests that industrial elements (kilns, molding rooms, piers, etc.) and spiritual elements (temples for ceramic saints or water gods) are located in their intimate natural networks. Furthermore, the functional, spiritual, and natural elements are perceived as a whole and evolve as an interactive system. This paper proposes a local and cognitive approach in heritage, which was initially developed in the European Landscape Convention and historic landscape characterization projects, and yet seeks a more tentative and nuanced model based on urban ethnography in a Chinese city.

Keywords: Chinese city, historic urban landscape, heritage conservation, network

Procedia PDF Downloads 114
12305 Learning Grammars for Detection of Disaster-Related Micro Events

Authors: Josef Steinberger, Vanni Zavarella, Hristo Tanev

Abstract:

Natural disasters cause tens of thousands of victims and massive material damage. We refer to all those events caused by natural disasters, such as damage to people, infrastructure, vehicles, services and resource supply, as micro events. This paper addresses the problem of micro-event detection in online media sources. We present a natural language grammar learning algorithm and apply it to online news. The algorithm in question is based on distributional clustering and detection of word collocations. We also explore the extraction of micro-events from social media and describe a Twitter mining robot, which uses combinations of keywords to detect tweets that talk about the effects of disasters.
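
The keyword-combination filtering mentioned for the Twitter mining robot could look roughly like the sketch below. It is an assumption-laden illustration, not the authors' system: the keyword sets are invented examples, and a tweet is flagged when it contains every keyword of at least one combination.

```python
import re

# Hypothetical keyword combinations; each set must be fully present in a tweet.
KEYWORD_COMBINATIONS = [
    {"flood", "damaged"},
    {"earthquake", "collapsed"},
    {"storm", "power", "outage"},
]


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def matches_micro_event(tweet, combinations=KEYWORD_COMBINATIONS):
    """Return True if the tweet contains all keywords of any combination."""
    tokens = tokenize(tweet)
    return any(combo <= tokens for combo in combinations)


print(matches_micro_event("Flood damaged the main bridge downtown"))
print(matches_micro_event("Storm warning issued for the coast"))
```

Requiring several keywords together, rather than any single one, filters out tweets that mention a disaster without describing one of its effects.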

Keywords: online news, natural language processing, machine learning, event extraction, crisis computing, disaster effects, Twitter

Procedia PDF Downloads 455
12304 The Meaning System of Tense: A Systemic Functional Approach

Authors: Cunyu Zhang

Abstract:

A review of the literature on tense shows that there are disagreements on the definition and existence of Chinese tense. Influenced by research on the English language which regards tense as a grammatical category based on the verbal inflections of English, some Chinese researchers claim that there is no tense in the Chinese language as there are no verbal inflections involved. Meanwhile, other Chinese researchers hold that Chinese still has tense although its verbs are non-inflectional, based on the fact that Chinese lexical expressions can imply temporal meaning. We assume that the reasons for the above disagreements in terms of Chinese tense lie in the fact that all the previous studies prefer to view language “from below”, which means expressions of tense are the core part of these studies. However, there are about 6,000 languages with distinct expressions all over the world. Hence, if language studies concentrate only on expressions, it becomes more difficult to understand the nature of language. By contrast, functions of languages are similar; otherwise, human beings could not communicate with each other. Therefore, we believe that it is necessary to have a theoretical study on Chinese tense within the framework of Systemic Functional Linguistics (SFL), which holds that language is a system where meaning is the core part while form is just the realization of meaning. In addition, SFL is a general linguistic theory providing a universal framework for languages all over the world. Therefore, based on SFL, the paper first redefines tense as a deictic semantic category for describing the speaker’s temporal location of processes and relevant temporal relations. With reference to this definition, this study explores the meaning system of tense. It is proposed that tense expresses four kinds of meaning, namely interpersonal, experiential, logical and textual meanings. 
From the interpersonal angle, tense helps to exchange temporal information between the speaker and the listener, and the temporal information refers to the anchoring of a concerned process in the past, present or future by the speaker. From the experiential angle, tense plays a role in the temporal locating of material, mental, relational, existential, behavioral and verbal processes by the speaker. From the logical angle, tense denotes the temporal relations at the two levels of clause and clause complex, and such relations fall into simultaneity, anteriority and posteriority. From the textual angle, tense refers to the temporal relations at the level of text, and the temporal relations in question concern linear serial relations and synchronous serial relations.

Keywords: Chinese, meaning system, Systemic Functional Linguistics, tense

Procedia PDF Downloads 389
12303 Working Memory and Phonological Short-Term Memory in the Acquisition of Academic Formulaic Language

Authors: Zhicheng Han

Abstract:

This study examines the correlation between knowledge of formulaic language, working memory (WM), and phonological short-term memory (PSTM) in Chinese L2 learners of English. This study investigates if WM and PSTM correlate differently to the acquisition of formulaic language, which may be relevant for the discourse around the conceptualization of formulas. Connectionist approaches have led scholars to argue that formulas are form-meaning connections stored whole, making PSTM significant in the acquisitional process as it pertains to the storage and retrieval of chunk information. Generativist scholars, on the other hand, argued for active participation of interlanguage grammar in the acquisition and use of formulaic language, where formulas are represented in the mind but retain the internal structure built around a lexical core. This would make WM, especially the processing component of WM, an important cognitive factor since it plays a role in processing and holding information for further analysis and manipulation. The current study asked L1 Chinese learners of English enrolled in graduate programs in China to complete a preference ranking task where they rank their preference for formulas, grammatical non-formulaic expressions, and ungrammatical phrases with and without the lexical core in academic contexts. Participants were asked to rank the options in order of the likelihood of encountering these phrases in the test sentences within academic contexts. Participants’ syntactic proficiency was controlled with a cloze test and grammar test. Regression analysis found a significant relationship between the processing component of WM and preference for formulaic expressions in the preference ranking task while no significant correlation was found for PSTM or syntactic proficiency. The correlational analysis found that WM, PSTM, and the two proficiency test scores have significant covariates. 
However, WM and PSTM have different predictor values for participants’ preference for formulaic language. Both storage and processing components of WM are significantly correlated with the preference for formulaic expressions while PSTM is not. These findings are in favor of the role of interlanguage grammar and syntactic knowledge in the acquisition of formulaic expressions. The differing effects of WM and PSTM suggest that selective attention to and processing of the input beyond simple retention play a key role in successfully acquiring formulaic language. Similar correlational patterns were found for preferring the ungrammatical phrase with the lexical core of the formula over the ones without the lexical core, attesting to learners’ awareness of the lexical core around which formulas are constructed. These findings support the view that formulaic phrases retain internal syntactic structures that are recognized and processed by the learners.

Keywords: formulaic language, working memory, phonological short-term memory, academic language

Procedia PDF Downloads 24
12302 Development of Fake News Model Using Machine Learning through Natural Language Processing

Authors: Sajjad Ahmed, Knut Hinkelmann, Flavio Corradini

Abstract:

Fake news detection research is still at an early stage, as fake news is a relatively new phenomenon of public concern. Machine learning helps to solve complex problems and to build AI systems, especially in cases where the knowledge is tacit or not known. To identify fake news, we applied three machine learning classifiers: Passive Aggressive, Naïve Bayes, and Support Vector Machine. Simple classification is not completely correct in fake news detection because classification methods are not specialized for fake news. With the integration of machine learning and text-based processing, we can detect fake news and build classifiers that can classify the news data. Text classification mainly focuses on extracting various features of text and then incorporating those features into classification. The big challenge in this area is the lack of an efficient way to differentiate between fake and non-fake news due to the unavailability of corpora. We applied the three classifiers to two publicly available datasets. Experimental analysis based on the existing datasets indicates very encouraging and improved performance.
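
As an illustration of one of the three classifier families named in this abstract, the following is a minimal sketch of the Passive Aggressive (PA-I) online update on bag-of-words features, written in pure Python. It is not the authors' implementation: the headlines, labels, the aggressiveness constant `C`, and all function and class names are hypothetical and chosen only for the example.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words counts over lowercase whitespace tokens (assumed feature set)."""
    return Counter(text.lower().split())


class PassiveAggressive:
    """Binary PA-I classifier; labels are +1 (fake) and -1 (not fake)."""

    def __init__(self, C=1.0):
        self.C = C          # cap on the step size (hypothetical value)
        self.w = {}         # sparse weight vector: token -> weight

    def score(self, x):
        return sum(self.w.get(tok, 0.0) * val for tok, val in x.items())

    def predict(self, x):
        return 1 if self.score(x) >= 0 else -1

    def fit(self, data, epochs=5):
        for _ in range(epochs):
            for x, y in data:
                # Hinge loss: zero when the example is on the right side
                # of the margin ("passive"), positive otherwise ("aggressive").
                loss = max(0.0, 1.0 - y * self.score(x))
                norm_sq = sum(val * val for val in x.values())
                if loss > 0.0 and norm_sq > 0.0:
                    tau = min(self.C, loss / norm_sq)
                    for tok, val in x.items():
                        self.w[tok] = self.w.get(tok, 0.0) + tau * y * val
        return self


# Hypothetical toy data, not taken from the datasets used in the paper.
TOY_DATA = [
    (featurize("shocking miracle cure doctors hate"), 1),
    (featurize("you won't believe this secret trick"), 1),
    (featurize("city council approves annual budget"), -1),
    (featurize("central bank holds interest rates steady"), -1),
]

model = PassiveAggressive(C=1.0).fit(TOY_DATA)
print(model.predict(featurize("shocking secret cure")))
print(model.predict(featurize("council holds budget")))
```

The update only changes the weights when an example violates the margin, which is why the method suits streaming news data; a realistic system would replace the whitespace tokenizer with proper feature extraction such as TF-IDF.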

Keywords: fake news detection, natural language processing, machine learning, classification techniques

Procedia PDF Downloads 130
12301 Another Justice: Litigation Masters in Chinese Legal Story

Authors: Lung-Lung Hu

Abstract:

Ronald Dworkin offered a legal theory of the ‘chain enterprise’, in which all the judges in legal history together create a ‘law’ aimed at a specific purpose. Those judges are like co-writers of a chain story who not only create freely but are also constrained by the story written by the judges before them. The law created by traditional Chinese judges is a different case: compared with the judges Dworkin describes, they have relatively little room to pass sentence at their own discretion, because the statutes of traditional Chinese law were designed from the very beginning as a penal code that leaves little room for a judge’s discretion. Furthermore, because law represents the authority of the government, i.e. the emperor, any misjudgment or misuse that deviates from the law is considered a challenge to the supreme power. However, unlike judges, who are defenders of the law, Chinese litigation masters who want to win legal cases have to be offenders who challenge a verdict that does not favor their own or their client’s interest. Besides, litigation master was an illegal or unauthorized profession that did not belong to any legal system; litigation masters were therefore relatively freer to ‘create’ the law. In his articles questioning Ronald Dworkin’s and Owen Fiss’s ideas about law, Stanley Fish argues that, since law is made of language, law is open to interpretations that cannot be constrained by any rules or any particular legal purposes. Fish’s idea can also be applied to the analysis of the stories of Chinese litigation masters in traditional Chinese literature. These Chinese litigation masters’ legal opinions in the so-called chain enterprise are like an unexpected episode that tries to revise the fixed story told by law. 
Although they are welcomed neither by officials nor by society, their existence is still a phenomenon representing another version of justice, different from the official one, and can be seen as a de-structural power in relation to the government. Hence, in the present paper the language and strategies applied by Chinese litigation masters in Chinese legal stories are analysed to see how they refute legal judgments already made and challenge the official standard of justice.

Keywords: Chinese legal stories, interdisciplinary, litigation master, post-structuralism

Procedia PDF Downloads 358
12300 Identifying and Understand Pragmatic Failures in Portuguese Foreign Language by Chinese Learners in Macau

Authors: Carla Lopes

Abstract:

It is clear nowadays that the proper performance of different speech acts is one of the most difficult obstacles that a foreign language learner has to overcome to be considered communicatively competent. This communication presents the results of an investigation into the pragmatic performance of Portuguese Language students at the University of Macau. The research discussed herein is based on a survey consisting of fourteen speaking situations to which the participants must respond in writing, covering different types of speech acts: apology, response to a compliment, refusal, complaint, disagreement and the understanding of the illocutionary force of indirect speech acts. The responses were classified on a five-level Likert scale (quantified from 1 to 5) according to their suitability for the particular situation. In general terms, about 45% of the respondents' answers were pragmatically competent, 10% were acceptable and 45% showed weaknesses at the level of socio-pragmatic competence. Given that linguistic deviations were not taken into account, we can conclude that the faults are of cultural origin. It is natural that, in the presence of orthogonal cultures such as Chinese and Portuguese, there are failures of this type, which are barely resolved in the four years of the undergraduate program. The target population, native speakers of Cantonese or Mandarin, make their first contact with the English language before joining the Bachelor of Portuguese Language. An analysis of the socio-pragmatic failures in the respondents’ answers suggests that many of them are due to a lack of cultural knowledge. The respondents try to compensate for this either by using their native culture or by resorting to a Western culture that they consider close to the Portuguese one, namely English or US culture, which they have previously studied and which is also widely present in the media and on the internet. 
This phenomenon, known as 'pragmatic transfer', can result in linguistic behavior that may be considered inauthentic or pragmatically awkward. The resulting speech act is grammatically correct but not pragmatically feasible, since it is not suited to the culture of the target language, either because it does not exist there or because the conditions of its use are in fact different. Analysis of the responses also supports the conclusion that these students deviate considerably from the expected and stereotyped behavior of Chinese students. We can speculate that this linguistic behavior is a consequence of the globalization of Macao, which shapes the students culturally, makes them more open, and distinguishes them from typical Chinese students.

Keywords: Portuguese foreign language, pragmatic failures, pragmatic transfer, pragmatic competence

Procedia PDF Downloads 189
12299 Semantic Textual Similarity on Contracts: Exploring Multiple Negative Ranking Losses for Sentence Transformers

Authors: Yogendra Sisodia

Abstract:

Researchers are becoming more interested in extracting useful information from legal documents thanks to the development of large-scale language models in natural language processing (NLP), and deep learning has accelerated the creation of powerful text mining models. Legal fields like contracts benefit greatly from semantic text search since it makes it quick and easy to find related clauses. Once sentence embeddings have been computed, it is relatively simple to locate sentences with a comparable meaning throughout the entire legal corpus. The author of this research investigated two pre-trained language models for this task, MiniLM and RoBERTa, and further fine-tuned them on legal contracts. The author used Multiple Negative Ranking Loss to train the sentence transformers. The fine-tuned language models and sentence transformers showed promising results.
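
To make the loss named in this abstract concrete, here is a minimal pure-Python sketch of multiple negatives ranking loss as it is commonly formulated: each anchor sentence is scored against every positive in the batch by scaled cosine similarity, and cross-entropy is applied with the anchor's own positive as the target, so the other positives act as in-batch negatives. The two-dimensional embeddings and the `scale` value of 20 are assumptions for illustration; the abstract does not state the author's configuration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and every positives[j] with j != i serves as an in-batch negative."""
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [scale * cosine(anchors[i], positives[j]) for j in range(n)]
        # Numerically stable log-sum-exp over the row of logits.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]
    return total / n


# Hypothetical 2-d "embeddings" of two clause pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]      # each anchor matches its own positive
mismatched = [[0.0, 1.0], [1.0, 0.0]]   # positives swapped

print(multiple_negatives_ranking_loss(anchors, aligned))
print(multiple_negatives_ranking_loss(anchors, mismatched))
```

The loss is close to zero when each anchor is most similar to its own positive and grows when another item in the batch is more similar, which is what pushes related clauses together in embedding space during fine-tuning.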

Keywords: legal contracts, multiple negative ranking loss, natural language inference, sentence transformers, semantic textual similarity

Procedia PDF Downloads 68
12298 Morphological Processing of Punjabi Text for Sentiment Analysis of Farmer Suicides

Authors: Jaspreet Singh, Gurvinder Singh, Prabhsimran Singh, Rajinder Singh, Prithvipal Singh, Karanjeet Singh Kahlon, Ravinder Singh Sawhney

Abstract:

Morphological evaluation of Indian languages is one of the burgeoning fields in the area of Natural Language Processing (NLP). The evaluation of a language is an eminent task in the era of information retrieval and text mining. The extraction and classification of knowledge from text can be exploited for sentiment analysis and morphological evaluation. This study combines morphological evaluation and sentiment analysis for the task of classifying farmer suicide cases reported in the Punjab state of India. The pre-processing of Punjabi text involves morphological evaluation and normalization of Punjabi word tokens, followed by the training of the proposed model using deep learning classification on Punjabi language text extracted from online Punjabi news reports. The class-wise accuracies of sentiment prediction for four negatively oriented classes of farmer suicide cases are 93.85%, 88.53%, 83.3%, and 95.45% respectively. The overall accuracy of sentiment classification obtained using the proposed framework on 275 Punjabi text documents is found to be 90.29%.

Keywords: deep neural network, farmer suicides, morphological processing, Punjabi text, sentiment analysis

Procedia PDF Downloads 287
12297 Digital Development of Cultural Heritage: Construction of Traditional Chinese Pattern Database

Authors: Shaojian Li

Abstract:

The traditional Chinese patterns, as an integral part of Chinese culture, possess unique values in history, culture, and art. However, with the passage of time and societal changes, many of these traditional patterns are at risk of being lost, damaged, or forgotten. To undertake the digital preservation and protection of these traditional patterns, this paper will collect and organize images of traditional Chinese patterns. It will provide exhaustive and comprehensive semantic annotations, creating a resource library of traditional Chinese pattern images. This will support the digital preservation and application of traditional Chinese patterns.

Keywords: digitization of cultural heritage, traditional Chinese patterns, digital humanities, database construction

Procedia PDF Downloads 25
12296 Effects of Global Validity of Predictive Cues upon L2 Discourse Comprehension: Evidence from Self-paced Reading

Authors: Binger Lu

Abstract:

It remains unclear whether second language (L2) speakers could use discourse context cues to predict upcoming information as native speakers do during online comprehension. Some researchers propose that L2 learners may have a reduced ability to generate predictions during discourse processing. At the same time, there is evidence that discourse-level cues are weighed more heavily in L2 processing than in L1. Previous studies showed that L1 prediction is sensitive to the global validity of predictive cues. The current study aims to explore whether and to what extent L2 learners can dynamically and strategically adjust their prediction in accord with the global validity of predictive cues in L2 discourse comprehension as native speakers do. In a self-paced reading experiment, Chinese native speakers (N=128), C-E bilinguals (N=128), and English native speakers (N=128) read high-predictable (e.g., Jimmy felt thirsty after running. He wanted to get some water from the refrigerator.) and low-predictable (e.g., Jimmy felt sick this morning. He wanted to get some water from the refrigerator.) discourses in two-sentence frames. The global validity of predictive cues was manipulated by varying the ratio of predictable (e.g., Bill stood at the door. He opened it with the key.) and unpredictable fillers (e.g., Bill stood at the door. He opened it with the card.), such that across conditions, the predictability of the final word of the fillers ranged from 100% to 0%. The dependent variable was reading time on the critical region (the target word and the following word), analyzed with linear mixed-effects models in R. 
C-E bilinguals showed reliable prediction across all validity conditions (β = -35.6 ms, SE = 7.74, t = -4.601, p < .001), and Chinese native speakers showed a significant effect (β = -93.5 ms, SE = 7.82, t = -11.956, p < .001) in two of the four validity conditions (namely, the High-validity and MedLow conditions, where fillers ended with predictable words in 100% and 25% of cases respectively), whereas English native speakers did not predict at all (β = -2.78 ms, SE = 7.60, t = -.365, p = .715). There was neither a main effect of Validity (χ²(3) = .256, p = .968) nor any interaction involving Validity (Predictability: Background: Validity, χ²(3) = 1.229, p = .746; Predictability: Validity, χ²(3) = 2.520, p = .472; Background: Validity, χ²(3) = 1.281, p = .734). The results suggest that prediction occurs in L2 discourse processing but to a much lesser extent in L1, with a significant effect in some conditions of L1 Chinese and a null effect in L1 English processing, consistent with the view that L2 speakers are more sensitive to discourse cues than L1 speakers. Additionally, the pattern of L1 and L2 predictive processing was not affected by the global validity of predictive cues. C-E bilinguals’ predictive processing could be partly transferred from their L1, as prior research showed that discourse information played a more significant role in L1 Chinese processing.
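
The reported test statistics can be related to the estimates by simple arithmetic: each t value is the coefficient divided by its standard error. The sketch below recomputes them from the rounded figures in the abstract and derives a two-sided p value under a normal approximation. The normal approximation is an assumption made here for illustration; the abstract does not say how degrees of freedom were handled in the mixed-effects models, and small differences from the printed values reflect rounding of β and SE.

```python
import math


def wald_t(beta, se):
    """Test statistic for a coefficient: estimate divided by standard error."""
    return beta / se


def two_sided_p_normal(t):
    """Two-sided p value treating t as standard normal (an approximation)."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# (beta in ms, SE) as printed in the abstract.
REPORTED = {
    "C-E bilinguals": (-35.6, 7.74),
    "Chinese native speakers": (-93.5, 7.82),
    "English native speakers": (-2.78, 7.60),
}

for group, (beta, se) in REPORTED.items():
    t = wald_t(beta, se)
    print(f"{group}: t = {t:.3f}, approximate p = {two_sided_p_normal(t):.3g}")
```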

Keywords: bilingualism, discourse processing, global validity, prediction, self-paced reading

Procedia PDF Downloads 110
12295 Changes of First-Person Pronoun Pragmatic Functions in Three Historical Chinese Texts

Authors: Cher Leng Lee

Abstract:

The existence of multiple first-person pronouns (1PPs) in classical Chinese is an issue that has not been resolved despite linguists using the grammatical perspective. This paper proposes pragmatics as a viable solution. There is also a lack of research exploring the evolving usage patterns of 1PPs within the historical context of Chinese language use. Such research can help us comprehend the changes and developments of these linguistic elements. To fill these research gaps, we use the diachronic pragmatics approach to contrast the functions of Chinese 1PPs in three representative texts from three different historical periods: The Analects (The Spring and Autumn Period), The Grand Scribe’s Records (Grand Records) (Qin and Han Period), and A New Account of Tales of the World (New Account) (The Wei, Jin and Southern and Northern Period). The 1PPs of these texts are manually identified and classified according to the pragmatic functions in the given contexts to observe their historical changes, understand the factors that contribute to these changes, and provide possible answers to the development of how wo became the only 1PP in today’s spoken Mandarin.

Keywords: historical, Chinese, pronouns, pragmatics

Procedia PDF Downloads 21
12294 The Sapir-Whorf Hypothesis and Multicultural Effects on Translators: A Case Study from Chinese Ethnic Minority Literature

Authors: Yuqiao Zhou

Abstract:

The Sapir-Whorf hypothesis (SWH) emphasizes the effect produced by language on people’s minds. According to linguistic relativity, language has evolved over the course of human life on earth, and, in turn, the acquisition of language shapes learners’ thoughts. Despite much attention drawn by SWH, few scholars have attempted to analyse people’s thoughts via their literary works. And yet, the linguistic choices that create a narrative can enable us to examine its writer’s thoughts. Still, less work has been done on the impact of language on the minds of bilingual people. Internationalization has resulted in an increasing number of bilingual and multilingual individuals. In China, where more than one hundred languages are used for communication, most people are bilingual in Mandarin Chinese (the official language of China) and their own dialect. Taking as its corpus the ethnic minority myth of Ge Sa-er Wang by Alai and its English translation by Goldblatt and Lin, this paper aims to analyse the effects of culture on bilingual people’s minds. It will first analyse Alai’s thoughts on using the original version of Ge Sa-er Wang; next, it will examine the thoughts of the two translators by looking at translation choices made in the English version; finally, it will compare the cultural influences evident in the thoughts of Alai, and Goldblatt and Lin. Whereas Alai can speak two Sino-Tibetan languages – Mandarin Chinese and Tibetan – Goldblatt and Lin can speak two languages from different families – Mandarin Chinese (a Sino-Tibetan language) and English (an Indo-European language). The results reveal two systems of thought existing in the translators’ minds; Alai’s text, on the other hand, does not reveal a significant influence from North China, where Mandarin Chinese originated. The findings reveal the inconsistency of a second language’s influence on people’s minds. 
Notably, they suggest that the more different the two languages are, the greater the influence produced by the second language culture on people’s thoughts. It is hoped that this research will expand the scope of SWH as well as shed light on future translation studies on ethnic minority literature.

Keywords: Sapir-Whorf hypothesis, cultural translation, cultural-specific items, Ge Sa-er Wang, ethnic minority literature, Tibet

Procedia PDF Downloads 60
12293 Linguistic Symbols Principle Construction in Cultural Creative Product Design

Authors: Pei-Jun Xue, Ming-Yu Hsiao

Abstract:

Language is the emblem of a culture, representing the extension of cultural life. It is also an important tool for communication and message transmission. It carries not only information but also the self-consciousness of the information constructor as well as the situational experiences of users from different backgrounds. Moreover, design can be regarded as a language, a dynamic process of coding and decoding. Designers bring their experiences of everyday life into the experience of their products. Considering aspects such as atmosphere and the five senses, a designer should consider and reconsider how to communicate messages effectively to suit users’ needs. In the process of language learning, we should understand the construction behind the language and the rules by which language codes are composed. Likewise, to understand the design of works or the form of product construction, it is necessary to understand the coding system used during product construction. The forms (signifiers) and meanings (signified) of Chinese characters are closely related. At the same time, it is also a process of simplifying the complicated into the simple. This study discusses the Chinese characters used in the construction of cultural symbols and analyzes existing products using Peirce's semiotic triangle. Examining how people perceive Chinese characters and how the characters are composed helps to clarify how product symbols are constructed.

Keywords: cultural-creative product design, cultural product, cultural symbols, linguistic symbols

Procedia PDF Downloads 423
12292 The Output Fallacy: An Investigation into Input, Noticing, and Learners’ Mechanisms

Authors: Samantha Rix

Abstract:

The purpose of this research paper is to investigate the cognitive processing of learners who receive input but produce very little or no output, and who, when they do produce output, exhibit language proficiency similar to that of learners who produce output more regularly in the language classroom. Previous studies have investigated the benefits of output (with somewhat differing results); therefore, the presentation will begin with an investigation of what may underlie gains in proficiency without output. A pilot study was designed and conducted to gain insight into the cognitive processing of low-output language learners, looking, for example, at the quantity and quality of noticing. It was carried out within the paradigm of classroom action research, observing and interviewing low-output language learners in an intensive English program at a small Midwest university. The results of the pilot study indicated that autonomy in language learning, specifically the use of strategies such as self-monitoring, self-talk, and thinking 'out loud', was crucial in the development of language proficiency for academic-level performance. The presentation concludes with an examination of pedagogical implications for classroom use in order to aid students in their language development.

Keywords: cognitive processing, language learners, language proficiency, learning strategies

Procedia PDF Downloads 441
12291 Composite Kernels for Public Emotion Recognition from Twitter

Authors: Chien-Hung Chen, Yan-Chun Hsing, Yung-Chun Chang

Abstract:

The Internet has grown into a powerful medium for information dispersion and social interaction, leading to the rapid growth of social media, which allows users to easily post their emotions and perspectives on particular topics online. Our research aims to use natural language processing and text mining techniques to explore the public emotions expressed on Twitter by analyzing the sentiment behind tweets. In this paper, we propose a composite kernel method that integrates a tree kernel with a linear kernel to simultaneously exploit both the tree representation and the distributed emotion keyword representation, so as to analyze the syntactic and content information in tweets. The experimental results demonstrate that our method can effectively detect the public emotion of tweets while outperforming the other compared methods.
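
The idea of a composite kernel can be sketched as a weighted sum of a structural kernel and a linear kernel; a non-negative weighted sum of valid kernels is itself a valid kernel. In the illustration below, the structural part is a deliberately simplified stand-in that counts shared grammar productions between two parse trees, not the tree kernel used in the paper, and the parse trees, keyword vectors, and mixing weight `alpha` are all hypothetical.

```python
from collections import Counter


def productions(tree):
    """Count productions (parent label -> child labels) in a nested-tuple tree.
    Leaves are plain strings; internal nodes are (label, child, child, ...)."""
    counts = Counter()

    def visit(node):
        if isinstance(node, str):
            return node
        label, *children = node
        child_labels = tuple(visit(child) for child in children)
        counts[(label, child_labels)] += 1
        return label

    visit(tree)
    return counts


def tree_kernel(t1, t2):
    """Simplified structural kernel: dot product of production counts."""
    p1, p2 = productions(t1), productions(t2)
    return float(sum(p1[key] * p2[key] for key in p1))


def linear_kernel(x1, x2):
    """Dot product of sparse keyword vectors (dicts of keyword -> weight)."""
    return sum(val * x2.get(key, 0.0) for key, val in x1.items())


def composite_kernel(a, b, alpha=0.5):
    """Weighted sum of the structural and keyword kernels, 0 <= alpha <= 1."""
    (tree_a, keys_a), (tree_b, keys_b) = a, b
    return alpha * tree_kernel(tree_a, tree_b) + (1 - alpha) * linear_kernel(keys_a, keys_b)


# Hypothetical tweets: a parse tree plus an emotion keyword vector each.
tweet_love = (("S", ("NP", "I"), ("VP", ("V", "love"), ("NP", "it"))), {"love": 1.0})
tweet_hate = (("S", ("NP", "I"), ("VP", ("V", "hate"), ("NP", "it"))), {"hate": 1.0})

print(composite_kernel(tweet_love, tweet_hate))
print(composite_kernel(tweet_love, tweet_love))
```

Here the two tweets share most of their syntactic structure but none of their emotion keywords, so the structural term contributes to their similarity while the keyword term does not; the weight `alpha` controls how much each view counts.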

Keywords: emotion recognition, natural language processing, composite kernel, sentiment analysis, text mining

Procedia PDF Downloads 193
12290 A Corpus-Based Contrastive Analysis of Directive Speech Act Verbs in English and Chinese Legal Texts

Authors: Wujian Han

Abstract:

In the process of human interaction and communication, speech act verbs are considered to be the most active component and the main means of information transmission, and are also taken as an indication of the structure of linguistic behavior. The theoretical value and practical significance of such everyday built-in metalanguage have long been recognized. This paper, which is part of a larger study, aims to provide useful insights for a more precise and systematic approach to the translation of speech act verbs between English and Chinese, especially with regard to the degree to which generic integrity is maintained in the translation of legal documents. In this study, the corpus, i.e. Chinese legal texts and their English translations, English legal texts, ordinary Chinese texts, and ordinary English texts, serves as a testing ground for contrastively examining the usage of English and Chinese directive speech act verbs in the legal genre. The scope of this paper is relatively wide and essentially covers all directive speech act verbs used in ordinary English and Chinese, such as order, command, request, prohibit, threaten, advise, warn and permit. By combining corpus methodology with a contrastive perspective, the researcher explored a range of characteristics of English and Chinese directive speech act verbs, including their semantic, syntactic and pragmatic features, and then contrasted them in a structured way. There are similarities between English and Chinese directive speech act verbs in the legal genre, such as similar semantic components between English speech act verbs and their translation equivalents in Chinese, and formal and accurate usage of directive speech act verbs in legal contexts. However, notable differences were identified between their usage in the original Chinese and English legal texts, in areas such as valency patterns and frequency of occurrence. 
For example, the subjects of some directive speech act verbs are very frequently omitted in Chinese legal texts, but this is not the case in English legal texts. One of the practicable methods to achieve adequacy and conciseness in speech act verb translation from Chinese into English in legal genre is to repeat the subjects or the message with discrepancy, and vice versa. In addition, translation effects such as overuse and underuse of certain directive speech act verbs are also found in the translated English texts compared to the original English texts. Legal texts constitute a particularly valuable material for speech act verb study. Building up such a contrastive picture of the Chinese and English speech act verbs in legal language would yield results of value and interest to legal translators and students of language for legal purposes and have practical application to legal translation between English and Chinese.

Keywords: contrastive analysis, corpus-based, directive speech act verbs, legal texts, translation between English and Chinese

Procedia PDF Downloads 447
12289 A Comparative Analysis of Hyper-Parameters Using Neural Networks for E-Mail Spam Detection

Authors: Syed Mahbubuz Zaman, A. B. M. Abrar Haque, Mehedi Hassan Nayeem, Misbah Uddin Sagor

Abstract:

Every day, e-mail is used by millions of people as an effective form of communication over the Internet. Although e-mail allows high-speed communication, it faces a constant threat known as spam. Spam e-mails, often called junk e-mails, are unsolicited and sent in bulk. These unsolicited e-mails raise security concerns among internet users because they expose users to inappropriate content. Static filters offer no guaranteed way to stop spammers, as they are bypassed very easily. In this paper, a smart system is proposed that uses neural networks to approach spam in a different way and also detects the most relevant features to help design the spam filter. In addition, different parameters for different neural network models are compared to determine which model works best within suitable parameters.

Keywords: long short-term memory, bidirectional long short-term memory, gated recurrent unit, natural language processing

Procedia PDF Downloads 178