Search results for: conversational corpora
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 167

107 A Comparative Study of Motion Events Encoding in English and Italian

Authors: Alfonsina Buoniconto

Abstract:

The aim of this study is to investigate the degree of cross-linguistic and intra-linguistic variation in the encoding of motion events (MEs) in English and Italian, two typologically different languages that both show signs of deviation from their respective types. The traditional typological classification of ME encoding distributes languages into two macro-types, based on the preferred locus for the expression of Path, the main ME component (the other components being Figure, Ground and Manner), which is characterized by conceptual and structural prominence. According to this model, Satellite-framed (SF) languages typically express Path information in verb-dependent items called satellites (e.g. preverbs and verb particles), with main verbs encoding Manner of motion, whereas Verb-framed (VF) languages tend to include Path information within the verbal locus, leaving Manner to adjuncts. Although this dichotomy is broadly valid, languages do not always behave according to their typical classification patterns. English, for example, is usually ascribed to the SF type due to its rich inventory of postverbal particles and phrasal verbs used to express spatial relations (e.g. the cat climbed down the tree); nevertheless, it is not uncommon to find constructions such as the fog descended slowly, which is typical of the VF type. Conversely, Italian is usually described as being VF (cf. Paolo uscì di corsa ‘Paolo went out running’), yet SF constructions like corse via in lacrime ‘She ran away in tears’ are also frequent. This paper will try to demonstrate that such typological overlap arises because the semantic units making up MEs are distributed across several loci of the sentence (not only verbs and satellites), thus determining a number of different constructions stemming from convergent factors.
Indeed, the linguistic expression of motion events depends not only on the typological nature of languages in a traditional sense, but also on a series of morphological, lexical, and syntactic resources, as well as on inferential, discursive, usage-related, and cultural factors that make semantic information more or less accessible, frequent, and easy to process. Hence, rather than describing English and Italian in dichotomous terms, this study focuses on the investigation of cross-linguistic and intra-linguistic variation in the use of all the strategies made available by each linguistic system to express motion. Evidence for these assumptions is provided by parallel corpus analysis. The sample texts are taken from two contemporary Italian novels and their respective English translations. The 400 motion occurrences selected (200 in English and 200 in Italian) were scanned according to the MODEG (an acronym for Motion Decoding Grid) methodology, which ensures data comparability through the indexation and retrieval of combined morphosyntactic and semantic information at different levels of detail.

Keywords: construction typology, motion event encoding, parallel corpora, satellite-framed vs. verb-framed type

Procedia PDF Downloads 233
106 Information-Controlled Laryngeal Feature Variations in Korean Consonants

Authors: Ponghyung Lee

Abstract:

This study investigates the variation in Korean consonants that centers on the laryngeal features of the sounds concerned, to the exclusion of others. Our fundamental premise is that the weak contrast associated with the segments concerned might be held accountable for the instability of these consonants. What is more, we assume that an array of notions measuring the communicative efficiency of linguistic units would be significantly influential in triggering those variations. To this end, we have computed the surprisal, entropic contribution, and relative contrastiveness associated with Korean obstruent consonants. What we found is that the information-theoretic perspective is compelling enough to lend support to our approach to a considerable extent. That is, the variant realizations, chronologically and stylistically, prove to be profoundly affected by the set of information-theoretic factors enumerated above. For the biblical proper names, we use the Georgetown University CQP Web-Bible corpora. From 8 texts (4 from the Old Testament and 4 from the New Testament) among the total of 64 texts, we extracted 199 samples. We address the issue of laryngeal feature variations associated with Korean obstruent consonants under the presumption that the variations stem from the weak contrast among the triad manifestations of laryngeal features. The variants emerge from diverse sources in chronological and stylistic senses: Christian biblical texts, ordinary casual speech, the shift of loanword adaptation over time, and ideophones. To discuss what they are really like from the perspective of Information Theory, it is necessary to look closely at the data. Among them, the massive changes in the loanword adaptation of proper nouns during the centennial history of Korean Christianity draw our special attention.
We searched 199 types of initially capitalized words among 45,528 word tokens, which account for around 5% of the total of 901,701 word tokens (12,786 word types) in the Georgetown University CQP Web-Bible corpora. We focus on the shift of the laryngeal features incorporated into word-initial consonants, as attested in two distinct versions of the Korean Bible: one came out in the 1960s for Protestants, and the other was published in the 1990s for the Catholic Church. Of these proper names, we have closely traced the adaptation of plain obstruents, e.g. /b, d, g, s, ʤ/, in the sources. The results show that as many as 41% of the extracted proper names show variation: 37% in terms of aspiration, and 4% in terms of tensing. This study set out to shed light on the question: to what extent can we attribute the variations in the laryngeal features of Korean obstruent consonants to the communicative aspects of linguistic activities? In this vein, the concerted effects of the triad of surprisal, entropic contribution, and relative contrastiveness can be credited with the ups and downs in the feature specification, although the role of surprisal remains contentious to some extent.
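As a rough illustration of the information-theoretic quantities named in this abstract, the sketch below computes standard Shannon surprisal and entropy over a three-way laryngeal contrast (plain, aspirated, tense). The counts are hypothetical and are not taken from the study, and the abstract does not define "entropic contribution" or "relative contrastiveness", so neither is reproduced here.

```python
import math

def surprisal(p):
    """Shannon surprisal, in bits, of an outcome with probability p."""
    return -math.log2(p)

def entropy(counts):
    """Shannon entropy, in bits, of the distribution implied by raw counts."""
    total = sum(counts.values())
    return sum((c / total) * surprisal(c / total) for c in counts.values())

# Hypothetical counts of word-initial obstruents by laryngeal category.
counts = {"plain": 50, "aspirated": 30, "tense": 20}
total = sum(counts.values())

for label, c in counts.items():
    # Rarer categories carry higher surprisal.
    print(label, round(surprisal(c / total), 3))
print("entropy:", round(entropy(counts), 3))
```

A category that is rare in a given position has high surprisal, while entropy summarizes how evenly the three categories share that position.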

Keywords: entropic contribution, laryngeal feature variation, relative contrastiveness, surprisal

Procedia PDF Downloads 100
105 Interacting with Multi-Scale Structures of Online Political Debates by Visualizing Phylomemies

Authors: Quentin Lobbe, David Chavalarias, Alexandre Delanoe

Abstract:

The ICT revolution has given birth to an unprecedented world of digital traces and has impacted a wide range of knowledge-driven domains such as science, education and policy making. Nowadays, we are fed daily by unlimited flows of articles, blogs, messages, tweets, etc. The internet itself can thus be considered an unsteady hyper-textual environment where websites emerge and expand every day. But there are structures inside knowledge. A given text can always be studied in relation to others or in light of a specific socio-cultural context. By way of their textual traces, human beings are calling each other out: hypertext citations, retweets, vocabulary similarity, etc. We are in fact the architects of a giant web of elements of knowledge whose structures and shapes convey their own information. The global shapes of these digital traces represent a source of collective knowledge, and the question of their visualization remains an open challenge. How can we explore, browse and interact with such shapes? In order to navigate across these growing constellations of words and texts, interdisciplinary innovations are emerging at the crossroads between the social and computational sciences. In particular, complex systems approaches now make it possible to reconstruct the hidden structures of textual knowledge by means of multi-scale objects of research such as semantic maps and phylomemies. Phylomemy reconstruction is a generic method related to the co-word analysis framework. Phylomemies aim to reveal the temporal dynamics of large corpora of textual content by performing inter-temporal matching on extracted knowledge domains in order to identify their conceptual lineages. This study addresses the question of visualizing the global shapes of online political discussions related to the French presidential and legislative elections of 2017.
We aim to build phylomemies on top of a dedicated collection of thousands of French political tweets enriched with archived contemporary news web articles. Our goal is to reconstruct the temporal evolution of online debates fueled by each political community during the elections. To that end, we introduce an iterative data exploration methodology implemented and tested within the free software Gargantext. There we combine synchronic and diachronic axes of visualization to reveal the dynamics of our corpora of tweets and web pages as well as their inner syntagmatic and paradigmatic relationships. In doing so, we aim to provide researchers with innovative methodological means to explore online semantic landscapes in a collaborative and reflective way.

Keywords: online political debate, French election, hyper-text, phylomemy

Procedia PDF Downloads 159
104 How to “Eat” without Actually Eating: Marking Metaphor with Spanish Se and Italian Si

Authors: Cinzia Russi, Chiyo Nishida

Abstract:

Using data from online corpora (Spanish CREA, Italian CORIS), this paper examines the relatively understudied use of Spanish se and Italian si exemplified in (1) and (2), respectively. (1) El rojo es … el que se come a los demás. ‘The red (bottle) is the one that outshines/*eats the rest.’ (2) … ebbe anche la saggezza di mangiarsi tutto il suo patrimonio. ‘… he even had the wisdom to squander/*eat all his estate.’ In these sentences, se/si accompanies the consumption verb comer/mangiare ‘to eat’, without which the sentences would not be interpreted appropriately. This se/si cannot readily be attributed to any of the multiple functions so far identified in the literature: reflexive, ergative, middle/passive, inherent, benefactive, and complete consumptive. In particular, this paper argues against the feasibility of a recent construction-based analysis of sentences like (1) and (2), which situates se/si within a prototype-based network of meanings all deriving from the central meaning of 'COMPLETE CONSUMPTION' (e.g., Alice se comió toda la torta/Alice si è mangiata tutta la torta ‘Alice ate the whole cake’). Clearly, the empirical adequacy of such an account is undermined by the fact that the events depicted in the se/si-sentences at issue do not always entail complete consumption because they may lack an INCREMENTAL THEME, the distinguishing property of complete consumption. Alternatively, it is proposed that the sentences under analysis represent instances of verbal METAPHORICAL EXTENSION: se/si represents an explicit marker of this cognitive process, which has independently developed from the complete consumptive se/si, and the meaning extension is captured by the general tenets of Conceptual Metaphor Theory (CMT). Two conceptual domains, Source (DS) and Target (DT), are related by similarity, assigning an appropriate metaphorical interpretation to DT. The domains paired here are comer/mangiare (DS) and comerse/mangiarsi (DT).
The eating event (DS) involves (a) the physical process of xEATER grinding yFOOD-STUFF into pieces and swallowing it; and (b) the aspect of xEATER savoring yFOOD-STUFF and being nurtured by it. In the physical act of eating, xEATER has dominance and exercises force over yFOOD-STUFF. This general sense of dominance and force is mapped onto DT and is manifested in the ways exemplified in (1) and (2), and many others. According to CMT, two other properties are observed in each pair of DS and DT. First, DS tends to be more physical and concrete and DT more abstract; second, systematic mappings are established between constituent elements in DS and those in DT: xEATER corresponds to the element that destroys and yFOOD-STUFF to the element that is destroyed in DT, as exemplified in (1) and (2). Though the metaphorical extension marker se/si appears by far most frequently with comer/mangiare in the corpora, similar systematic mappings are observed in several other verb pairs, for example, jugar/giocare ‘to play (games)’ and jugarse/giocarsi ‘to jeopardize/risk (life, reputation, etc.)’, perder/perdere ‘to lose (an object)’ and perderse/perdersi ‘to miss out on (an event)’, etc. Thus, this study provides evidence that languages may indeed formally mark metaphor using means available to them.

Keywords: complete consumption value, conceptual metaphor, Italian si/Spanish se, metaphorical extension

Procedia PDF Downloads 22
103 A Theoretical and Corpus-Based Analysis of English and Spanish Syntax Derived from Método de Los Relojes Verb Types According to Systemic-Functional Grammar as a Foundation for Methodological Adaption

Authors: Timothy William Lawrence

Abstract:

The goal of this paper is to research and categorize the four basic verb types found in the Spanish descriptive grammar book Método de los Relojes, using verb clauses as representation as found in M.A.K. Halliday's Systemic-Functional Grammar, with the purpose of establishing theoretical and syntactical parallels and deviations between English and Spanish. Results confirm that theoretical correlations exist, which leads to an analysis of English syntax that delineates its commonalities with and differences from Spanish. Corpora searches were carried out on different patterns of syntactical structures, confirming divergences in verb syntax and making it possible to establish parameters to adapt English verbs to the criteria of the four basic Método de los Relojes verb types.

Keywords: corpus studies, Método de los Relojes, structural-functional grammar, verb syntax

Procedia PDF Downloads 164
102 Large Language Model Powered Chatbots Need End-to-End Benchmarks

Authors: Debarag Banerjee, Pooja Singh, Arjun Avadhanam, Saksham Srivastava

Abstract:

Autonomous conversational agents, i.e., chatbots, are becoming an increasingly common mechanism for enterprises to provide support to customers and partners. In order to rate chatbots, especially ones powered by Generative AI tools like Large Language Models (LLMs), we need to be able to accurately assess their performance. This is where chatbot benchmarking becomes important. In this paper, the authors propose a benchmark that they call the E2E (End to End) benchmark and show how it can be used to evaluate the accuracy and usefulness of the answers provided by chatbots, especially ones powered by LLMs. The authors evaluate an example chatbot at different levels of sophistication using both the E2E benchmark and other metrics commonly used in the state of the art, and observe that the proposed benchmark shows better results than the others. In addition, while some metrics proved to be unpredictable, the metric associated with the E2E benchmark, which uses cosine similarity, performed well in evaluating chatbots. The performance of the best models shows that there are several benefits of using the cosine similarity score as a metric in the E2E benchmark.
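To make the cosine similarity score mentioned in this abstract concrete, here is a minimal sketch that compares a chatbot answer with a reference answer. The abstract does not say how answers are turned into vectors, so the plain term-count vectors used here are an assumption made only for illustration, and the two example sentences are invented.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens (an assumed, simple vectorization)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty answer shares nothing with the reference
    return dot / (norm_a * norm_b)

# Invented reference answer and chatbot answer.
reference = "Reset your password from the account settings page."
candidate = "You can reset the password on the account settings page."

score = cosine_similarity(bag_of_words(reference), bag_of_words(candidate))
print(f"similarity: {score:.3f}")
```

The score is 1 for answers with identical term distributions and 0 for answers with no terms in common; a dense sentence embedding could be substituted for the term counts without changing the scoring function's role.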

Keywords: chatbot benchmarking, end-to-end (E2E) benchmarking, large language model, user centric evaluation

Procedia PDF Downloads 41
101 An Investigation on the Perception and Adoption of Terminology Management Applications by the Iranian English Language Translators

Authors: Abdul Amir Hazbavi

Abstract:

In recent years, there have been increasing requests in the field of translation studies to develop software facilitating the analysis of corpora. One class of specialized tools in that regard is Terminology Management Tools. Briefly, Terminology Management Tools are applications developed to help create and store terminological data in a form which allows for a controlled use of the data. While they have a long history and an established position in the translation market in most parts of the globe, Iranian translators and the Iranian translation market still seem to be unaware of or unfamiliar with Terminology Management Tools. In order to provide a preview of the perception and adoption of Terminology Management Tools by Iranian translators, the present survey was carried out among 224 last-year undergraduate Iranian students of English translation at 10 different universities across the country. The study revealed a very low level of adoption and a very high level of willingness to get familiar with and learn about Terminology Management Tools among Iranian translators.

Keywords: translation, translation technology, terminology management tools, terminology management survey

Procedia PDF Downloads 344
100 Resume Ranking Using Custom Word2vec and Rule-Based Natural Language Processing Techniques

Authors: Subodh Chandra Shakya, Rajendra Sapkota, Aakash Tamang, Shushant Pudasaini, Sujan Adhikari, Sajjan Adhikari

Abstract:

Considerable effort has gone into measuring the semantic similarity between the text corpora in documents, and techniques have evolved to measure the similarity of two documents. One such state-of-the-art technique in the field of Natural Language Processing (NLP) is word-to-vector (word2vec) models, which convert words into word embeddings and measure the similarity between the vectors. We found this to be quite useful for the task of resume ranking. This research paper therefore implements the word2vec model along with other Natural Language Processing techniques in order to rank resumes against a particular job description, so as to automate the process of hiring. The paper presents the proposed system and the findings made while building it.
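One common way to rank documents with word embeddings, sketched here under stated assumptions, is to average the word vectors of each document and sort resumes by their cosine similarity to the job description. The abstract does not describe the authors' actual pipeline, and the three-dimensional vectors, resume names, and tokens below are hypothetical stand-ins for what a trained word2vec model would supply.

```python
import math

# Hypothetical toy embeddings; a trained word2vec model would provide real ones.
EMBEDDINGS = {
    "python": [0.9, 0.1, 0.0],
    "sql": [0.7, 0.3, 0.0],
    "developer": [0.8, 0.1, 0.2],
    "sales": [0.1, 0.9, 0.2],
    "marketing": [0.0, 0.8, 0.3],
}

def document_vector(tokens):
    """Average the embeddings of known tokens; unknown tokens are skipped."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity of two dense vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_resumes(job_tokens, resumes):
    """Return (resume name, score) pairs sorted from best to worst match."""
    job_vec = document_vector(job_tokens)
    scored = [(name, cosine(job_vec, document_vector(tokens)))
              for name, tokens in resumes.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

resumes = {
    "resume_a": ["python", "sql", "developer"],
    "resume_b": ["sales", "marketing"],
}
ranking = rank_resumes(["python", "developer"], resumes)
for name, score in ranking:
    print(name, round(score, 3))
```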

Keywords: chunking, document similarity, information extraction, natural language processing, word2vec, word embedding

Procedia PDF Downloads 130
99 ExactData Smart Tool For Marketing Analysis

Authors: Aleksandra Jonas, Aleksandra Gronowska, Maciej Ścigacz, Szymon Jadczak

Abstract:

Exact Data is a smart tool which helps with meaningful marketing content creation. It helps marketers achieve this by analyzing the text of an advertisement before and after its publication on social media sites like Facebook or Instagram. In our research we focus on four areas of natural language processing (NLP): grammar correction, sentiment analysis, irony detection and advertisement interpretation. Our research has identified a considerable lack of NLP tools for the Polish language which specifically aid online marketers. In light of this, our research team has set out to create a robust and versatile NLP tool for the Polish language. The primary objective of our research is to develop a tool that can perform a range of language processing tasks in this language, such as sentiment analysis, text classification, text correction and text interpretation. Our team has been working diligently to create a tool that is accurate, reliable, and adaptable to the specific linguistic features of Polish, and that can provide valuable insights for a wide range of marketers' needs. In addition to the Polish language version, we are also developing an English version of the tool, which will enable us to expand the reach and impact of our research to a wider audience. Another area of focus in our research involves tackling the challenge of the limited availability of linguistically diverse corpora for non-English languages, which presents a significant barrier in the development of NLP applications. One approach we have been pursuing is the translation of existing English corpora, which would enable us to use the wealth of linguistic resources available in English for other languages. Furthermore, we are looking into other methods, such as gathering language samples from social media platforms.
By analyzing the language used in social media posts, we can collect a wide range of data that reflects the unique linguistic characteristics of specific regions and communities, which can then be used to enhance the accuracy and performance of NLP algorithms for non-English languages. In doing so, we hope to broaden the scope and capabilities of NLP applications. Our research focuses on several key NLP techniques including sentiment analysis, text classification, text interpretation and text correction. To ensure that we can achieve the best possible performance for these techniques, we are evaluating and comparing different approaches and strategies for implementing them. We are exploring a range of different methods, including transformers and convolutional neural networks (CNNs), to determine which ones are most effective for different types of NLP tasks. By analyzing the strengths and weaknesses of each approach, we can identify the most effective techniques for specific use cases, and further enhance the performance of our tool. Our research aims to create a tool, which can provide a comprehensive analysis of advertising effectiveness, allowing marketers to identify areas for improvement and optimize their advertising strategies. The results of this study suggest that a smart tool for advertisement analysis can provide valuable insights for businesses seeking to create effective advertising campaigns.

Keywords: NLP, AI, IT, language, marketing, analysis

Procedia PDF Downloads 55
98 Mondoc: Informal Lightweight Ontology for Faceted Semantic Classification of Hypernymy

Authors: M. Regina Carreira-Lopez

Abstract:

Lightweight ontologies seek to make concrete the union relationships between a parent node and a secondary node, also called a "child node". This logic relation (L) can be formally defined as a triple ontological relation (LO) equivalent to LO in ⟨LN, LE, LC⟩, where LN represents a finite set of nodes (N); LE is a set of entities (E), each of which represents a relationship between nodes to form a rooted tree of ⟨LN, LE⟩; and LC is a finite set of concepts (C), encoded in a formal language (FL). Mondoc enables more refined searches on semantic and classified facets for retrieving specialized knowledge about Atlantic migrations, from the Declaration of Independence of the United States of America (1776) to the end of the Spanish Civil War (1939). The model aims to increase documentary relevance by applying an inverse frequency of co-occurring hypernymy phenomena to a concrete dataset of textual corpora, using the RMySQL package. Mondoc profiles archival utilities implementing SQL programming code, and allows data export to XML schemas, for achieving semantic and faceted analysis of speech by analyzing keywords in context (KWIC). The methodology applies random and unrestricted sampling techniques with RMySQL to verify the resonance phenomena of inverse documentary relevance between the number of co-occurrences of the same term (t) in more than two documents of a set of texts (D). Secondly, the research also provides evidence that the co-associations between (t) and its corresponding synonyms and antonyms (synsets) are inverse. The results from grouping facets or polysemic words with synsets in more than two textual corpora within their syntagmatic context (nouns, verbs, adjectives, etc.) state how to proceed with semantic indexing of hypernymy phenomena for subject-heading lists and for authority lists for documentary and archival purposes.
Mondoc contributes to the development of web directories and seems to achieve a proper and more selective search of e-documents (classification ontology). It can also foster the production of on-line catalogs for semantic authorities, or concepts, through XML schemas, because its applications could be used for implementing data models, after a prior adaptation of the base ontology to structured meta-languages such as OWL and RDF (descriptive ontology). Mondoc serves the classification of concepts and applies a semantic indexing approach to facets. It enables information retrieval, as well as quantitative and qualitative data interpretation. The model reproduces a triple tuple ⟨LN, LE, LT, LCF L, BKF⟩ where LN is a set of entities that connect with other nodes to form a rooted tree in ⟨LN, LE⟩. LT specifies a set of terms, and LCF acts as a finite set of concepts, encoded in a formal language, L. Mondoc only resolves partial problems of linguistic ambiguity (in the case of synonymy and antonymy); neither the pragmatic dimension of natural language nor the cognitive perspective is addressed. To achieve this goal, forthcoming programming developments should target oriented meta-languages with structured documents in XML.
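The keywords-in-context (KWIC) analysis mentioned in this abstract can be illustrated with a short sketch. The abstract describes an SQL and RMySQL implementation; the Python function below is only a generic illustration of the KWIC idea, and its name, window size, and sample sentence are assumptions rather than part of Mondoc.

```python
import re

def kwic(text, keyword, window=3):
    """Return (left context, keyword, right context) for each occurrence of keyword."""
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    hits = []
    for i, token in enumerate(tokens):
        if token == target:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits

# Invented sample sentence.
sample = "The ship left the port and the ship returned"
for left, word, right in kwic(sample, "ship", window=2):
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>10} [{word}] {right}")
```

Aligning every occurrence of a term with its neighbouring words makes it easier to see which senses or related terms it co-occurs with across documents.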

Keywords: hypernymy, information retrieval, lightweight ontology, resonance

Procedia PDF Downloads 101
97 Multimodal Database of Emotional Speech, Video and Gestures

Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari

Abstract:

People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several forms of expression can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpus contains recordings registered in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data is labeled with six basic emotion categories, following Ekman. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to the academic community and might be useful for the study of audio-visual emotion recognition.

Keywords: body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech

Procedia PDF Downloads 324
96 ‘Daily Speaking’: Designing an App for Construction of Language Learning Model Supporting ‘Seamless Flipped’ Environment

Authors: Zhou Hong, Gu Xiao-Qing, Liu Hong-Jiao, Leng Jing

Abstract:

Seamless learning has become a research hotspot in recent years, and the emergence of micro-lectures and the flipped classroom has strengthened its development. Based on the characteristics of seamless learning across time and space, the course structure of the flipped classroom, and theories of language learning, we put forward a language learning model which can support a ‘seamless flipped’ environment (abbreviated as ‘S-F’). The characteristics of the ‘S-F’ learning environment, the corresponding framework construction and the activity design of diversified corpora are also introduced. Moreover, a language learning app named ‘Daily Speaking’ was developed to facilitate the practice of the language learning model in the ‘S-F’ environment. Using a learning case involving the Shanghai language, the rationality and feasibility of this framework were examined, with the expectation of providing a reference for the design of ‘S-F’ learning in different situations.

Keywords: seamless learning, flipped classroom, seamless-flipped environment, language learning model

Procedia PDF Downloads 155
95 The Grammatical Dictionary Compiler: A System for Kartvelian Languages

Authors: Liana Lortkipanidze, Nino Amirezashvili, Nino Javashvili

Abstract:

The purpose of the grammatical dictionary is to provide information on the morphological and syntactic characteristics of the basic word in the dictionary entry. Electronic grammatical dictionaries are used as a tool for automated morphological analysis in text processing. The Georgian Grammatical Dictionary should contain grammatical information for each word: part of speech, type of declension/conjugation, grammatical forms of the word (paradigm), and alternative variants of the basic word/lemma. In this paper, we present a system for compiling the Georgian Grammatical Dictionary automatically. We propose dictionary-based methods for extending grammatical lexicons. The input lexicon contains only a small number of words with identical grammatical features. The extension is based on similarity measures between features of words; more precisely, we add to the extended lexicons words which are similar to those already in the grammatical dictionary. Our dictionaries are corpora-based, and for compilation, we introduce a method for the lemmatization of unknown words, i.e., words of which neither the full form nor the lemma is in the grammatical dictionary.

Keywords: acquisition of lexicon, Georgian grammatical dictionary, lemmatization rules, morphological processor

Procedia PDF Downloads 118
94 Women’s Language and Gender Positioning in the Discourse of Indonesian Instagram Videos

Authors: Haira Rizka, Imas Istiani

Abstract:

The way women and men use language is an interesting topic to discuss. Nowadays, Instagram shows many videos which illustrate the differences between women’s and men’s language. Furthermore, the videos show how different genders behave in daily communication. This research aims to (1) investigate the conversational characteristics of women represented in Indonesian Instagram videos, and (2) investigate how different genders behave in daily communication. To analyze the two research problems, this research employs Tannen’s theory of language and gender (1996). This is descriptive qualitative research which describes phenomena of language and gender shown in Indonesian Instagram videos. The data were collected through observation. The collected data were then analyzed by employing ethnography and textual analysis. The research results show that in Indonesian Instagram videos, women dominate the conversation more than men. Women are portrayed as figures who are talkative, never wrong, and sensitive. Women’s domination of men in conversation suggests that women always want to be understood, produce more words than men, and are more creative in producing verbal communication. Meanwhile, men are portrayed as calm, gentle, and patient creatures who listen to women’s talk. Furthermore, men are portrayed as preferring to stay silent in order to avoid conflict.

Keywords: gender, Instagram videos, language variety, women's language

Procedia PDF Downloads 398
93 Chatbots vs. Websites: A Comparative Analysis Measuring User Experience and Emotions in Mobile Commerce

Authors: Stephan Boehm, Julia Engel, Judith Eisser

Abstract:

During the last decade communication in the Internet transformed from a broadcast to a conversational model by supporting more interactive features, enabling user generated content and introducing social media networks. Another important trend with a significant impact on electronic commerce is a massive usage shift from desktop to mobile devices. However, a presentation of product- or service-related information accumulated on websites, micro pages or portals often remains the pivot and focal point of a customer journey. A more recent change of user behavior –especially in younger user groups and in Asia– is going along with the increasing adoption of messaging applications supporting almost real-time but asynchronous communication on mobile devices. Mobile apps of this type cannot only provide an alternative for traditional one-to-one communication on mobile devices like voice calls or short messaging service. Moreover, they can be used in mobile commerce as a new marketing and sales channel, e.g., for product promotions and direct marketing activities. This requires a new way of customer interaction compared to traditional mobile commerce activities and functionalities provided based on mobile web-sites. One option better aligned to the customer interaction in mes-saging apps are so-called chatbots. Chatbots are conversational programs or dialog systems simulating a text or voice based human interaction. They can be introduced in mobile messaging and social media apps by using rule- or artificial intelligence-based imple-mentations. In this context, a comparative analysis is conducted to examine the impact of using traditional websites or chatbots for promoting a product in an impulse purchase situation. The aim of this study is to measure the impact on the customers’ user experi-ence and emotions. The study is based on a random sample of about 60 smartphone users in the group of 20 to 30-year-olds. 
Participants are randomly assigned to two groups and participate in either a traditional website-based or an innovative chatbot-based mobile commerce scenario. The chatbot-based scenario is implemented by using a Wizard-of-Oz experimental approach for reasons of simplicity and to allow for more flexibility when simulating simple rule-based and more advanced artificial intelligence-based chatbot setups. A specific set of metrics is defined to measure and compare the user experience in both scenarios. It can be assumed that users get more emotionally involved when interacting with a system simulating human communication behavior than when browsing a mobile commerce website. For this reason, innovative face-tracking and analysis technology is used to derive feedback on the emotional status of the study participants while interacting with the website or the chatbot. This study is a work in progress. The results will provide first insights into the effects of chatbot usage on user experiences and emotions in mobile commerce environments. Based on the study findings, basic requirements for a user-centered design and implementation of chatbot solutions for mobile commerce can be derived. Moreover, first indications of situations where chatbots might be favorable in comparison to traditional website-based mobile commerce can be identified.

Keywords: chatbots, emotions, mobile commerce, user experience, Wizard-of-Oz prototyping

Procedia PDF Downloads 428
92 Development of Fake News Model Using Machine Learning through Natural Language Processing

Authors: Sajjad Ahmed, Knut Hinkelmann, Flavio Corradini

Abstract:

Fake news detection research is still at an early stage, as fake news is a relatively new phenomenon of societal concern. Machine learning helps to solve complex problems and to build AI systems, especially in cases where the relevant knowledge is tacit or unknown. To identify fake news, we applied three machine learning classifiers: Passive Aggressive, Naïve Bayes, and Support Vector Machine. Simple classification alone is not fully reliable for fake news detection because general classification methods are not specialized for fake news. By integrating machine learning with text-based processing, we can detect fake news and build classifiers that classify news data. Text classification mainly focuses on extracting various features of the text and then incorporating those features into classification. The big challenge in this area is the lack of an efficient way to differentiate between fake and non-fake news due to the unavailability of corpora. We applied the three classifiers to two publicly available datasets. Experimental analysis based on the existing datasets indicates very encouraging and improved performance.
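As a rough illustration of one of the three classifier families named in this abstract, the following is a minimal multinomial Naïve Bayes sketch in plain Python. It is not the authors' implementation: the whitespace tokenizer, the add-one smoothing, the function names and the four toy headlines are all assumptions made purely for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> token frequencies
    label_counts = Counter()            # label -> number of documents
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()   # naive whitespace tokenizer (assumption)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-posterior for the given text."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)  # log prior
        total_tokens = sum(word_counts[label].values())
        for token in text.lower().split():
            if token in vocab:  # tokens never seen in training are ignored
                # Add-one (Laplace) smoothing avoids log(0).
                score += math.log(
                    (word_counts[label][token] + 1) / (total_tokens + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, hand-made training data (hypothetical, not from any real dataset).
toy_docs = [
    ("shocking miracle cure doctors hate", "fake"),
    ("aliens secretly control the government", "fake"),
    ("parliament passes the annual budget", "real"),
    ("central bank holds interest rates", "real"),
]
model = train_nb(toy_docs)
```

With this toy model, `predict_nb(model, "miracle cure shocking")` selects the label whose training headlines share the most vocabulary with the query. A real experiment would need proper tokenization, feature extraction and held-out evaluation.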

Keywords: fake news detection, natural language processing, machine learning, classification techniques.

Procedia PDF Downloads 131
91 The Language of Fliptop among Filipino Youth: A Discourse Analysis

Authors: Bong Borero Lumabao

Abstract:

This qualitative research studies the lines of Fliptop talks performed by Fliptop rappers, employing Finnegan’s (2008) discourse analysis. The paper aims to analyze the phonological, morphological, and semantic features of Fliptop talk, to explore the structures in the lines of Fliptop among Filipino youth, and to uncover the various insights that can be gained from it. The corpus of the study included all 20 Fliptop videos downloaded from the YouTube channel of Fliptop. Results revealed that Fliptop contains phonological features such as assonance, consonance, deletion, lengthening, and rhyming. Morphological features include acronyms, affixation, blending, borrowing, code-mixing and switching, compounding, conversion or functional shifts, and dysphemism. The semantic analysis covered the lexical categories, meanings, and words used in Fliptop talks. The structure of Fliptop revolves around personal attacks (physical attributes), attacks on the bars (rapping skills), extension to family members and friends, antithesis, profane words, figurative language, sexual undertones, anime characters, homosexuality, and the involvement of famous celebrities.

Keywords: discourse analysis, fliptop talks, filipino youth, fliptop videos, Philippines

Procedia PDF Downloads 202
90 Corporate Cautionary Statement: A Genre of Professional Communication

Authors: Chie Urawa

Abstract:

Cautionary statements or disclaimers in corporate annual reports need to be carefully designed because clear cautionary statements may protect a company in the case of legal disputes but may also undermine positive impressions. This study compares the language of cautionary statements using two corpora, Sony’s cautionary statement corpus (S-corpus) and Panasonic’s cautionary statement corpus (P-corpus), illustrating the differences and similarities in relation to the use of meaningful cautionary statements and critically analyzing why practitioners word them as they do. The findings describe distinct differences between the two companies in the presentation of the risk factors and in the way they make the statements. The word ability is used more for legal protection in the S-corpus, whereas the word possibility is used more to convey a better impression in the P-corpus. The main similarities are found in the use of lexical words and pronouns, and in almost identical wording over eight years. The findings show how the companies make the statements unique to themselves in the presentation of risk factors, and the characteristics of this specific genre of professional communication. Important implications of this study are that a more comprehensive approach can be applied in other contexts and be used by companies to reflect upon their cautionary statements.

Keywords: cautionary statements, corporate annual reports, corpus, risk factors

Procedia PDF Downloads 143
89 A Model for Analysing Argumentative Structures and Online Deliberation in User-Generated Comments to the Website of a South African Newspaper

Authors: Marthinus Conradie

Abstract:

The conversational dynamics of democratically orientated deliberation continue to stimulate critical scholarship for their potential to bolster robust engagement between different sections of pluralist societies. Several axes of deliberation that have attracted academic attention include face-to-face vs. online interaction, and citizen-to-citizen communication vs. engagement between citizens and political elites. In all these areas, numerous researchers have explored deliberative procedures aimed at achieving instrumental goals such as securing consensus on policy issues, against procedures that prioritise expressive outcomes such as broadening the range of argumentative repertoires that discursively construct and mediate specific political issues. The study that informs this paper works in the latter stream. Drawing its data from the reader-comments section of a South African broadsheet newspaper, the study investigates online, citizen-to-citizen deliberation by analysing the discursive practices through which competing understandings of social problems are articulated and contested. To advance this agenda, the paper deals specifically with user-generated comments posted in response to news stories on questions of race and racism in South Africa. The analysis works to discern and interpret the various sets of discourse practices that shape how citizens deliberate on contentious political issues, especially racism. Since the website in question is designed to encourage the critical comparison of divergent interpretations of news events, without feeding directly into national policymaking, the study adopts an analytic framework that traces how citizens articulate arguments, rather than the instrumental effects that citizen deliberations might exert on policy.
The paper starts from the argument that such expressive interactions are particularly crucial to current trends in South African politics, given that the precise nature of race and racism remains contested and uncertain. Centred on a sample of 2358 conversational moves in 814 posts to 18 news stories emanating from issues of race and racism, the analysis proceeds in a two-step fashion. The first stage conducts a qualitative content analysis that offers insights into the levels of reciprocity among commenters (do readers engage with each other or simply post isolated opinions?), as well as the structures of argumentation (do readers support opinions by citing evidence?). The second stage involves a more fine-grained discourse analysis, based on a theorisation of argumentation that delineates it into three components: opinions/conclusions, evidence/data to support opinions/conclusions, and warrants that explicate precisely how evidence/data buttress opinions/conclusions. By tracing the manifestation and frequency of specific argumentative practices, this study contributes to the archive of research currently aggregating around the practices that characterise South Africans’ engagement with provocative political questions, especially racism and racial inequity. Additionally, the study contributes to recent scholarship on the affordances of Web 2.0 software by eschewing a simplistic bifurcation between cyber-optimism and cyber-pessimism, in favour of a more nuanced and context-specific analysis of the patterns that structure online deliberation.

Keywords: online deliberation, discourse analysis, qualitative content analysis, racism

Procedia PDF Downloads 151
88 Wellbeing Effects from Family Literacy Education: An Ecological Study

Authors: Jane Furness, Neville Robertson, Judy Hunter, Darrin Hodgetts, Linda Nikora

Abstract:

Background and significance: This paper describes the first use of community psychology theories to investigate family-focused literacy education programmes, enabling a wide range of wellbeing effects of such programmes to be identified for the first time. Evaluations of family literacy programmes usually focus on the economic advantage of gains in literacy skills. By identifying other effects on aspects of participants’ lives that are important to them, and how they occur, understanding of how such programmes contribute to wellbeing and social justice is augmented. Drawn from community psychology, an ecological systems-based, culturally adaptive framework for personal, relational and collective wellbeing illuminated outcomes of family literacy programmes that enhanced wellbeing and quality of life for adult participants, their families and their communities. All programmes, irrespective of their institutional location, could be similarly scrutinized. Methodology: The study traced the experiences of nineteen adult participants in four family-focused literacy programmes located in geographically and culturally different communities throughout New Zealand. A critical social constructionist paradigm framed this interpretive study. Participants were mainly Māori, Pacific islands, or European New Zealanders. Seventy-nine repeated conversational interviews were conducted over 18 months with the adult participants, programme staff and people who knew the participants well. Twelve participant observations of programme sessions were conducted, and programme documentation was reviewed. Latent theoretical thematic analysis of data drew on broad perspectives of literacy and ecological systems theory, network theory and holistic, integrative theories of wellbeing. Steps taken to co-construct meaning with participants included the repeated conversational interviews and participant checking of interview transcripts and section drafts. 
The researcher (this paper’s first author) followed methodological guidelines developed by indigenous peoples for non-indigenous researchers. Findings: The study found that the four family literacy programmes, differing in structure, content, aims and foci, nevertheless shared common principles and practices that reflected programme staff’s overarching concern for people’s wellbeing along with their desire to enhance literacy abilities. A human rights and strengths-based view of people, grounded in respect for diverse culturally based values and practices, was evident in staff expression of their values and beliefs and in their practices. This enacted stance influenced the outcomes of programme participation for the adult participants, their families and their communities. Alongside the literacy and learning gains identified, participants experienced positive social and relational events and changes, affirmation and strengthening of their culturally based values, and affirmation and building of positive identity. Systemically, three patterns were identified: the interconnectedness of programme effects with participants’ personal histories and circumstances; the flow-on of effects to other aspects of people’s lives and to their families and communities; and the personalised character of the pathways people journeyed towards enhanced wellbeing. Concluding statement: This paper demonstrates the critical contribution of community psychology to a fuller understanding of family-focused educational programme outcomes than has been previously attainable, the meaning of these broader outcomes to people in their lives, and their role in wellbeing and social justice.

Keywords: community psychology, ecological theory, family literacy education, flow on effects, holistic wellbeing

Procedia PDF Downloads 226
87 A Socio-Pragmatic Investigation of Gender Enactment in New Month Text Messages

Authors: Esther Robert, Romanus Aboh

Abstract:

This paper undertakes a socio-pragmatic investigation of gender enactment in new month text messages. The study employs Gumperz’s Interactional Sociolinguistics as its theoretical point of reference to investigate how people create meaning through social interaction. This theory analyses social interaction on the basis of contextualization cues and presuppositions. The study explores the appropriateness of the language used in texting. The text messages that form the data for this paper were collected from the mobile phones of people of different genders. The study observes remarkable differences between genders in the use of informal language and reveals that men and women differ remarkably in conversational interaction as well as in writing. Women are observed to be emotional, orderly, meticulous, and detailed, and to observe certain grammatical rules, whereas men are casual and brief and appear to pay less attention to grammatical rules. The study also shows women as relaxed, expressing love, care, and concern through emotive, spirit-raising and touching language, while men are direct, short, and straight to the point. The study concludes that women behave this way because of their brain-wiring. That is why language and communication matter more to women than to men, and this is reflected in their new month text messages.

Keywords: difference, emotionalised expressions, gender, texting

Procedia PDF Downloads 221
86 A Comparison of the First Language Vocabulary Used by Indonesian Year 4 Students and the Vocabulary Taught to Them in English Language Textbooks

Authors: Fitria Ningsih

Abstract:

This study concerns the process of building a corpus from Indonesian year 4 students’ free writing and comparing it with the vocabulary taught in English language textbooks. A total of 369 samples of students’ writing from 19 public elementary schools in Malang, East Java, Indonesia, and 5 selected English textbooks were analyzed with corpus linguistics methods using the AdTAT (Adelaide Text Analysis Tool) program. The analysis produced wordlists of the top 100 words most frequently used by students and the top 100 words given in English textbooks. There was a 45% match between the two lists. Furthermore, classification of the top 100 most frequent words from the two corpora by part of speech found that the Indonesian and English corpora made similar use of nouns, verbs, adjectives, and prepositions. Moreover, to examine how well the vocabulary of the learning materials is contextualized to the students’ needs, an in-depth analysis of the content and the cultural views reflected in the vocabulary taught in the textbooks was carried out using criteria developed from a checklist. Lastly, suggestions are addressed to language teachers to understand the students’ background, such as recognizing the basic words students have acquired before teaching them new vocabulary, in order to achieve successful learning of the target language.
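The wordlist comparison described in this abstract can be sketched in a few lines of Python. This is only an illustration of the general technique, not the AdTAT workflow: the regular-expression tokenizer and both function names are assumptions, and the sketch presumes the two lists are already expressed in the same language (the abstract does not say how the Indonesian and English items were matched).

```python
import re
from collections import Counter


def top_words(text, n):
    """Return the n most frequent word forms in text, most frequent first."""
    tokens = re.findall(r"[a-z']+", text.lower())  # simple tokenizer (assumption)
    return [word for word, _ in Counter(tokens).most_common(n)]


def overlap_percent(list_a, list_b):
    """Percentage of items in list_a that also occur in list_b."""
    if not list_a:
        return 0.0
    shared = set(list_a) & set(list_b)
    return 100.0 * len(shared) / len(list_a)
```

Calling `overlap_percent` on two 100-item wordlists yields a figure comparable in kind to the 45% match reported above, although the exact matching criteria used in the study may differ.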

Keywords: corpus, frequency, English, Indonesian, linguistics, textbooks, vocabulary, wordlists, writing

Procedia PDF Downloads 160
85 Study of Multimodal Resources in Interactions Involving Children with Autistic Spectrum Disorders

Authors: Fernanda Miranda da Cruz

Abstract:

This paper aims to systematize, descriptively and analytically, the relations between language, body and material world explored in a specific empirical context: everyday co-presence interactions between children diagnosed with Autistic Spectrum Disorder (ASD) and various interlocutors. The work is based on 20 hours of audiovisual corpus data in Brazilian Portuguese. The study has three aims: 1) to analyze daily interactions involving the presence or participation of subjects with a diagnosis of ASD, from an embodied interaction perspective; 2) to study the status and role of gestures, body and material world in the construction and constitution of human interaction and their relation to linguistic-cognitive processes and Autistic Spectrum Disorders; 3) to highlight questions related to the field of video analysis, such as procedures for recording interactions in complex environments (involving many participants, use of objects and body movement); the construction of audiovisual corpora for linguistic-interaction research; and the invitation to a visual analytical mentality of human social interactions involving not only the verbal aspects that constitute them, but also the physical space, the body and the material world.

Keywords: autism spectrum disease, multimodality, social interaction, non-verbal interactions

Procedia PDF Downloads 91
84 Semantic Preference across Research Articles: A Corpus-Based Study of Adjectives in English

Authors: Valdênia Carvalho e Almeida

Abstract:

The goal of the present study is to investigate the semantic preference of the most frequent adjectives in research articles through a corpus-based analysis of texts published in journals in Applied Linguistics (AL). The corpus used in this study contains texts published from 2014 to 2018 in three journals: Language Learning and Technology, English for Academic Purposes, and TESOL Quarterly, totaling more than one million words. A corpus-based analysis was carried out to identify the most frequent adjectives that co-occurred in the three journals. By observing the concordance lines of the adjectives and analyzing the words they associated with, the semantic preferences of each adjective were determined. Later, the AL corpus analysis was compared to an investigation of the same adjectives in a corpus of Chemistry. This second part of the study aimed to identify possible differences and similarities between the two corpora in relation to the use of the adjectives in research articles from both areas. The results show that there are some preferences which seem to be closely related not only to the academic genre of the texts but also to the specific domain of the discipline and, to a lesser extent, to the context of research in each journal. This research illustrates a possible contribution of Corpus Linguistics to exploring the concept of semantic preference in more detail, considering the complex nature of the phenomenon.

Keywords: applied linguistics, corpus linguistics, chemistry, research article, semantic preference

Procedia PDF Downloads 155
83 Early Childhood Education in a Depressed Economy in Nigeria: Implication in the Classroom

Authors: Ogunnaiya Racheal Taiwo

Abstract:

Children's formative years are crucial to their growth; it is, therefore, necessary for all stakeholders to ensure that pupils have an enabling quality of life, which is essential for realizing their potential. For children to live and grow, they need a secure home, nutritious food, good health care, and quality education. This paper, therefore, investigates the implications of a depressed economy for the classroom learning of Nigerian children, as Nigeria is currently experiencing its worst economic depression in several decades, which affects a substantial proportion of children. The study is qualitative and adopts a phenomenological approach in which the experiences of respondents are examined. Three senatorial districts in Oyo State were considered, and 50 teachers, both male and female, were chosen from each senatorial district for conversational key informant interviews. The interviews were recorded, transcribed, and presented using thematic analysis. Findings showed that more children have dropped out since the beginning of the year than in previous years. It was also recorded that learning has become challenging as children now find it harder to acquire learning materials. It was recommended that the government should reimburse early childhood schools to lessen the effect of the inability to purchase materials and pay school fees. It was also recommended that an intervention be made to approach and resolve issues associated with out-of-school children.

Keywords: childhood, classroom, education, depressed economy, poverty

Procedia PDF Downloads 73
82 Corpus-Based Analysis on the Translatability of Conceptual Vagueness in Traditional Chinese Medicine Classics Huang Di Nei Jing

Authors: Yan Yue

Abstract:

Huang Di Nei Jing (HDNJ) is one of the significant traditional Chinese medicine (TCM) classics and lays the foundation of TCM theory and practice. It is an important work for the world to study the ancient civilization and medical history of China. The language of HDNJ is highly concise and vague, and notably challenging to translate. This paper investigates the translatability of one particular kind of vagueness in HDNJ: the conceptual vagueness that carries Chinese philosophical and cultural connotations. The corpus tool Sketch Engine is used to provide potential online contexts and word behaviors. Two selected English translations of HDNJ, one by a TCM practitioner and one by a non-practitioner, are used to examine the frequency and distribution of linguistic features of the translations. It was found that the hypothesis about the universals of translated language (explicitation, normalisation) holds in one translation, but at the expense of some original contextual connotations. Transliteration is purposefully used in the second translation to retain the original flavor, which is argued to violate the principle of relevance in communication because it yields few contextual effects and demands more processing effort from the reader. The translatability of conceptual vagueness in HDNJ is constrained by the source language context and the reader’s cognitive environment.

Keywords: corpus-based translation, translatability, TCM classics, vague language

Procedia PDF Downloads 345
81 Evaluation and Compression of Different Language Transformer Models for Semantic Textual Similarity Binary Task Using Minority Language Resources

Authors: Ma. Gracia Corazon Cayanan, Kai Yuen Cheong, Li Sha

Abstract:

Training a language model for a minority language has been a challenging task. The lack of available corpora to train and fine-tune state-of-the-art language models is still a challenge in the area of Natural Language Processing (NLP). Moreover, the need for high computational resources and bulk data limits the attainment of this task. In this paper, we present the following contributions: (1) we introduce and use a translation pair set of Tagalog and English (TL-EN) in pre-training a language model for a minority-language resource; (2) we fine-tune and evaluate top-ranking pre-trained semantic textual similarity binary task (STSB) models on both TL-EN and STS dataset pairs; and (3) we reduce the size of the model to offset the need for high computational resources. Based on our results, the models that were pre-trained on translation pairs and STS pairs can perform well on the STSB task. Also, reducing the model to a smaller dimension has no negative effect on performance but rather yields a notable increase in the similarity scores. Moreover, pre-training on a similar dataset has a tremendous effect on the model’s performance scores.

Keywords: semantic matching, semantic textual similarity binary task, low resource minority language, fine-tuning, dimension reduction, transformer models

Procedia PDF Downloads 179
80 Statistical Comparison of Machine and Manual Translation: A Corpus-Based Study of Gone with the Wind

Authors: Yanmeng Liu

Abstract:

This article analyzes and compares the linguistic differences between machine translation and manual translation through a case study of the book Gone with the Wind. As an important carrier of human feeling and thinking, literary translation poses a huge difficulty for machine translation, which is expected to display translation features distinct from those of manual translation. In order to display linguistic features objectively, the study makes tentative use of computerized and statistical evidence, applying quantitative methods to the systematic investigation of large-scale translation corpora. It compiles a bilingual corpus with four Chinese translations of Gone with the Wind, namely Piao by Chunhai Fan, Piao by Huairen Huang, and the translations produced by Google Translation and Baidu Translation. After processing the corpus with software including Stanford Segmenter, Stanford Postagger, and AntConc, the study analyzes the linguistic data and answers the following questions: 1. How does machine translation differ from manual translation linguistically? 2. Why do these deviances happen? This paper combines translation studies with corpus linguistics and makes divergent linguistic dimensions concrete in translated text analysis, in order to present linguistic deviances in manual and machine translation. Consequently, this study provides a more accurate and more fine-grained understanding of machine translation products, and it also proposes several suggestions for machine translation development in the future.

Keywords: corpus-based analysis, linguistic deviances, machine translation, statistical evidence

Procedia PDF Downloads 115
79 Engagement Resources Use by Expert and Novice EFL Academic Writers

Authors: Moharram Sharifi

Abstract:

The purpose of this study was to show how expert and novice writers take positions and stances in the Introductions of Research Articles (RAs) and Master of Arts theses (MATs). Engagement resources were therefore investigated in 30 Research Articles and 30 Master of Arts theses written by Iranian non-native speakers. A paired samples t-test showed that the mean occurrences of heteroglossic items in both RA and MAT Introductions were larger than those of monoglossic items, indicating the awareness of both groups of writers of the need to ‘engage’ alternative positions in Introduction sections. The results also revealed that expansive choices were preferred over contractive options in both corpora, implying that both groups of writers respect alternative voices cautiously by welcoming rather than closing down the possibility of different perspectives and stances. Furthermore, unlike novice academic writers, who used more Attribute features than Entertainment ones in their MAT Introductions, expert academic writers employed a balanced number of Entertainment and Attribute features in their RA Introductions. The balanced deployment of Entertainment and Attribute features in RA Introductions by expert writers might be characteristic of the writers’ demonstration of politeness, which is commonly accepted as an essential feature of academic writing discourse. Finally, qualitative analysis demonstrated that MAT writers, as novice academic writers, lacked appropriate evaluative stances and authorial voices toward propositions.
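For readers unfamiliar with the paired samples t-test mentioned in this abstract, the statistic can be computed with the Python standard library alone. The sketch below is illustrative only: the function name is hypothetical, and the abstract does not report the underlying counts, so no data from the study are reproduced here. For paired counts with differences d_i, it computes t = mean(d) / (s_d / sqrt(n)) with n - 1 degrees of freedom.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for two paired samples of equal length."""
    if len(sample_a) != len(sample_b) or len(sample_a) < 2:
        raise ValueError("need two paired samples with at least two pairs")
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1
```

In a setting like the one described, `sample_a` and `sample_b` might hold per-text counts of heteroglossic and monoglossic items. The resulting t value would then be compared against a t distribution with the returned degrees of freedom.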

Keywords: novice, expert, engagement, RA Introductions, MA Thesis

Procedia PDF Downloads 15
78 Urban Metis Women’s Identity and Experiences with Health Services in Toronto, Ontario

Authors: Renee Monchalin

Abstract:

Métis peoples, while comprising over a third of the total Indigenous population in Canada, experience major gaps in health services that accommodate their cultural identities. This is problematic given that Métis peoples experience severe disparities in health determinants and outcomes compared to the non-Indigenous Canadian population. At the same time, Métis are unlikely to engage in health services that do not value their cultural identities, often utilizing mainstream options. Given these contexts, this research aims to fill the culturally safe health care gap for Métis peoples in Canada. It does this by engaging 56 urban Métis women who participated in a longitudinal cohort study, Our Health Counts (OHC) Toronto. Traditionally, Métis women were central to the health and well-being of their communities. However, due to decades of colonial legislation and forced land displacement, female narratives have been silenced, and Métis identities have been fractured. This has had direct implications for Métis people’s current health and access to health services. Solutions to filling the Métis health service gap may lie in the all too often unacknowledged or missing voices of Métis women. Through a conversational method, this research will explore urban Métis women’s perspectives on identity and their experiences with health services in Toronto. The goal of this research is to learn from urban Métis women about steps towards filling the health service gap. This research is currently in the data collection stage. Preliminary findings from the conversations will be disseminated. Policy recommendations will be provided to help health service providers better accommodate Métis people.

Keywords: indigenous health, Metis health, urban, health service access, identity

Procedia PDF Downloads 189