Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 72

Search results for: lexicon

72 Lexicon-Based Sentiment Analysis for Stock Movement Prediction

Authors: Zane Turner, Kevin Labille, Susan Gauch

Abstract:

Sentiment analysis is a broad and expanding field that aims to extract and classify opinions from textual data. Lexicon-based approaches are based on the use of a sentiment lexicon, i.e., a list of words each mapped to a sentiment score, to rate the sentiment of a text chunk. Our work focuses on predicting stock price change using a sentiment lexicon built from financial conference call logs. We introduce a method to generate a sentiment lexicon based upon an existing probabilistic approach. By using a domain-specific lexicon, we outperform traditional techniques and demonstrate that domain-specific sentiment lexicons provide higher accuracy than generic sentiment lexicons when predicting stock price change.
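The lexicon-based scoring described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' method: the lexicon entries, their scores, the averaging rule, and the names `TOY_LEXICON`, `score_text`, and `predict_direction` are all assumptions made for the example.

```python
import re

# Hypothetical toy lexicon: word -> sentiment score (illustrative values only,
# not taken from the paper's conference-call lexicon).
TOY_LEXICON = {"growth": 1.0, "strong": 0.5, "loss": -1.0, "decline": -0.5}


def score_text(text, lexicon):
    """Return the mean score of the words in `text` that appear in `lexicon`.

    Texts containing no lexicon words score 0.0.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[token] for token in tokens if token in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def predict_direction(text, lexicon):
    """Map the sign of the text score to a coarse price-movement label."""
    score = score_text(text, lexicon)
    if score > 0:
        return "up"
    if score < 0:
        return "down"
    return "flat"


if __name__ == "__main__":
    print(predict_direction("Strong growth despite a small loss", TOY_LEXICON))
```

A domain-specific lexicon would simply replace `TOY_LEXICON` with scores learned from financial text; the scoring step itself stays the same.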

Keywords: computational finance, sentiment analysis, sentiment lexicon, stock movement prediction

Procedia PDF Downloads 73
71 Lexicon-Based Sentiment Analysis for Stock Movement Prediction

Authors: Zane Turner, Kevin Labille, Susan Gauch

Abstract:

Sentiment analysis is a broad and expanding field that aims to extract and classify opinions from textual data. Lexicon-based approaches are based on the use of a sentiment lexicon, i.e., a list of words each mapped to a sentiment score, to rate the sentiment of a text chunk. Our work focuses on predicting stock price change using a sentiment lexicon built from financial conference call logs. We present a method to generate a sentiment lexicon based upon an existing probabilistic approach. By using a domain-specific lexicon, we outperform traditional techniques and demonstrate that domain-specific sentiment lexicons provide higher accuracy than generic sentiment lexicons when predicting stock price change.

Keywords: computational finance, sentiment analysis, sentiment lexicon, stock movement prediction

Procedia PDF Downloads 49
70 Expressivity of Word-Formation in English and Russian Advertising Lexicon

Authors: Voronina Ekaterina Borisovna

Abstract:

The problem of expressivity of advertising lexicon is studied in the article. The comparison of English and Russian advertising lexicons is done. The objects of the analysis were English and Russian advertising texts, both printed advertising texts and texts extracted from the commercials. Some conclusions concerning the expressivity of advertising lexicon were made. Expressivity can be included in the semantic structure of words or created by word-formation means. Expressivity caused by morphological derivatives includes such facilities as derivational affixes, models and types of word formation.

Keywords: advertising lexicon, expressivity, word-formation means, linguistics

Procedia PDF Downloads 277
69 Arabic Lexicon Learning to Analyze Sentiment in Microblogs

Authors: Mahmoud B. Rokaya

Abstract:

The study of opinion mining and sentiment analysis includes the analysis of opinions, sentiments, evaluations, attitudes, and emotions. The rapid growth of social media, social networks, reviews, forum discussions, and microblogs such as Twitter has led to a parallel growth in the field of sentiment analysis. The field tries to develop effective tools that make it possible to capture the trends of people. There are two approaches in the field: lexicon-based and corpus-based methods. A lexicon-based method uses a sentiment lexicon, which includes sentiment words and phrases with assigned numeric scores. These scores reveal whether sentiment phrases are positive or negative, their intensity, and/or their emotional orientations. Creating lexicons manually is hard, which creates a need for adaptive automated methods for generating a lexicon. The proposed method generates dynamic lexicons based on the corpus and then classifies text using these lexicons. In the proposed method, different approaches are combined to generate lexicons from text. The proposed method classifies tweets into 5 classes instead of only positive or negative classes. The sentiment classification problem is formulated as an optimization problem whose goal is to find optimal sentiment lexicons. The solution was produced using mathematical programming approaches to find the best lexicon for classifying texts. A genetic algorithm was written to find the optimal lexicon. Then, a meta-level feature was extracted based on the optimal lexicon. The experiments were conducted on several datasets. On some of the datasets, the results, in terms of accuracy, recall, and F-measure, outperformed the state-of-the-art methods proposed in the literature. A better understanding of the Arabic language, the culture of Arab Twitter users, and the sentiment orientation of words in different contexts can be achieved based on the sentiment lexicons proposed by the algorithm.

Keywords: social media, Twitter sentiment, sentiment analysis, lexicon, genetic algorithm, evolutionary computation

Procedia PDF Downloads 106
68 Variety and the Distribution of the Java Language Lexicon “Sleeping” in Jombang District East Java: Study of Geographic Dialectology

Authors: Krismonika Khoirunnisa

Abstract:

This research article aims to describe the variation of the Javanese lexicon "Sleep" and its distribution in the Jombang area, East Java. The objectives of this study were (1) to classify the variation of the "Sleep" lexicon in the Jombang area and (2) to design the fish rips for the variation of the "Sleep" lexicon according to their distribution. This research is a qualitative descriptive study using the method of leading proficiency, namely conducting interviews with speakers without meeting them directly (interviews via WhatsApp and email as the media). The study uses recording techniques as support and as tools for mapping and classifying data. Data were collected at four points, namely Kaliwungu village (Jombang City), Banjardowo village (District of Jombang), Mayangan village (Subdistrict Jogoroto), and Karobelah village (Subdistrict Mojoagung), which served as the investigators' target sites for the interviews. This study uses dialectology theory as a basis for analyzing the data obtained. The results show that the Javanese language variation "Sleep" has many different linguals, meanings, and forms even within the same area (Jombang).

Keywords: geographical dialectology, lexicon variations, Jombangan dialect, Javanese language

Procedia PDF Downloads 118
67 A Proposed Approach for Emotion Lexicon Enrichment

Authors: Amr Mansour Mohsen, Hesham Ahmed Hassan, Amira M. Idrees

Abstract:

Document analysis is an important research field that aims to gather information by analyzing the data in documents. Since an important goal in many fields is to understand what people actually want, sentiment analysis has become one of the vital fields closely related to document analysis. This research focuses on analyzing text documents to classify each document according to its opinion. The aim of this research is to detect emotions in text documents by enriching the lexicon, adapting its content through the extraction of semantic patterns. The proposed approach is presented, and experiments conducted from different perspectives reveal its positive impact on the classification results.

Keywords: document analysis, sentimental analysis, emotion detection, WEKA tool, NRC lexicon

Procedia PDF Downloads 341
66 Descriptive Analysis of Variations in Maguindanaon Language

Authors: Fhajema Kunso

Abstract:

People who live in the same region and who seem to speak the same language still vary in some aspects of their language. The variation may occur in pronunciation, lexicon, morphology, and syntax. This qualitative study described the phonological, morphological, and lexical variations of the Maguindanaon language among the ten Maguindanao municipalities. Purposive sampling, in-depth interviews, and focus group discussion were employed, and in data analysis words were sorted and classified according to their phonological, morphological, and lexical structures. The variations occurred through phonemic changes and other phonological and morphological processes. Phonological processes consisted of vowel lengthening and deletion, while morphological processes included affixation, borrowing, and coinage. In the phonological variation, phonemic changes were observed from one dialect to another. For example, there was a change of the phoneme /r/ to /l/. The phoneme /r/ was most likely to occur in Kabuntalan, as in /biru/, /kurIt/, and /kɘmɅr/, whereas in the rest of the dialects these were /bilu/, /kulIt/, and /kɘmɅl/ respectively. Morphologically, affixation was the main way to mark tense. For example, the root sarig (expect), when im is inserted, becomes simarig, i.e. s + im + arig = simarig (expected). Lexical variation also existed in the Maguindanaon language. Results revealed that the variations in phonology, morphology, and lexicon were associated primarily with geographic distribution.

Keywords: applied linguistics, language, lexicon, Maguindanao, morphology, Philippines, phonology, processes, qualitative, variation

Procedia PDF Downloads 198
65 Receptive Vocabulary Development in Adolescents and Adults with Down Syndrome

Authors: Esther Moraleda Sepúlveda, Soraya Delgado Matute, Paula Salido Escudero, Raquel Mimoso García, M Cristina Alcón Lancho

Abstract:

Although there is some consensus when it comes to establishing the lexicon as one of the strengths of language in people with Down Syndrome (DS), little is known about its evolution throughout development and changes based on age. The objective of this study was to find out if there are differences in receptive vocabulary between adolescence and adulthood. In this research, 30 people with DS between 11 and 40 years old, divided into two age ranges (11-18; 19 - 30) and matched in mental age, were evaluated through the Peabody Vocabulary Test. The results show significant differences between both groups in favor of the group with the oldest chronological age and a direct correlation between chronological age and receptive vocabulary development, regardless of mental age. These data support the natural evolution of the passive lexicon in people with DS.

Keywords: down syndrome, language, receptive vocabulary, adolescents, adults

Procedia PDF Downloads 118
64 The EFL Mental Lexicon: Connectivity and the Acquisition of Lexical Knowledge Depth

Authors: Khalid Soussi

Abstract:

The study at hand has attempted to describe the acquisition of three EFL lexical knowledge aspects - meaning, synonymy and collocation – across three academic levels: Baccalaureate, second year and fourth year university levels in Morocco. The research also compares the development of the three lexical knowledge aspects between knowledge (reception) and use (production) and attempts to trace their order of acquisition. This has led to the use of three main data collection tasks: translation, acceptability judgment and multiple choices. The study has revealed the following findings. First, L1 and EFL mental lexicons are connected at the lexical knowledge depth. Second, such connection is active whether in language reception or use. Third, the connectivity between L1 and EFL mental lexicons tends to relatively decrease as the academic level of the learners increases. Finally, the research has revealed a significant 'order' of acquisition between the three lexical aspects, though not a very strong one.

Keywords: vocabulary acquisition, EFL lexical knowledge, mental lexicon, vocabulary knowledge depth

Procedia PDF Downloads 217
63 Unsupervised Sentiment Analysis for Indonesian Political Message on Twitter

Authors: Omar Abdillah, Mirna Adriani

Abstract:

In this work, we present a new approach for analyzing public sentiment towards the presidential candidates in the 2014 Indonesian election as expressed on Twitter. We propose a procedure for analyzing sentiment in Indonesian political messages based on understanding how Indonesian society behaves when sending messages on Twitter. Our approach differs from previous work in that it uses punctuation marks and an Indonesian sentiment lexicon, complemented by a new procedure for determining sentiment towards the candidates. Our experiment yields an average precision of up to 83.31%. In brief, this work makes two contributions. First, it is a preliminary study of sentiment analysis in the domain of political messages, which has not been addressed before. Second, we propose a method for conducting sentiment analysis through a decision-making procedure that is in line with the characteristics of Indonesian messages on Twitter.

Keywords: unsupervised sentiment analysis, political message, lexicon based, user behavior understanding

Procedia PDF Downloads 397
62 BiLex-Kids: A Bilingual Word Database for Children 5-13 Years Old

Authors: Aris R. Terzopoulos, Georgia Z. Niolaki, Lynne G. Duncan, Mark A. J. Wilson, Antonios Kyparissiadis, Jackie Masterson

Abstract:

As word databases for bilingual children are not available, researchers, educators and textbook writers must rely on monolingual databases. The aim of this study is thus to develop a bilingual word database, BiLex-kids, an online open access developmental word database for 5-13 year old bilingual children who learn Greek as a second language and have English as their dominant one. BiLex-kids is compiled from 120 Greek textbooks used in Greek-English bilingual education in the UK, USA and Australia, and provides word translations in the two languages, pronunciations in Greek, and psycholinguistic variables (e.g. Zipf, Frequency per million, Dispersion, Contextual Diversity, Neighbourhood size). After clearing the textbooks of non-relevant items (e.g. punctuation), algorithms were applied to extract the psycholinguistic indices for all words. As well as one total lexicon, the database produces values for all ages (one lexicon for each age) and for three age bands (one lexicon per age band: 5-8, 9-11, 12-13 years). BiLex-kids provides researchers with accurate figures for a wide range of psycholinguistic variables, making it a useful and reliable research tool for selecting stimuli to examine lexical processing among bilingual children. In addition, it offers children the opportunity to study word spelling, learn translations and listen to pronunciations in their second language. It further benefits educators in selecting age-appropriate words for teaching reading and spelling, while special educational needs teachers will have a resource to control the content of word lists when designing interventions for bilinguals with literacy difficulties.
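Two of the psycholinguistic indices named in this abstract can be illustrated with a short sketch. It assumes the commonly used definitions (frequency per million as raw count scaled to a million tokens, Zipf as log10 of frequency per million plus 3, and contextual diversity as the number of texts containing the word) and omits any smoothing; the function names and the tiny sample corpus are hypothetical and not taken from BiLex-kids.

```python
import math
from collections import Counter


def frequency_per_million(count, corpus_size):
    """Scale a raw count to occurrences per one million tokens."""
    return count * 1_000_000 / corpus_size


def zipf_value(count, corpus_size):
    """Zipf value: log10 of frequency per million, plus 3 (no smoothing applied)."""
    return math.log10(frequency_per_million(count, corpus_size)) + 3


def word_indices(textbooks):
    """Compute per-word indices from a list of tokenised texts (lists of words)."""
    counts = Counter(token for book in textbooks for token in book)
    corpus_size = sum(counts.values())
    # Contextual diversity here: number of texts in which the word occurs.
    diversity = Counter(token for book in textbooks for token in set(book))
    return {
        word: {
            "frequency_per_million": frequency_per_million(n, corpus_size),
            "zipf": zipf_value(n, corpus_size),
            "contextual_diversity": diversity[word],
        }
        for word, n in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical two-text corpus, already stripped of punctuation.
    sample = [["a", "b", "a"], ["a", "c"]]
    for word, indices in sorted(word_indices(sample).items()):
        print(word, indices)
```

Separate lexicons per age or age band, as the abstract describes, would amount to calling `word_indices` on the subset of textbooks for each group.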

Keywords: bilingual children, psycholinguistics, vocabulary development, word databases

Procedia PDF Downloads 232
61 Anglicisms in the Magazine Glamour France: The Influence of English on the French Language of Fashion

Authors: Vivian Orsi

Abstract:

In this research, we aim to investigate the lexicon of women's magazines, with special attention to fashion, whose universe is very receptive to lexical borrowings, especially those from English, called Anglicisms. Thus, we intend to discuss the presence of English items and expressions in the online French women's magazine Glamour France, collected over six months. Highlighting the quantitative aspects of the use of English in that publication, we can affirm that the use of those lexical borrowings seems to represent sophistication to attract readers and identification with other cultures, establishing communication and intensifying the language of fashion. The potential for creativity in the fashion lexicon is made possible by its permeability to social and linguistic phenomena across all social classes that allow constant manipulation of genuine borrowings. Besides, it seems to assume the value of a prerequisite for participating in the fashion centers of the world. The use of Anglicisms in Glamour France is not limited to designating concepts and fashionable items that have no equivalent in French; it also acts as a kind of seduction tool, which uses the symbolic capital of English as the global language of communication.

Keywords: Anglicisms, lexicology, borrowings, fashion language

Procedia PDF Downloads 204
60 Metaphors Underlying Idiomatic Expressions in Trilingual Perspective: Contributions to the Teaching of Lexicon and to Materials Development

Authors: Marilei Amadeu Sabino

Abstract:

Idiomatic expressions are linguistic phraseologisms present in natural languages. Known to be metaphorical linguistic combinations, a good majority of them provide elements that reveal important cultural aspects of their linguistic community through their metaphors. With the advent of Cognitive Linguistics (more specifically of Cognitive Semantics), the metaphor ceased to be related to poetic language and rhetorical embellishment and came to be seen as part of simple everyday language, reflecting the way human beings think, act and conceive reality, i. e., a fundamental mechanism of human conceptualizations of the world. In this sense, it came to be conceived as an inevitable mechanism for representing the nature of thought and language. The speakers, in conceptualizing reality, often use metaphorically parts of the body in expressions known as somatic. Several conceptual metaphors appear to be potentially universal or near-universal, because people across the world share certain bodily experiences. In these terms, many linguistic metaphors may be identical or very similar in several languages. These similarities, according to the Theory of Conceptual Metaphor, derive from universal aspects of the human body. Thus, this research aims to investigate the nature of some metaphors underlying somatic idiomatic expressions of Portuguese, Italian and English languages, establishing a pattern of similarities and differences among them from a trilingual perspective. The analysis shows that much of the studied expressions are really structurally, semantically and metaphorically identical or similar in the three languages. These findings incite relevant discussions concerning mother and foreign language learning and aim to contribute to the teaching of phraseological Lexicon as well as to materials development in mono and multilingual perspectives.

Keywords: idiomatic expressions, materials development, metaphors, phraseological lexicon, teaching and learning

Procedia PDF Downloads 111
59 The Grammatical Dictionary Compiler: A System for Kartvelian Languages

Authors: Liana Lortkipanidze, Nino Amirezashvili, Nino Javashvili

Abstract:

The purpose of the grammatical dictionary is to provide information on the morphological and syntactic characteristics of the basic word in the dictionary entry. Electronic grammatical dictionaries are used as a tool for automated morphological analysis in text processing. The Georgian Grammatical Dictionary should contain grammatical information for each word: part of speech, type of declension/conjugation, grammatical forms of the word (paradigm), and alternative variants of the basic word/lemma. In this paper, we present a system for compiling the Georgian Grammatical Dictionary automatically. We propose dictionary-based methods for extending grammatical lexicons. The input lexicon contains only a small number of words with identical grammatical features. The extension is based on similarity measures between features of words; more precisely, we add to the extended lexicons words that are similar to those already in the grammatical dictionary. Our dictionaries are corpus-based, and for the compiling, we introduce a method for the lemmatization of unknown words, i.e., words of which neither the full form nor the lemma is in the grammatical dictionary.

Keywords: acquisition of lexicon, Georgian grammatical dictionary, lemmatization rules, morphological processor

Procedia PDF Downloads 69
58 Automatic Lexicon Generation for Domain Specific Dataset for Mining Public Opinion on China Pakistan Economic Corridor

Authors: Tayyaba Azim, Bibi Amina

Abstract:

The increasing popularity of opinion mining, together with the rapid growth in the availability of social networks, has created many opportunities for research in the various domains of Sentiment Analysis and Natural Language Processing (NLP) using Artificial Intelligence approaches. The latest trend allows the public to actively use the internet to analyze individuals' opinions and explore the effectiveness of published facts. The main theme of this research is to gauge public opinion on one of the most crucial and extensively discussed development projects, the China Pakistan Economic Corridor (CPEC), considered a game changer due to its promise of bringing economic prosperity to the region. So far, to the best of our knowledge, the theme of CPEC has not been analyzed for sentiment determination through a machine learning (ML) approach. This research aims to demonstrate the use of ML approaches to automatically analyze public sentiment in tweets about CPEC. A Support Vector Machine (SVM) is used to classify tweets into positive, negative, and neutral classes. Word2vec and TF-IDF features are used with the SVM model, and the models trained on manually labelled tweets and on the automatically generated lexicon are compared. The contributions of this work are: the development of a sentiment analysis system for public tweets on the subject of CPEC; the automatic generation of a lexicon from public tweets on CPEC; and the identification of different themes among the tweets, with a sentiment assigned to each theme. It is worth noting that applications of web mining that empower e-democracy by improving political transparency and public participation in decision making via social media have not yet been explored or practised in Pakistan with regard to CPEC.

Keywords: machine learning, natural language processing, sentiment analysis, support vector machine, Word2vec

Procedia PDF Downloads 81
57 Investigating the Associative Network of Color Terms among Turkish University Students: A Cognitive-Based Study

Authors: R. Güçlü, E. Küçüksakarya

Abstract:

Word association (WA) gives the broadest information on how knowledge is structured in the human mind. Cognitive linguistics, psycholinguistics, and applied linguistics are the disciplines that consider WA tests as substantial in gaining insights into the very nature of the human cognitive system and semantic knowledge. In this study, Berlin and Kay’s basic 11 color terms (1969) are presented as the stimuli words to a total number of 300 Turkish university students. The responses are analyzed according to Fitzpatrick’s model (2007), including four categories, namely meaning-based responses, position-based responses, form-based responses, and erratic responses. In line with the findings, the responses to free association tests are expected to give much information about Turkish university students’ psychological structuring of vocabulary, especially morpho-syntactic and semantic relationships among words. To conclude, theoretical and practical implications are discussed to make an in-depth evaluation of how associations of basic color terms are represented in the mental lexicon of Turkish university students.

Keywords: color term, gender, mental lexicon, word association task

Procedia PDF Downloads 53
56 Sociolinguistic Aspects and Language Contact, Lexical Consequences in Francoprovençal Settings

Authors: Carmela Perta

Abstract:

In Italy the coexistence of standard language, its varieties and different minority languages - historical and migration languages - has been a way to study language contact in different directions; the focus of most of the studies is either the relations among the languages of the social repertoire, or the study of contact phenomena occurring in a particular structural level. However, studies on contact facts in relation to a given sociolinguistic situation of the speech community are still not present in literature. As regard the language level to investigate from the perspective of contact, it is commonly claimed that the lexicon is the most volatile part of language and most likely to undergo change due to superstrate influence, indeed first lexical features are borrowed, then, under long term cultural pressure, structural features may also be borrowed. The aim of this paper is to analyse language contact in two historical minority communities where Francoprovençal is spoken, in relation to their sociolinguistic situation. In this perspective, firstly lexical borrowings present in speakers’ speech production will be examined, trying to find a possible correlation between this part of the lexicon and informants’ sociolinguistic variables; secondly a possible correlation between a particular community sociolinguistic situation and lexical borrowing will be found. Methods used to collect data are based on the results obtained from 24 speakers in both the villages; the speaker group in the two communities consisted of 3 males and 3 females in each of four age groups, ranging in age from 9 to 85, and then divided into five groups according to their occupations. Speakers were asked to describe a sequence of pictures naming common objects and then describing scenes when they used these objects: they are common objects, frequently pronounced and belonging to semantic areas which are usually resistant and which are thought to survive. 
A subset of this task, involving 19 items of Italian origin, is examined here. To determine the significance of the independent variables (social factors) for the dependent variable (lexical variation), linear regression in the statistical package SPSS was used.

Keywords: borrowing, Francoprovençal, language change, lexicon

Procedia PDF Downloads 305
55 Corpus-Based Description of Core English Nouns of Pakistani English, an EFL Learner Perspective at Secondary Level

Authors: Abrar Hussain Qureshi

Abstract:

Vocabulary has been highlighted as a key indicator in any foreign language learning program, especially English as a foreign language (EFL). It is often considered a potential tool in the foreign language curriculum, and its deficiency impedes successful communication in the target language. Knowledge of the lexicon is very significant in attaining communicative competence and performance. Nouns constitute a considerable bulk of English vocabulary. Indeed, they are the bones of the English language and the main semantic carriers in spoken and written discourse. As nouns dominate the bulk of the English lexicon, their role becomes all the more important. The present research is a systematic effort to compile a list of highly frequent Pakistani English nouns for EFL learners at the secondary level. It will encourage autonomy among EFL learners and save them time. The corpus used for the research was developed locally from leading English newspapers of Pakistan. WordSmith Tools was used to process the research data and to retrieve a word list of frequent Pakistani English nouns. The retrieved list of core Pakistani English nouns is expected to be useful for English language learners at the secondary level, as it covers a wide range of speech events.

Keywords: corpus, EFL, frequency list, nouns

Procedia PDF Downloads 16
54 Cerrado and Vereda: A Survey of Portuguese Lexicon for Brazilian Biomes

Authors: Daniel Marra

Abstract:

This paper analyses, from a semantic-diachronic viewpoint, the changes of meaning that two lexical items of Brazilian Portuguese have gone through. Cerrado and Vereda currently designate the second largest Brazilian biome and one of its most important subsystems. Nevertheless, these two words have long individual histories that can be traced back to their Latin etymons. Therefore, the purpose of this work is to highlight the process by which meaning instantiated itself in these words' formation and to discuss how semantic change subsequently took hold in them. As this paper shows, the aforementioned words were created in different past synchronies and have undergone changes of meaning through metaphor and metonymy. Besides, it is argued here that semantic change takes place due to external causes, such as generalization and specialization of meaning. It happens when a specialized use of a lexical item, restricted to a particular linguistic group, is adopted by other groups and has its meaning generalized by them. In these processes, the etymological idea of the word is generally lost, and the word gains, in the new group, a less specific meaning in relation to its etymology, sometimes with no relation to the original idea. As a final point, it is claimed that both the creation of a lexical item and its change of meaning involve pragmatic goals, such as the need of language users to express a new meaning related to a certain reality in the empirical world.

Keywords: Brazilian biomes, metaphor and metonymy, Portuguese lexicon, semantic change

Procedia PDF Downloads 49
53 The Presence of Anglicisms in Italian Fashion Magazines and Fashion Blogs

Authors: Vivian Orsi

Abstract:

The present research investigates the lexicon of a fashion magazine, whose universe is very receptive to lexical loans, especially those from English, called Anglicisms. Specifically, we intend to discuss the presence of English items and expressions in the Vogue Italia fashion magazine. Besides, we aim to study the anglicisms used in an Italian fashion blog called The Blonde Salad. Within the discussion of fashion blogs and their contributions to scientific studies, we adopt the theories of Lexicology / Lexicography to define Anglicism (BIDERMAN, 2001), and the observation of its prestige in the Italian Language (ROGATO, 2008; BISETTO, 2003). According to the theoretical basis mentioned, we intend to make a brief analysis of the Anglicisms collected from posts of the first year of existence of such fashion blog, emphasizing also the keywords that have the role to encapsulate the content of the text, allowing the reader to retrieve information from the post of the blog. About the use of English in Italian magazines and blogs, we can affirm that it seems to represent sophistication, assuming the value of prerequisite to participate in the fashion centers of the world. Besides, we believe, as Barthes says (1990, p. 215), that “Fashion does not evolve, it changes: its lexicon is new each year, like that of a language which always keeps the same system but suddenly and regularly ‘changes’ the currency of its words”. Fashion is a mode of communication: it is present in man's interaction with the world, which means that such lexical universe is represented according to the particularities of each culture.

Keywords: anglicism, lexicology, magazines, blogs, fashion

Procedia PDF Downloads 256
52 Cross-Language Variation and the ‘Fused’ Zone in Bilingual Mental Lexicon: An Experimental Research

Authors: Yuliya E. Leshchenko, Tatyana S. Ostapenko

Abstract:

Language variation is a widespread linguistic phenomenon which can affect different levels of a language system: phonological, morphological, lexical, syntactic, etc. It is obvious that the scope of possible standard alternations within a particular language is limited by a variety of its norms and regulations, which set more or less clear boundaries for what is possible and what is not possible for the speakers. The possibility of lexical variation (alternate usage of lexical items within the same contexts) is based on the fact that the meanings of words are not clearly and rigidly defined in the consciousness of the speakers. Therefore, lexical variation is usually connected with an unstable relationship between words and their referents: a case when a particular lexical item refers to different types of referents, or when a particular referent can be named by various lexical items. We assume that the scope of lexical variation in bilingual speech is generally wider than that observed in monolingual speech because, besides ‘lexical item – referent’ relations, it involves the possibility of cross-language variation of L1 and L2 lexical items. We use the term ‘cross-language variation’ to denote a case when two equivalent words of different languages are treated by a bilingual speaker as freely interchangeable within a common linguistic context. As distinct from code-switching, which is traditionally defined as the conscious use of more than one language within one communicative act, in the case of cross-language lexical variation the speaker does not perceive the alternate lexical items as belonging to different languages and, therefore, does not realize the change of language code. In the paper, the authors present research on lexical variation in adult Komi-Permyak – Russian bilingual speakers.
The two languages co-exist on the territory of the Komi-Permyak District in Russia (Komi-Permyak as the ethnic language and Russian as the official state language), are usually acquired from birth in a natural linguistic environment and, according to the data of sociolinguistic surveys, are both identified by the speakers as coordinate mother tongues. The experimental research demonstrated that alternation of Komi-Permyak and Russian words within one utterance/phrase is highly frequent both in speech perception and production. Moreover, our participants estimated cross-language word combinations like ‘маленькая /Russian/ нывка /Komi-Permyak/’ (‘a little girl’) or ‘мунны /Komi-Permyak/ домой /Russian/’ (‘go home’) as regular/habitual, containing no violation of any linguistic rules and being equally possible in speech as the equivalent intra-language word combinations (‘учöтик нывка’ /Komi-Permyak/ or ‘идти домой’ /Russian/). All the facts considered, we claim that constant concurrent use of the two languages leads speakers to intuitively interpret a large number of their words as lexical variants not only related to the same referent, but also referring to both languages or, more precisely, to none of them in particular. Consequently, we can suppose that the bilingual mental lexicon includes an extensive ‘fused’ zone of lexical representations that provide the basis for cross-language variation in bilingual speech.

Keywords: bilingualism, bilingual mental lexicon, code-switching, lexical variation

Procedia PDF Downloads 71
51 Religion and Politeness: An Exploratory Study for the Integration of Religious Expressions with Politeness Strategies in Iraqi Computer-Mediated Communication

Authors: Rasha Alsabbah

Abstract:

This study explores the relationship between polite language use and religion in Iraqi culture in computer-mediated communication. It tackles the speech acts where these expressions are employed, the frequency of their occurrence, and the aims behind them. It also investigates whether they have equivalent expressions in English and the possibility of translating them in intercultural communication. Despite the wide assumption that language is a reflection of culture and religion, the topic has attracted the attention of sociologists only during the last 40 years, as scholars have questioned the possible interconnection between religion and language, in which religion is used as a means of producing language and performing pragmatic functions. It is presumed that Arabs in general, and Iraqis in particular, have an inclination to use religious vocabulary in showing politeness in their greetings and other speech acts. Due to the influence of Islamic religion and culture, it is observed that Iraqis are very much concerned with maintaining social solidarity and harmonious relationships, which makes religion a politeness strategy that operates as the key point of their social behaviours. In addition, religion has been found to influence almost all their interactions, in which they have a tendency to invoke religious expressions, the lexicon of Allah (God), and Qur’anic verses in their daily politeness discourse. This aspect of Islamic culture may look strange, especially to people who come from individualist societies, such as England. Data collection in this study is based on messaging applications like Viber, WhatsApp, and Facebook. After gaining the approval of the participants, the different aims behind these expressions and the pragmatic functions that they perform were investigated. It is found that Iraqis tend to incorporate the lexicon of Allah in most of their communication.
Such expressions are employed not only by religious people but also by individuals who do not show strong commitment to religion. Furthermore, the social distance and social power between people do not play a significant role in increasing or reducing the rate of using these expressions. A number of these expressions, though they can be translated into English, neither have a one-to-one counterpart nor reflect religious feeling. In addition, they might sound odd when translated or transliterated in oral and written intercultural communication.

Keywords: computer mediated communication (CMC), intercultural communication, politeness, religion, situation bound utterances rituals, speech acts

Procedia PDF Downloads 329
50 Saudi Twitter Corpus for Sentiment Analysis

Authors: Adel Assiri, Ahmed Emam, Hmood Al-Dossari

Abstract:

Sentiment analysis (SA) has received growing attention in Arabic language research. However, few studies have directly applied SA to Arabic due to the lack of a publicly available dataset for this language. This paper partially bridges this gap by focusing on one of the Arabic dialects, the Saudi dialect. This paper presents an annotated data set of 4700 for Saudi dialect sentiment analysis with (K = 0.807). Our next step is to extend this corpus and create a large-scale lexicon for the Saudi dialect from the corpus.
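The abstract does not say which statistic K denotes. Assuming it is a kappa-style inter-annotator agreement score, the sketch below shows how Cohen's kappa for two annotators could be computed. The function name `cohen_kappa` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: two annotators and nominal labels; the paper's actual
    annotation setup is not described in the abstract.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical toy annotations for four tweets.
annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is often reported for annotated sentiment corpora instead of plain percentage agreement.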

Keywords: Arabic, sentiment analysis, Twitter, annotation

Procedia PDF Downloads 450
49 Fine-Grained Sentiment Analysis: Recent Progress

Authors: Jie Liu, Xudong Luo, Pingping Lin, Yifan Fan

Abstract:

Facebook, Twitter, Weibo, and other social media and major e-commerce sites generate a massive amount of online text, which can be used to analyse people’s opinions or sentiments for better decision-making. Sentiment analysis, especially fine-grained sentiment analysis, is therefore a very active research topic. In this paper, we survey various methods for fine-grained sentiment analysis, including traditional sentiment lexicon-based methods, machine learning-based methods, and deep learning-based methods in aspect/target/attribute-based sentiment analysis tasks. We also discuss their advantages and the open problems that merit careful study in the future.

Keywords: sentiment analysis, fine-grained, machine learning, deep learning

Procedia PDF Downloads 127
48 Perception and Control in the Age of Surrealism: A Critical History and a Survey of Pita Amor’s Poetic Ontology

Authors: Oliver Arana

Abstract:

Within the common vein of social understanding, surrealism is often understood to rely on disconcerting images and fragmented collage, both in its visual representation and literary manifestations. By tracing the history and literature of surrealism, the author makes the argument that there were certain factions within Latin America that employed characteristics of surrealism in order to reach some sense of understanding, and not to further complicate or disorient, an aim that most closely aligns with Freudian psychoanalysis. Psychoanalysis should, however, be a comparable practice only to understand how Latin American surrealism had more of a concrete goal than its European counterpart. The primary subject of the paper is the Mexican poet Pita Amor, who has retroactively been associated with the movement; therefore, it should be duly noted that the term surrealism applies to her only as something that describes traits within the literary lexicon.

Keywords: Latin America, Pita Amor, poetry, surrealism

Procedia PDF Downloads 64
47 Issue Reorganization Using the Measure of Relevance

Authors: William Wong Xiu Shun, Yoonjin Hyun, Mingyu Kim, Seongi Choi, Namgyu Kim

Abstract:

Recently, the demand for extracting R&D keywords from issues and using them to retrieve R&D information has been increasing rapidly. However, it is hard to identify related issues or to distinguish between them. Although the similarity between issues cannot be identified directly, with an R&D lexicon, the issues that consistently share the same R&D keywords can be determined. In detail, the R&D keywords associated with a particular issue imply the key technology elements needed to solve the problem of that issue. Furthermore, related issues that share the same R&D keywords can be shown in a more systematic way through issue clustering constructed from the perspective of R&D. Thus, the sharing of R&D results and the reuse of R&D technology can be facilitated. Indirectly, redundant investment in the same R&D can be reduced, as R&D information can be shared between the corresponding issues and the reusability of related R&D can be improved. Therefore, a methodology for constructing an issue clustering from the perspective of common R&D keywords is proposed to satisfy the demands mentioned.
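The abstract does not specify how the issue clustering is built. As a rough illustration only, the sketch below groups issues whose keyword sets overlap strongly, using Jaccard similarity and a union-find merge. The function names, the threshold of 0.5, and the sample issues and keywords are all hypothetical and are not the authors' method or data.

```python
def jaccard(a, b):
    """Jaccard similarity of two keyword sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_issues(issue_keywords, threshold=0.5):
    """Group issues that share enough R&D keywords.

    issue_keywords: dict mapping an issue name to a set of keywords.
    Issues whose Jaccard similarity reaches the threshold are linked,
    and linked issues are merged transitively (single-link grouping).
    """
    issues = list(issue_keywords)
    parent = {issue: issue for issue in issues}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for idx, a in enumerate(issues):
        for b in issues[idx + 1:]:
            if jaccard(issue_keywords[a], issue_keywords[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for issue in issues:
        clusters.setdefault(find(issue), set()).add(issue)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical issues and keywords for illustration.
issue_keywords = {
    "air pollution": {"filter", "sensor", "catalyst"},
    "fine dust": {"filter", "sensor"},
    "data breach": {"encryption"},
}
clusters = cluster_issues(issue_keywords)
```

In this toy input, the first two issues share two of three keywords and end up in one group, while the third stays alone. A lower threshold would merge more loosely related issues.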

Keywords: clustering, social network analysis, text mining, topic analysis

Procedia PDF Downloads 494
46 Comparative between Different Methodological Procedures Used to Obtain Information on the First Lexical Development in Bilingual Basque-Spanish Children

Authors: Asier Romero Andonegi, Irati De Pablo Delgado

Abstract:

The objective of this study is to explore the different methodological procedures that are used to obtain information on the early linguistic development of children. To this end, two different methodological procedures were carried out on the same sample: on the one hand, the MacArthur-Bates Communicative Development Inventories, in its adaptations in Spanish and Basque; and on the other hand, longitudinal observation through professional software: ELAN and CHAT. The sample consists of 8 Basque children aged 16 to 30 months with different mother tongues (L1). The results show the usefulness of inventories in obtaining information on the development of early communication and language skills, but also their limitations, which mostly concern the interpretive overvaluation of the children’s lexical development.

Keywords: early language development, language evaluation, lexicon, MacArthur-Bates communicative development inventories

Procedia PDF Downloads 72
45 Documents Emotions Classification Model Based on TF-IDF Weighting Measure

Authors: Amr Mansour Mohsen, Hesham Ahmed Hassan, Amira M. Idrees

Abstract:

Emotion classification of text documents is applied to reveal whether a document expresses a given emotion of its writer. As different supervised methods have previously been used for classifying emotion in documents, in this research we present a novel model that supports the classification algorithms in producing more accurate results with the support of the TF-IDF measure. Different experiments have been conducted to reveal the applicability of the proposed model. The model succeeds in raising accuracy according to the chosen metrics (precision, recall, and F-measure) by refining the lexicon, integrating lexicons from different perspectives, and applying the TF-IDF weighting measure over the classification features. The proposed model has also been compared with other research to demonstrate its competence in raising the accuracy of the results.
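For readers unfamiliar with the weighting the abstract refers to, here is a minimal sketch of one common TF-IDF variant (relative term frequency times the natural log of inverse document frequency). The abstract does not state which variant the authors use, and the function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF weights for tokenized documents.

    docs: list of token lists.
    Returns one {term: weight} dict per document, where
    weight = (term count / document length) * ln(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy documents for illustration.
docs = [
    "i am so happy and glad today".split(),
    "i am sad and afraid today".split(),
    "happy happy joy".split(),
]
weights = tf_idf(docs)
```

Terms that occur in every document receive a weight of zero under this variant, so features that do not help separate emotion classes are down-weighted before classification.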

Keywords: emotion detection, TF-IDF, WEKA tool, classification algorithms

Procedia PDF Downloads 314
44 The Greek Diaspora in Australia: Identity and Transnational Identity

Authors: Panayiota Romios

Abstract:

As the use of 'diaspora' has proliferated in the last decade, its meaning has been stretched in various directions. Current diaspora frames of identity representation do not adequately capture the complexities of everyday lived experiences of transnational individuals and groups. This paper presents the findings of a qualitative research project conducted in Melbourne, Australia with second-generation Greek Australians. It analyses the forms of intercultural identities of second-generation Greek Australians returning to Australia post-2008, after living in Greece for an extended period of time. The discussion highlights key characteristics in relation to diaspora-homeland ties, seeking to denaturalise the commonplace assumptions and imaginations about the cultures and identities of Greek Australian diaspora communities and probe the relevance of identity markers such as country of origin, nationality, ethnicity, ethnic origin, language and mother tongue. The definition of diaspora experienced in this transnational lexicon is interestingly quite distinct from original articulations and also from others returning ‘home’.

Keywords: diaspora, identity, migration, displacement

Procedia PDF Downloads 261
43 Grammatically Coded Corpus of Spoken Lithuanian: Methodology and Development

Authors: L. Kamandulytė-Merfeldienė

Abstract:

The paper deals with the main methodological issues of the Corpus of Spoken Lithuanian, whose development began in 2006. At present, the corpus consists of 300,000 grammatically annotated word forms. The creation of the corpus consists of three main stages: collecting the data, transcribing the recorded data, and grammatical annotation. Data collection was based on the principles of balance and naturalness. The recorded speech was transcribed according to the CHAT requirements of CHILDES. The transcripts were double-checked and annotated grammatically using CHILDES. The development of the Corpus of Spoken Lithuanian has led to a constant increase in studies on spontaneous communication, and various papers have dealt with the distribution of parts of speech, the use of different grammatical forms, variation of inflectional paradigms, the distribution of fillers, syntactic functions of adjectives, and the mean length of utterances.

Keywords: CHILDES, corpus of spoken Lithuanian, grammatical annotation, grammatical disambiguation, lexicon, Lithuanian

Procedia PDF Downloads 170