Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 85

Search results for: NRC lexicon

85 Lexicon-Based Sentiment Analysis for Stock Movement Prediction

Authors: Zane Turner, Kevin Labille, Susan Gauch

Abstract:

Sentiment analysis is a broad and expanding field that aims to extract and classify opinions from textual data. Lexicon-based approaches are based on the use of a sentiment lexicon, i.e., a list of words each mapped to a sentiment score, to rate the sentiment of a text chunk. Our work focuses on predicting stock price change using a sentiment lexicon built from financial conference call logs. We present a method to generate a sentiment lexicon based upon an existing probabilistic approach. By using a domain-specific lexicon, we outperform traditional techniques and demonstrate that domain-specific sentiment lexicons provide higher accuracy than generic sentiment lexicons when predicting stock price change.
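The lexicon-based idea described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' method: the toy lexicon, its scores, the function names `score_text` and `predict_direction`, and the zero threshold are all assumptions made for the example.

```python
from typing import Dict

# Hypothetical toy lexicon: word -> sentiment score (values are invented).
TOY_LEXICON: Dict[str, float] = {
    "growth": 1.0,
    "strong": 0.8,
    "loss": -1.0,
    "weak": -0.7,
}


def score_text(text: str, lexicon: Dict[str, float]) -> float:
    """Average the lexicon scores of the words in `text` that the lexicon covers."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    scores = [lexicon[t] for t in tokens if t in lexicon]
    # Text with no lexicon words is treated as neutral (an assumption).
    return sum(scores) / len(scores) if scores else 0.0


def predict_direction(text: str, lexicon: Dict[str, float], threshold: float = 0.0) -> str:
    """Map the averaged sentiment score to a coarse price-movement label."""
    score = score_text(text, lexicon)
    if score > threshold:
        return "up"
    if score < -threshold:
        return "down"
    return "flat"


if __name__ == "__main__":
    sample = "Strong growth, despite a small loss."
    print(score_text(sample, TOY_LEXICON), predict_direction(sample, TOY_LEXICON))
```

A domain-specific lexicon would replace `TOY_LEXICON` with scores learned from the target corpus, which is where the abstract reports the accuracy gain.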

Keywords: computational finance, sentiment analysis, sentiment lexicon, stock movement prediction

Procedia PDF Downloads 90
84 Lexicon-Based Sentiment Analysis for Stock Movement Prediction

Authors: Zane Turner, Kevin Labille, Susan Gauch

Abstract:

Sentiment analysis is a broad and expanding field that aims to extract and classify opinions from textual data. Lexicon-based approaches are based on the use of a sentiment lexicon, i.e., a list of words each mapped to a sentiment score, to rate the sentiment of a text chunk. Our work focuses on predicting stock price change using a sentiment lexicon built from financial conference call logs. We introduce a method to generate a sentiment lexicon based upon an existing probabilistic approach. By using a domain-specific lexicon, we outperform traditional techniques and demonstrate that domain-specific sentiment lexicons provide higher accuracy than generic sentiment lexicons when predicting stock price change.

Keywords: computational finance, sentiment analysis, sentiment lexicon, stock movement prediction

Procedia PDF Downloads 133
83 Expressivity of Word-Formation in English and Russian Advertising Lexicon

Authors: Voronina Ekaterina Borisovna

Abstract:

This article studies the expressivity of the advertising lexicon and compares the English and Russian advertising lexicons. The objects of the analysis were English and Russian advertising texts, both printed advertising texts and texts extracted from commercials. Some conclusions concerning the expressivity of the advertising lexicon are drawn. Expressivity can be included in the semantic structure of words or created by word-formation means. Expressivity created by morphological derivation draws on derivational affixes and on models and types of word formation.

Keywords: advertising lexicon, expressivity, word-formation means, linguistics

Procedia PDF Downloads 313
82 Arabic Lexicon Learning to Analyze Sentiment in Microblogs

Authors: Mahmoud B. Rokaya

Abstract:

The study of opinion mining and sentiment analysis includes the analysis of opinions, sentiments, evaluations, attitudes, and emotions. The rapid growth of social media, social networks, reviews, forum discussions, microblogs, and Twitter has led to parallel growth in the field of sentiment analysis. The field tries to develop effective tools that capture trends in public opinion. There are two approaches in the field: lexicon-based and corpus-based methods. A lexicon-based method uses a sentiment lexicon, which includes sentiment words and phrases with assigned numeric scores. These scores reveal whether sentiment phrases are positive or negative, their intensity, and/or their emotional orientations. Creating lexicons manually is hard, which brings the need for adaptive automated methods for generating a lexicon. The proposed method generates dynamic lexicons based on the corpus and then classifies text using these lexicons. In the proposed method, different approaches are combined to generate lexicons from text. The proposed method classifies tweets into 5 classes instead of only positive and negative classes. The sentiment classification problem is written as an optimization problem in which finding optimal sentiment lexicons is the goal of the optimization process. The solution was produced based on mathematical programming approaches to find the best lexicon to classify texts. A genetic algorithm was written to find the optimal lexicon. Then, a meta-level feature was extracted based on the optimal lexicon. The experiments were conducted on several datasets. Results, in terms of accuracy, recall and F measure, outperformed the state-of-the-art methods proposed in the literature on some of the datasets. A better understanding of the Arabic language, the culture of Arab Twitter users, and the sentiment orientation of words in different contexts can be achieved based on the sentiment lexicons proposed by the algorithm.

Keywords: social media, Twitter sentiment, sentiment analysis, lexicon, genetic algorithm, evolutionary computation

Procedia PDF Downloads 139
81 Variety and the Distribution of the Java Language Lexicon “Sleeping” in Jombang District East Java: Study of Geographic Dialectology

Authors: Krismonika Khoirunnisa

Abstract:

This research article aims to describe the variation of the Javanese lexicon "Sleep" and its distribution in the Jombang area, East Java. The objectives of this study were (1) to classify the variation of the "Sleep" lexicon in the Jombang area and (2) to design the fish rips for the variation of the "Sleep" lexicon according to their distribution. This is a qualitative descriptive study using the method of leading proficiency, namely conducting interviews with speakers without meeting them directly (interviews via WhatsApp and email as the media). The study uses recording techniques as support and as tools for mapping and classifying data. Data were collected at four points where the interviews were conducted, namely Kaliwungu village (Jombang City), Banjardowo village (District of Jombang), Mayangan village (Subdistrict Jogoroto), and Karobelah village (Subdistrict Mojoagung). This study uses dialectology theory as a basis for analyzing the data obtained. The results show that the Javanese language variation "Sleep" has many different linguals, meanings, and forms even though they are in the same area (Jombang).

Keywords: geographical dialectology, lexicon variations, jombangan dialect, Javanese language

Procedia PDF Downloads 181
80 A Proposed Approach for Emotion Lexicon Enrichment

Authors: Amr Mansour Mohsen, Hesham Ahmed Hassan, Amira M. Idrees

Abstract:

Document analysis is an important research field that aims to gather information by analyzing the data in documents. As one of the important targets in many fields is to understand what people actually want, sentiment analysis has become one of the vital fields tightly related to document analysis. This research focuses on analyzing text documents to classify each document according to its opinion. The aim of this research is to detect emotions in text documents by enriching the lexicon and adapting its content based on the extraction of semantic patterns. The proposed approach is presented, and experiments from different perspectives are applied to reveal its positive impact on the classification results.

Keywords: document analysis, sentimental analysis, emotion detection, WEKA tool, NRC lexicon

Procedia PDF Downloads 383
79 Acquisition of Murcian Lexicon and Morphology by L2 Spanish Immigrants: The Role of Social Networks

Authors: Andrea Hernandez Hurtado

Abstract:

Research on social networks (SNs), the interactions individuals share with others, has shed important light on the differential use of variable linguistic forms, both in L1s and L2s. Nevertheless, the acquisition of nonstandard L2 Spanish in the Region of Murcia, Spain, and how learners interact with other speakers while sojourning there have received little attention. Murcian Spanish (MuSp) was widely influenced by Panocho, a divergent evolution of Hispanic Latin, and differs from the more standard Peninsular Spanish (StSp) in phonology, morphology, and lexicon. For instance, speakers from this area will most likely palatalize diminutive endings, producing animalico [ˌa.ni.ma.ˈli.ko] instead of animalito [ˌa.ni.ma.ˈli.to] ‘little animal’. Because L1 speakers of the area produce and prefer salient regional lexicon and morphology (particularly the palatalized diminutive -ico) in their speech, the current research focuses on how international residents in the Region of Murcia use Spanish: (1) whether or not they acquire (perceptively and/or productively) any of the salient regional features of MuSp, and (2) how their SNs explain such acquisition. This study triangulates across three tasks (recognition, production, and preference) addressing both lexicon and morphology, with each task specifically created for the investigation of MuSp features. Among other variables, the effects of L1, residence, and identity are considered. As this is ongoing dissertation research, data are currently being gathered through an online questionnaire. So far, 7 participants of multiple nationalities have completed the survey, although a minimum of 25 are expected to be included in the coming months. Preliminary results revealed that MuSp lexicon and morphology were successfully recognized by participants (p<.001).
In terms of regional lexicon production (10.0%) and preference (47.5%), although participants showed higher percentages of StSp, results showed that international residents become aware of stigmatized lexicon and may incorporate it into their language use. Similarly, palatalized diminutives (production 14.2%, preference 19.0%) were present in their responses. The Social Network Analysis provided information about participants’ relationships with their interactants, as well as among them. Results indicated that, generally, when residents were more immersed in the culture (i.e., had more Murcian alters) they produced and preferred more regional features. This project contributes to the knowledge of language variation acquisition in L2 speakers, focusing on a stigmatized Spanish dialect and exploring how stigmatized varieties may affect L2 development. Results will show how L2 Spanish speakers’ language is affected by their stay in Murcia. This, in turn, will shed light on the role of SNs in language acquisition, the acquisition of understudied and marginalized varieties, and the role of immersion in language acquisition. As the first systematic account of the acquisition of L2 Spanish lexicon and morphology in the Region of Murcia, it lays important groundwork for further research on the connection between SNs and the acquisition of regional variants, applicable to Murcia and beyond.

Keywords: international residents, L2 Spanish, lexicon, morphology, nonstandard language acquisition, social networks

Procedia PDF Downloads 26
78 Descriptive Analysis of Variations in Maguindanaon Language

Authors: Fhajema Kunso

Abstract:

People who live in the same region and who seem to speak the same language still vary in some aspects of their language. The variation may occur in terms of pronunciation, lexicon, morphology, and syntax. This qualitative study described the phonological, morphological, and lexical variations of the Maguindanaon language among the ten Maguindanao municipalities. Purposive sampling, in-depth interviews, focus group discussion, and sorting and classifying of words according to phonological, morphological, and lexical structures were employed in data analysis. The variations occurred through phonemic changes and other phonological processes and morphological processes. Phonological processes consisted of vowel lengthening and deletion, while morphological processes included affixation, borrowing, and coinage. In the phonological variation, it was observed that there were phonemic changes from one dialect to another. For example, there was a change of phoneme /r/ to /l/. The phoneme /r/ was most likely to occur in Kabuntalan, as in /biru/, /kurIt/, and /kɘmɅr/, whereas in the rest of the dialects these were /bilu/, /kulIt/, and /kɘmɅl/ respectively. Morphologically, affixation was the main way to mark tense. For example, the root sarig (expect) when inserted with im becomes simarig, i.e. s + im + arig = simarig (expected). Lexical variation also existed in the Maguindanaon language. Results revealed that the variation in phonology, morphology, and lexicon was associated primarily with geographic distribution.
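The infixation example in this abstract (s + im + arig = simarig) can be reproduced with a short sketch. The rule coded here, inserting the infix immediately before the first vowel of the root, is an assumption generalized from that single example; the function name `insert_infix` and the vowel set are hypothetical and are not taken from the study.

```python
VOWELS = set("aeiou")  # assumed vowel inventory for this illustration only


def insert_infix(root: str, infix: str) -> str:
    """Insert `infix` immediately before the first vowel of `root`.

    If the root contains no vowel from VOWELS, it is returned unchanged,
    since the abstract gives no guidance for that case.
    """
    for index, char in enumerate(root):
        if char.lower() in VOWELS:
            return root[:index] + infix + root[index:]
    return root


if __name__ == "__main__":
    # Reproduces the example from the abstract: sarig + im -> simarig
    print(insert_infix("sarig", "im"))
```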

Keywords: applied linguistics, language, lexicon, Maguindanao, morphology, Philippines, phonology, processes, qualitative, variation

Procedia PDF Downloads 345
77 Receptive Vocabulary Development in Adolescents and Adults with Down Syndrome

Authors: Esther Moraleda Sepúlveda, Soraya Delgado Matute, Paula Salido Escudero, Raquel Mimoso García, M Cristina Alcón Lancho

Abstract:

Although there is some consensus when it comes to establishing the lexicon as one of the strengths of language in people with Down Syndrome (DS), little is known about its evolution throughout development and changes based on age. The objective of this study was to find out if there are differences in receptive vocabulary between adolescence and adulthood. In this research, 30 people with DS between 11 and 40 years old, divided into two age ranges (11-18; 19-30) and matched in mental age, were evaluated through the Peabody Vocabulary Test. The results show significant differences between both groups in favor of the group with the oldest chronological age and a direct correlation between chronological age and receptive vocabulary development, regardless of mental age. These data support the natural evolution of the passive lexicon in people with DS.

Keywords: down syndrome, language, receptive vocabulary, adolescents, adults

Procedia PDF Downloads 157
76 The EFL Mental Lexicon: Connectivity and the Acquisition of Lexical Knowledge Depth

Authors: Khalid Soussi

Abstract:

This study describes the acquisition of three aspects of EFL lexical knowledge (meaning, synonymy and collocation) across three academic levels in Morocco: Baccalaureate, second-year university and fourth-year university. The research also compares the development of the three lexical knowledge aspects between knowledge (reception) and use (production) and attempts to trace their order of acquisition. Three main data collection tasks were used: translation, acceptability judgment and multiple choice. The study has revealed the following findings. First, L1 and EFL mental lexicons are connected at the level of lexical knowledge depth. Second, this connection is active in both language reception and use. Third, the connectivity between L1 and EFL mental lexicons tends to decrease as the academic level of the learners increases. Finally, the research has revealed a significant 'order' of acquisition among the three lexical aspects, though not a very strong one.

Keywords: vocabulary acquisition, EFL lexical knowledge, mental lexicon, vocabulary knowledge depth

Procedia PDF Downloads 250
75 Unsupervised Sentiment Analysis for Indonesian Political Message on Twitter

Authors: Omar Abdillah, Mirna Adriani

Abstract:

In this work, we present a new approach for analyzing public sentiment towards the presidential candidates in the 2014 Indonesian election as expressed on Twitter. We propose a procedure for analyzing sentiment in Indonesian political messages that is based on understanding how Indonesian users behave when sending messages on Twitter. Unlike previous work, our approach uses punctuation marks and an Indonesian sentiment lexicon, complemented by a new procedure for determining sentiment towards the candidates. Our experiment yields up to 83.31% average precision. In brief, this work makes two contributions. First, it is a preliminary study of sentiment analysis in the domain of political messages, which has not been addressed before. Second, we propose a method for sentiment analysis built on a decision-making procedure that is in line with the characteristics of Indonesian messages on Twitter.

Keywords: unsupervised sentiment analysis, political message, lexicon based, user behavior understanding

Procedia PDF Downloads 439
74 BiLex-Kids: A Bilingual Word Database for Children 5-13 Years Old

Authors: Aris R. Terzopoulos, Georgia Z. Niolaki, Lynne G. Duncan, Mark A. J. Wilson, Antonios Kyparissiadis, Jackie Masterson

Abstract:

As word databases for bilingual children are not available, researchers, educators and textbook writers must rely on monolingual databases. The aim of this study is thus to develop a bilingual word database, BiLex-kids, an online open access developmental word database for 5-13 year old bilingual children who learn Greek as a second language and have English as their dominant one. BiLex-kids is compiled from 120 Greek textbooks used in Greek-English bilingual education in the UK, USA and Australia, and provides word translations in the two languages, pronunciations in Greek, and psycholinguistic variables (e.g. Zipf, Frequency per million, Dispersion, Contextual Diversity, Neighbourhood size). After clearing the textbooks of non-relevant items (e.g. punctuation), algorithms were applied to extract the psycholinguistic indices for all words. As well as one total lexicon, the database produces values for all ages (one lexicon for each age) and for three age bands (one lexicon per age band: 5-8, 9-11, 12-13 years). BiLex-kids provides researchers with accurate figures for a wide range of psycholinguistic variables, making it a useful and reliable research tool for selecting stimuli to examine lexical processing among bilingual children. In addition, it offers children the opportunity to study word spelling, learn translations and listen to pronunciations in their second language. It further benefits educators in selecting age-appropriate words for teaching reading and spelling, while special educational needs teachers will have a resource to control the content of word lists when designing interventions for bilinguals with literacy difficulties.
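Two of the indices named in this abstract, frequency per million and Zipf, can be illustrated with a short sketch. The abstract does not specify the algorithms BiLex-kids uses, so this assumes the commonly used unsmoothed definition Zipf = log10(frequency per million) + 3; the function name `zipf_table` and the sample tokens are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def zipf_table(tokens: Iterable[str]) -> Dict[str, Dict[str, float]]:
    """Compute raw count, frequency per million, and Zipf value for each word.

    Assumes the unsmoothed definition Zipf = log10(frequency per million) + 3.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    table: Dict[str, Dict[str, float]] = {}
    for word, count in counts.items():
        per_million = count * 1_000_000 / total
        table[word] = {
            "count": float(count),
            "per_million": per_million,
            "zipf": math.log10(per_million) + 3,
        }
    return table


if __name__ == "__main__":
    # Hypothetical mini-corpus of 1,000 tokens after punctuation has been removed.
    corpus = ["και"] * 999 + ["σπίτι"]
    for word, row in zipf_table(corpus).items():
        print(word, row)
```

Running the same function on the tokens of one age band at a time would give one table per band, in the spirit of the per-age lexicons the abstract describes.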

Keywords: bilingual children, psycholinguistics, vocabulary development, word databases

Procedia PDF Downloads 271
73 Anglicisms in the Magazine Glamour France: The Influence of English on the French Language of Fashion

Authors: Vivian Orsi

Abstract:

In this research, we investigate the lexicon of women's magazines, with special attention to fashion, whose universe is very receptive to lexical borrowings, especially those from English, called Anglicisms. We discuss the presence of English items and expressions in the online French women's magazine Glamour France, collected over six months. Highlighting the quantitative aspects of the use of English in that publication, we can affirm that the use of those lexical borrowings seems to represent sophistication to attract readers and identification with other cultures, establishing communication and intensifying the language of fashion. The potential for creativity in the fashion lexicon is made possible by its permeability to social and linguistic phenomena across all social classes that allow constant manipulation of genuine borrowings. Besides, it seems to assume the value of a prerequisite for participating in the fashion centers of the world. The use of Anglicisms in Glamour France is not limited to designating concepts and fashionable items that have no equivalent in French; it also acts as a kind of seduction tool, which uses the symbolic capital of English as the global language of communication.

Keywords: Anglicisms, lexicology, borrowings, fashion language

Procedia PDF Downloads 242
72 Metaphors Underlying Idiomatic Expressions in Trilingual Perspective: Contributions to the Teaching of Lexicon and to Materials Development

Authors: Marilei Amadeu Sabino

Abstract:

Idiomatic expressions are linguistic phraseologisms present in natural languages. Known to be metaphorical linguistic combinations, a good majority of them provide elements that reveal important cultural aspects of their linguistic community through their metaphors. With the advent of Cognitive Linguistics (more specifically of Cognitive Semantics), the metaphor ceased to be related to poetic language and rhetorical embellishment and came to be seen as part of simple everyday language, reflecting the way human beings think, act and conceive reality, i.e., a fundamental mechanism of human conceptualizations of the world. In this sense, it came to be conceived as an inevitable mechanism for representing the nature of thought and language. In conceptualizing reality, speakers often use parts of the body metaphorically in expressions known as somatic. Several conceptual metaphors appear to be potentially universal or near-universal, because people across the world share certain bodily experiences. In these terms, many linguistic metaphors may be identical or very similar in several languages. These similarities, according to the Theory of Conceptual Metaphor, derive from universal aspects of the human body. Thus, this research aims to investigate the nature of some metaphors underlying somatic idiomatic expressions of the Portuguese, Italian and English languages, establishing a pattern of similarities and differences among them from a trilingual perspective. The analysis shows that many of the studied expressions are structurally, semantically and metaphorically identical or similar in the three languages. These findings prompt relevant discussions concerning mother and foreign language learning and aim to contribute to the teaching of the phraseological lexicon as well as to materials development in mono- and multilingual perspectives.

Keywords: idiomatic expressions, materials development, metaphors, phraseological lexicon, teaching and learning

Procedia PDF Downloads 149
71 Agents and Causers in the Experiencer-Verb Lexicon

Authors: Margaret Ryan, Linda Cupples, Lyndsey Nickels, Paul Sowman

Abstract:

The current investigation explored the thematic roles of the nouns specified in the lexical entries of experiencer verbs. While prior experimental research assumes experiencer and theme roles for both subject-experiencer (SE) and object-experiencer (OE) verbs, syntactic theorists have posited additional agent and causer roles. Experiment 1 provided evidence for an agent as participants assigned a high degree of intentionality to the logical subject of a subset of SE and OE actives and passives. Experiment 2 provided evidence for a causer as participants assigned high levels of causality to the logical subjects of experiencer sentences generally. However, the presence of an agent, but not a causer, coincided with processing ease. Causality may be an aspect rather than a thematic role. The varying thematic roles amongst experiencer-verb sentences have important implications for stimulus selection because we cannot presume processing is similar across differing sentence subtypes.

Keywords: sentence comprehension, lexicon, canonicity, processing, thematic roles, syntax

Procedia PDF Downloads 72
70 The Grammatical Dictionary Compiler: A System for Kartvelian Languages

Authors: Liana Lortkipanidze, Nino Amirezashvili, Nino Javashvili

Abstract:

The purpose of the grammatical dictionary is to provide information on the morphological and syntactic characteristics of the basic word in the dictionary entry. Electronic grammatical dictionaries are used as a tool for automated morphological analysis in text processing. The Georgian Grammatical Dictionary should contain grammatical information for each word: part of speech, type of declension/conjugation, grammatical forms of the word (paradigm), and alternative variants of the basic word/lemma. In this paper, we present a system for compiling the Georgian Grammatical Dictionary automatically. We propose dictionary-based methods for extending grammatical lexicons. The input lexicon contains only a small number of words with identical grammatical features. The extension is based on similarity measures between features of words; more precisely, we add to the extended lexicons words that are similar to those already in the grammatical dictionary. Our dictionaries are corpus-based, and for the compilation we introduce a method for the lemmatization of unknown words, i.e., words for which neither the full form nor the lemma is in the grammatical dictionary.

Keywords: acquisition of lexicon, Georgian grammatical dictionary, lemmatization rules, morphological processor

Procedia PDF Downloads 106
69 Linguistic Analysis of Argumentation Structures in Georgian Political Speeches

Authors: Mariam Matiashvili

Abstract:

Argumentation is an integral part of our daily communications - formal or informal. Argumentative reasoning, techniques, and language tools are used both in personal conversations and in the business environment. Verbalization of the opinions requires the use of extraordinary syntactic-pragmatic structural quantities - arguments that add credibility to the statement. The study of argumentative structures allows us to identify the linguistic features that make the text argumentative. Knowing what elements make up an argumentative text in a particular language helps the users of that language improve their skills. Also, natural language processing (NLP) has become especially relevant recently. In this context, one of the main emphases is on the computational processing of argumentative texts, which will enable the automatic recognition and analysis of large volumes of textual data. The research deals with the linguistic analysis of the argumentative structures of Georgian political speeches - particularly the linguistic structure, characteristics, and functions of the parts of the argumentative text - claims, support, and attack statements. The research aims to describe the linguistic cues that give the sentence a judgmental/controversial character and helps to identify reasoning parts of the argumentative text. The empirical data comes from the Georgian Political Corpus, particularly TV debates. Consequently, the texts are of a dialogical nature, representing a discussion between two or more people (most often between a journalist and a politician). 
The research uses the following approaches to identify and analyze the argumentative structures: Lexical Classification and Analysis, which identifies lexical items that are relevant in the process of creating argumentative texts and creates the lexicon of argumentation (groups of words gathered from a semantic point of view); Grammatical Analysis and Classification, which is the grammatical analysis of the words and phrases identified based on the arguing lexicon; and Argumentation Schemas, which describes and identifies the argumentation schemes that are most likely used in Georgian political speeches. As a final step, we analyzed the relations between the above-mentioned components. For example, if an identified argument scheme is “Argument from Analogy”, the identified lexical items semantically express analogy too, and they are most likely adverbs in Georgian. As a result, we created a lexicon of the words that play a significant role in creating Georgian argumentative structures. Linguistic analysis has shown that verbs play a crucial role in creating argumentative structures.

Keywords: georgian, argumentation schemas, argumentation structures, argumentation lexicon

Procedia PDF Downloads 36
68 Automatic Lexicon Generation for Domain Specific Dataset for Mining Public Opinion on China Pakistan Economic Corridor

Authors: Tayyaba Azim, Bibi Amina

Abstract:

The growing popularity of opinion mining, together with the rapid growth in the availability of social networks, has created many opportunities for research in the various domains of Sentiment Analysis and Natural Language Processing (NLP) using Artificial Intelligence approaches. The latest trend allows the public to actively use the internet for analyzing an individual’s opinion and to explore the effectiveness of published facts. The main theme of this research is to gauge public opinion on one of the most crucial and extensively discussed development projects, the China Pakistan Economic Corridor (CPEC), considered a game changer due to its promise of bringing economic prosperity to the region. So far, to the best of our knowledge, the theme of CPEC has not been analyzed for sentiment determination through a machine learning (ML) approach. This research aims to demonstrate the use of ML approaches to automatically analyze public sentiment in tweets about CPEC. A Support Vector Machine (SVM) is used to classify tweets into positive, negative and neutral classes. Word2vec and TF-IDF features are used with the SVM model, and the model trained on manually labelled tweets is compared with the one based on the automatically generated lexicon. The contributions of this work are: the development of a sentiment analysis system for public tweets on the CPEC subject; the automatic generation of a lexicon from public tweets on CPEC; and the identification of different themes among the tweets, with sentiments assigned to each theme. It is worth noting that the applications of web mining that empower e-democracy by improving political transparency and public participation in decision making via social media have not yet been explored and practised in Pakistan with regard to CPEC.

Keywords: machine learning, natural language processing, sentiment analysis, support vector machine, Word2vec

Procedia PDF Downloads 112
67 Investigating the Associative Network of Color Terms among Turkish University Students: A Cognitive-Based Study

Authors: R. Güçlü, E. Küçüksakarya

Abstract:

Word association (WA) gives the broadest information on how knowledge is structured in the human mind. Cognitive linguistics, psycholinguistics, and applied linguistics are the disciplines that consider WA tests as substantial in gaining insights into the very nature of the human cognitive system and semantic knowledge. In this study, Berlin and Kay’s basic 11 color terms (1969) are presented as the stimuli words to a total number of 300 Turkish university students. The responses are analyzed according to Fitzpatrick’s model (2007), including four categories, namely meaning-based responses, position-based responses, form-based responses, and erratic responses. In line with the findings, the responses to free association tests are expected to give much information about Turkish university students’ psychological structuring of vocabulary, especially morpho-syntactic and semantic relationships among words. To conclude, theoretical and practical implications are discussed to make an in-depth evaluation of how associations of basic color terms are represented in the mental lexicon of Turkish university students.

Keywords: color term, gender, mental lexicon, word association task

Procedia PDF Downloads 81
66 Sociolinguistic Aspects and Language Contact, Lexical Consequences in Francoprovençal Settings

Authors: Carmela Perta

Abstract:

In Italy, the coexistence of the standard language, its varieties and different minority languages (historical and migration languages) has been a way to study language contact in different directions; the focus of most of the studies is either the relations among the languages of the social repertoire, or the study of contact phenomena occurring at a particular structural level. However, studies on contact facts in relation to a given sociolinguistic situation of the speech community are still not present in the literature. As regards the language level to investigate from the perspective of contact, it is commonly claimed that the lexicon is the most volatile part of language and the most likely to undergo change due to superstrate influence: first lexical features are borrowed, then, under long-term cultural pressure, structural features may also be borrowed. The aim of this paper is to analyse language contact in two historical minority communities where Francoprovençal is spoken, in relation to their sociolinguistic situation. In this perspective, firstly, lexical borrowings present in speakers’ speech production will be examined, trying to find a possible correlation between this part of the lexicon and informants’ sociolinguistic variables; secondly, a possible correlation between a particular community's sociolinguistic situation and lexical borrowing will be sought. Data were collected from 24 speakers in both the villages; the speaker group in the two communities consisted of 3 males and 3 females in each of four age groups, ranging in age from 9 to 85, and then divided into five groups according to their occupations. Speakers were asked to describe a sequence of pictures, naming common objects and then describing scenes in which they used these objects: they are common objects, frequently pronounced and belonging to semantic areas which are usually resistant and which are thought to survive.
A subset of this task, involving 19 items with Italian source is examined here: in order to determine the significance of the independent variables (social factors) on the dependent variable (lexical variation) the statistical package SPSS, particularly the linear regression, was used.

Keywords: borrowing, Francoprovençal, language change, lexicon

Procedia PDF Downloads 333
65 Corpus-Based Description of Core English Nouns of Pakistani English, an EFL Learner Perspective at Secondary Level

Authors: Abrar Hussain Qureshi

Abstract:

Vocabulary has been highlighted as a key indicator in any foreign language learning program, especially English as a foreign language (EFL). It is often considered a powerful tool in the foreign language curriculum, and its deficiency impedes successful communication in the target language. Knowledge of the lexicon is very significant in attaining communicative competence and performance. Nouns constitute a considerable bulk of English vocabulary. Indeed, they are the bones of the English language and the main semantic carriers in spoken and written discourse. As nouns dominate the bulk of the English lexicon, their role becomes all the more important. The research undertaken here is a systematic effort to work out a list of highly frequent Pakistani English nouns for EFL learners at the secondary level. It will encourage learner autonomy and save learners' time. The corpus used for the research has been developed locally from leading English newspapers of Pakistan. WordSmith Tools has been used to process the research data and to retrieve a word list of frequent Pakistani English nouns. The retrieved list of core Pakistani English nouns is expected to be useful for English language learners at the secondary level, as it covers a wide range of speech events.
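The frequency-list step described above can be illustrated with a minimal sketch. It is not the WordSmith Tools procedure used in the study: the function name noun_frequency_list, the simple letters-only tokenizer, and the hand-supplied set of nouns are all assumptions made for illustration (a real pipeline would need part-of-speech tagging to decide which tokens are nouns).

```python
import re
from collections import Counter


def noun_frequency_list(texts, nouns, top_n=10):
    """Count how often each known noun occurs in the texts.

    `nouns` is a hypothetical, hand-supplied set of lowercase noun forms;
    tokens outside this set are ignored.
    """
    counts = Counter()
    for text in texts:
        # Lowercase the text and keep runs of ASCII letters as tokens.
        for token in re.findall(r"[a-z]+", text.lower()):
            if token in nouns:
                counts[token] += 1
    # most_common sorts by descending count; ties keep first-seen order.
    return counts.most_common(top_n)


if __name__ == "__main__":
    sample_texts = [
        "The government met the minister.",
        "Government policy changed.",
    ]
    sample_nouns = {"government", "minister", "policy"}
    print(noun_frequency_list(sample_texts, sample_nouns, top_n=2))
```

Ranking by raw count keeps the sketch short; normalizing by corpus size would be needed to compare lists drawn from corpora of different sizes.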

Keywords: corpus, EFL, frequency list, nouns

Procedia PDF Downloads 57
64 Cerrado and Vereda: A Survey of Portuguese Lexicon for Brazilian Biomes

Authors: Daniel Marra

Abstract:

This paper analyses, from a semantic-diachronic viewpoint, the changes of meaning that two lexical items of Brazilian Portuguese have gone through. Cerrado and Vereda currently designate the second largest Brazilian biome and one of its most important subsystems. Nevertheless, these two words have long individual histories that can be traced back to their Latin etymons. Therefore, the purpose of this work is to highlight the process by which meaning instantiated itself in these words’ formation and to discuss how semantic change subsequently took hold in them. As this paper shows, the aforementioned words were, in different past synchronies, created and subjected to changes of meaning by metaphor and metonymy. Besides, it is argued here that semantic change takes place due to external causes, such as generalization and specialization of meaning. It happens when a specialized use of a lexical item, restricted to a particular linguistic group, is adopted by other groups, having its meaning generalized by them. In these processes, the etymological idea of the word is generally lost, and the word gains, in the new group, a less specific meaning in relation to its etymology, sometimes with no relation to the original idea. As a final point, it is claimed that both the creation of a lexical item and its change of meaning involve pragmatic goals, such as the need language users have to express a new meaning related to a certain reality in the empirical world.

Keywords: Brazilian biomes, metaphor and metonymy, Portuguese lexicon, semantic change

Procedia PDF Downloads 82
63 The Presence of Anglicisms in Italian Fashion Magazines and Fashion Blogs

Authors: Vivian Orsi

Abstract:

The present research investigates the lexicon of a fashion magazine, whose universe is very receptive to lexical loans, especially those from English, called Anglicisms. Specifically, we intend to discuss the presence of English items and expressions in the Vogue Italia fashion magazine. Besides, we aim to study the Anglicisms used in an Italian fashion blog called The Blonde Salad. Within the discussion of fashion blogs and their contributions to scientific studies, we adopt the theories of Lexicology/Lexicography to define Anglicism (BIDERMAN, 2001), and the observation of its prestige in the Italian language (ROGATO, 2008; BISETTO, 2003). On this theoretical basis, we intend to make a brief analysis of the Anglicisms collected from posts of the first year of existence of this fashion blog, emphasizing also the keywords that serve to encapsulate the content of the text, allowing the reader to retrieve information from the blog post. Regarding the use of English in Italian magazines and blogs, we can affirm that it seems to represent sophistication, assuming the value of a prerequisite for participating in the fashion centers of the world. Besides, we believe, as Barthes says (1990, p. 215), that “Fashion does not evolve, it changes: its lexicon is new each year, like that of a language which always keeps the same system but suddenly and regularly ‘changes’ the currency of its words”. Fashion is a mode of communication: it is present in man's interaction with the world, which means that this lexical universe is represented according to the particularities of each culture.

Keywords: anglicism, lexicology, magazines, blogs, fashion

Procedia PDF Downloads 290
62 Cross-Language Variation and the ‘Fused’ Zone in Bilingual Mental Lexicon: An Experimental Research

Authors: Yuliya E. Leshchenko, Tatyana S. Ostapenko

Abstract:

Language variation is a widespread linguistic phenomenon which can affect different levels of a language system: phonological, morphological, lexical, syntactic, etc. It is obvious that the scope of possible standard alternations within a particular language is limited by a variety of its norms and regulations which set more or less clear boundaries for what is possible and what is not possible for the speakers. The possibility of lexical variation (alternate usage of lexical items within the same contexts) is based on the fact that the meanings of words are not clearly and rigidly defined in the consciousness of the speakers. Therefore, lexical variation is usually connected with unstable relationship between words and their referents: a case when a particular lexical item refers to different types of referents, or when a particular referent can be named by various lexical items. We assume that the scope of lexical variation in bilingual speech is generally wider than that observed in monolingual speech due to the fact that, besides ‘lexical item – referent’ relations it involves the possibility of cross-language variation of L1 and L2 lexical items. We use the term ‘cross-language variation’ to denote a case when two equivalent words of different languages are treated by a bilingual speaker as freely interchangeable within the common linguistic context. As distinct from code-switching which is traditionally defined as the conscious use of more than one language within one communicative act, in case of cross-language lexical variation the speaker does not perceive the alternate lexical items as belonging to different languages and, therefore, does not realize the change of language code. In the paper, the authors present research of lexical variation of adult Komi-Permyak – Russian bilingual speakers. 
The two languages co-exist on the territory of the Komi-Permyak District in Russia (Komi-Permyak as the ethnic language and Russian as the official state language), are usually acquired from birth in natural linguistic environment and, according to the data of sociolinguistic surveys, are both identified by the speakers as coordinate mother tongues. The experimental research demonstrated that alternation of Komi-Permyak and Russian words within one utterance/phrase is highly frequent both in speech perception and production. Moreover, our participants estimated cross-language word combinations like ‘маленькая /Russian/ нывка /Komi-Permyak/’ (‘a little girl’) or ‘мунны /Komi-Permyak/ домой /Russian/’ (‘go home’) as regular/habitual, containing no violation of any linguistic rules and being equally possible in speech as the equivalent intra-language word combinations (‘учöтик нывка’ /Komi-Permyak/ or ‘идти домой’ /Russian/). All the facts considered, we claim that constant concurrent use of the two languages results in the fact that a large number of their words tend to be intuitively interpreted by the speakers as lexical variants not only related to the same referent, but also referring to both languages or, more precisely, to none of them in particular. Consequently, we can suppose that bilingual mental lexicon includes an extensive ‘fused’ zone of lexical representations that provide the basis for cross-language variation in bilingual speech.

Keywords: bilingualism, bilingual mental lexicon, code-switching, lexical variation

Procedia PDF Downloads 105
61 Religion and Politeness: An Exploratory Study for the Integration of Religious Expressions with Politeness Strategies in Iraqi Computer-Mediated Communication

Authors: Rasha Alsabbah

Abstract:

This study explores the relationship between polite language use and religion in Iraqi culture in computer-mediated communication. It examines the speech acts in which religious expressions are employed, the frequency of their occurrence, and the aims behind them. It also investigates whether they have equivalent expressions in English and the possibility of translating them in intercultural communication. Despite the widespread assumption that language is a reflection of culture and religion, this topic has only begun to attract the attention of sociologists over the last 40 years, as scholars have questioned the possible interconnection between religion and language, in which religion is used as a means of producing language and performing pragmatic functions. It is presumed that Arabs in general, and Iraqis in particular, are inclined to use religious vocabulary to show politeness in their greetings and other speech acts. Due to the influences of Islamic religion and culture, it is observed that Iraqis are very much concerned with maintaining social solidarity and harmonious relationships, which makes religion a politeness strategy that operates as the key point of their social behaviours. In addition, religion has been found to influence almost all their interactions, in which they tend to invoke religious expressions, the lexicon of Allah (God), and Qur’anic verses in their daily politeness discourse. This aspect of Islamic culture may look strange, especially to people who come from individualist societies, such as England. Data collection in this study is based on messaging applications such as Viber, WhatsApp, and Facebook. After the participants gave their approval, the different aims behind these expressions and the pragmatic functions they perform were investigated. It is found that Iraqis tend to incorporate the lexicon of Allah in most of their communication.
Such expressions are employed not only by religious people but also by individuals who do not show strong commitment to religion. Furthermore, the social distance and social power between people do not play a significant role in increasing or reducing the rate at which these expressions are used. A number of these expressions, though they can be translated into English, do not have a one-to-one counterpart or do not reflect religious feeling. In addition, they might sound odd when translated or transliterated in oral and written intercultural communication.

Keywords: computer mediated communication (CMC), intercultural communication, politeness, religion, situation bound utterances rituals, speech acts

Procedia PDF Downloads 358
60 Saudi Twitter Corpus for Sentiment Analysis

Authors: Adel Assiri, Ahmed Emam, Hmood Al-Dossari

Abstract:

Sentiment analysis (SA) has received growing attention in Arabic language research. However, few studies have directly applied SA to Arabic, owing to the lack of a publicly available dataset for this language. This paper partially bridges this gap by focusing on one of the Arabic dialects, the Saudi dialect. It presents an annotated data set of 4700 for Saudi dialect sentiment analysis with (K = 0.807). Our next step is to extend this corpus and to create a large-scale lexicon for the Saudi dialect from the corpus.
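The abstract does not say which agreement statistic K denotes. Assuming it is Cohen's kappa between two annotators (an assumption, not something the source confirms), the computation can be sketched as below; the function name cohen_kappa and the toy labels are hypothetical and are not drawn from the Saudi corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    label distribution.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Share of items on which the two annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Chance agreement: product of marginal label proportions, summed over labels.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    annotator_1 = ["pos", "pos", "neg", "neg"]
    annotator_2 = ["pos", "neg", "neg", "neg"]
    print(cohen_kappa(annotator_1, annotator_2))
```

With more than two annotators, a different statistic such as Fleiss' kappa would be required; the sketch covers only the two-annotator case.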

Keywords: Arabic, sentiment analysis, Twitter, annotation

Procedia PDF Downloads 581
59 Fine-Grained Sentiment Analysis: Recent Progress

Authors: Jie Liu, Xudong Luo, Pingping Lin, Yifan Fan

Abstract:

Facebook, Twitter, Weibo, and other social media and significant e-commerce sites generate a massive amount of online texts, which can be used to analyse people’s opinions or sentiments for better decision-making. So, sentiment analysis, especially fine-grained sentiment analysis, is a very active research topic. In this paper, we survey various methods for fine-grained sentiment analysis, including traditional sentiment lexicon-based methods, machine learning-based methods, and deep learning-based methods in aspect/target/attribute-based sentiment analysis tasks. Besides, we discuss their advantages and problems worthy of careful studies in the future.

Keywords: sentiment analysis, fine-grained, machine learning, deep learning

Procedia PDF Downloads 207
58 Perception and Control in the Age of Surrealism: A Critical History and a Survey of Pita Amor’s Poetic Ontology

Authors: Oliver Arana

Abstract:

Within the common vein of social understanding, surrealism is often understood to rely on disconcerting images and fragmented collage, both in its visual representation and literary manifestations. By tracing the history and literature of surrealism, the author makes the argument that there were certain factions within Latin America that employed characteristics of surrealism in order to reach some sense of understanding, and not to further complicate or disorient, an aim that most closely aligns with Freudian psychoanalysis. Psychoanalysis should, however, serve as a comparable practice only to understand how Latin American surrealism had more of a concrete goal than its European counterpart. The primary subject of the paper is the Mexican poet Pita Amor, who has retroactively been associated with the movement; it should therefore be duly noted that the term surrealism applies to her only as something that describes traits within the literary lexicon.

Keywords: Latin America, Pita Amor, poetry, surrealism

Procedia PDF Downloads 103
57 Issue Reorganization Using the Measure of Relevance

Authors: William Wong Xiu Shun, Yoonjin Hyun, Mingyu Kim, Seongi Choi, Namgyu Kim

Abstract:

Recently, the demand for extracting R&D keywords from issues and using them to retrieve R&D information has been increasing rapidly. However, it is hard to identify related issues or to distinguish them. Although the similarity between issues cannot be identified directly, issues that share the same R&D keywords can be determined with an R&D lexicon. In detail, the R&D keywords associated with a particular issue imply the key technology elements needed to solve the problem of that issue. Furthermore, related issues sharing the same R&D keywords can be shown in a more systematic way through issue clustering constructed from the perspective of R&D. Thus, the sharing of R&D results and the reuse of R&D technology can be facilitated. Indirectly, redundant investment in the same R&D can be reduced, as R&D information can be shared between the corresponding issues and the reusability of the related R&D can be improved. Therefore, a methodology for constructing issue clusters from the perspective of common R&D keywords is proposed to satisfy the demands mentioned.
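The abstract does not specify the clustering algorithm. As one possible reading, the sketch below groups issues that share at least one R&D keyword, directly or through a chain of other issues, using a union-find structure. The function name cluster_issues, the choice of connected components, and the sample issues and keywords are all assumptions for illustration, not the authors' method.

```python
def cluster_issues(issue_keywords):
    """Group issues linked by shared R&D keywords.

    `issue_keywords` maps an issue name to a set of keywords. Two issues
    land in the same cluster when they share a keyword directly or via a
    chain of intermediate issues (connected components).
    """
    parent = {issue: issue for issue in issue_keywords}

    def find(node):
        # Follow parent links to the cluster root, shortening the path on the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    first_issue_for_keyword = {}
    for issue, keywords in issue_keywords.items():
        for keyword in keywords:
            if keyword in first_issue_for_keyword:
                # Merge this issue's cluster with the one that already uses the keyword.
                parent[find(issue)] = find(first_issue_for_keyword[keyword])
            else:
                first_issue_for_keyword[keyword] = issue

    clusters = {}
    for issue in issue_keywords:
        clusters.setdefault(find(issue), set()).add(issue)
    # Return a deterministic, sorted list of sorted clusters.
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    sample = {
        "air pollution": {"sensor", "filter"},
        "water quality": {"sensor"},
        "flu outbreak": {"vaccine"},
    }
    print(cluster_issues(sample))
```

Connected components are the simplest choice here; a weighted scheme based on the number of shared keywords would distinguish strongly related issues from those linked by a single common term.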

Keywords: clustering, social network analysis, text mining, topic analysis

Procedia PDF Downloads 539
56 Elucidating Human Cognition through BERT and Corpus Analysis

Authors: Hisanori Iijima

Abstract:

The central objective of this research is to re-assess human cognition pertaining to the English lexicon, placing a substantial emphasis on polysemous words, that is, words that hold multiple related meanings in a single word form. In the continuously evolving field of cognitive linguistic studies, a comparative analysis has been undertaken between statistical evaluations of both written and spoken corpora, most notably the British National Corpus (BNC) and the Corpus of Contemporary American English (COCA), and Google Inc.'s BERT (Bidirectional Encoder Representations from Transformers), a renowned tool within the realm of Natural Language Processing (NLP). Historical empirical investigations in this domain have exhibited a strong inclination towards the utilization of questionnaire surveys to interpret human comprehension mechanisms related to polysemy. These methodologies, although comprehensive, present inherent limitations. Specifically, their predominant reliance on human cognitive perspectives may not offer a holistic view, potentially sidelining the intricate relationships and similarities that exist between individual word meanings within a lexicon. To enhance the depth and breadth of this analysis, the present study has adopted an innovative approach by leveraging the capabilities of BERT. Aiming to offer a fresh perspective, the study utilized BERT to construct a scatterplot. This scatterplot was designed to highlight the semantic distances between selected polysemous words. Words such as "have" and "take," which have been traditionally recognized to possess multifaceted meanings depending on their usage, were placed under scrutiny. The overarching objective of this phase was to quantitatively assess and represent the semantic proximity or divergence of each word's meaning using a detailed diagrammatic representation. Upon careful examination and analysis, preliminary findings suggest a discernible divergence.
The semantic distances, as outlined by the BERT-based scatterplot, demonstrated disparities when juxtaposed against the distances derived from the combined insights of the corpora and human cognition. This observation bears significant implications for the field. In conclusion, this extensive research underscores a pivotal revelation: the intricacies of human cognitive mechanisms may not always align seamlessly with established linguistic frameworks or realities. This, in turn, challenges the prevalent linguistic paradigms, emphasizing that constructs such as polysemy, while foundational, might offer limited insight when placed in the vast, multifaceted landscape of human cognitive processes.

Keywords: BERT, human cognition, polysemy, semantic distance

Procedia PDF Downloads 23