Search results for: lexical semantic analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 27407

Search results for: lexical semantic analysis

27257 Lexical Features and Motivations of Product Reviews on Selected Philippine Online Shops

Authors: Jimmylen Tonio, Ali Anudin, Rochelle Irene G. Lucas

Abstract:

Alongside the progress of electronic-business websites, consumers have become more comfortable with online shopping. It has become customary for consumers that prior to purchasing a product or availing services, they consult online reviews info as bases in evaluating and deciding whether or not they should push thru with their procurement of the product or service. Subsequently, after purchasing, consumers tend to post their own comments of the product in the same e-business websites. Because of this, product reviews (PRS) have become an indispensable feature in online businesses equally beneficial for both business owners and consumers. This study explored the linguistic features and motivations of online product reviews on selected Philippine online shops, LAZADA and SHOPEE. Specifically, it looked into the lexical features of the PRs, the factors that motivated consumers to write the product reviews, and the difference of lexical preferences between male and female when they write the reviews. The findings revealed the following: 1. Formality of words in online product reviews primarily involves non-standard spelling, followed by abbreviated word forms, colloquial contractions and use of coined/novel words; 2. Paralinguistic features in online product reviews are dominated by the use of emoticons, capital letters and punctuations followed by the use of pictures/photos and lastly, by paralinguistic expressions; 3. The factors that motivate consumers to write product reviews varied. Online product reviewers are predominantly driven by venting negative feelings motivation, followed by helping the company, helping other consumers, positive self-enhancement, advice seeking and lastly, by social benefits; and 4. Gender affects the word frequencies of product online reviews, while negation words, personal pronouns, the formality of words, and paralinguistic features utilized by both male and female online product reviewers are not different.

Keywords: lexical choices, motivation, online shop, product reviews

Procedia PDF Downloads 130
27256 Pali-Sanskrit Terms and Their Uses in Reflecting Political Society of Thailand

Authors: Kowit Pimpuang

Abstract:

Through analysis of the Pali-Sanskrit (PL-SKT) terms and their uses in reflecting political society of Thailand, the objectives of this study were to explore PL-SKT word formation and its semantic changes employed in the political society of Thailand and to explore the political reflection of Thai society through their uses. Conceptual framework of this study consists of (1) use of PL-SKT word formation namely, primary derivative (Kitaka), secondary derivative (Tathita), compound (Samasa) and prefix (Upasagga), (2) semantic changes namely; widening, narrowing and transferring of meaning, and (3) political reflection of Thai society. Qualitative method was employed in this study and data were collected from Thai Newspapers. It was found that there were uses of the four kinds of word formation in formatting the new political terms concerned namely, primary derivative, secondary derivative, compound and prefix leading by compound through the following three semantic changes; widening, narrowing and transferring, in order to make clear in understanding. Furthermore, PL-SKT terms were employed in reflecting Thai politics caused by democratic conflicts through the bureaucracy, plutocracy, businessocracy and juristocracy respectively. Later, there have been political business groups and their corruption problems in political society of Thailand.

Keywords: Pali, Sanskrit, reflection, politics, Thailand

Procedia PDF Downloads 255
27255 Graph-Based Semantical Extractive Text Analysis

Authors: Mina Samizadeh

Abstract:

In the past few decades, there has been an explosion in the amount of available data produced from various sources with different topics. The availability of this enormous data necessitates us to adopt effective computational tools to explore the data. This leads to an intense growing interest in the research community to develop computational methods focused on processing this text data. A line of study focused on condensing the text so that we are able to get a higher level of understanding in a shorter time. The two important tasks to do this are keyword extraction and text summarization. In keyword extraction, we are interested in finding the key important words from a text. This makes us familiar with the general topic of a text. In text summarization, we are interested in producing a short-length text which includes important information about the document. The TextRank algorithm, an unsupervised learning method that is an extension of the PageRank (algorithm which is the base algorithm of Google search engine for searching pages and ranking them), has shown its efficacy in large-scale text mining, especially for text summarization and keyword extraction. This algorithm can automatically extract the important parts of a text (keywords or sentences) and declare them as a result. However, this algorithm neglects the semantic similarity between the different parts. In this work, we improved the results of the TextRank algorithm by incorporating the semantic similarity between parts of the text. Aside from keyword extraction and text summarization, we develop a topic clustering algorithm based on our framework, which can be used individually or as a part of generating the summary to overcome coverage problems.

Keywords: keyword extraction, n-gram extraction, text summarization, topic clustering, semantic analysis

Procedia PDF Downloads 49
27254 Exploring Syntactic and Semantic Features for Text-Based Authorship Attribution

Authors: Haiyan Wu, Ying Liu, Shaoyun Shi

Abstract:

Authorship attribution is to extract features to identify authors of anonymous documents. Many previous works on authorship attribution focus on statistical style features (e.g., sentence/word length), content features (e.g., frequent words, n-grams). Modeling these features by regression or some transparent machine learning methods gives a portrait of the authors' writing style. But these methods do not capture the syntactic (e.g., dependency relationship) or semantic (e.g., topics) information. In recent years, some researchers model syntactic trees or latent semantic information by neural networks. However, few works take them together. Besides, predictions by neural networks are difficult to explain, which is vital in authorship attribution tasks. In this paper, we not only utilize the statistical style and content features but also take advantage of both syntactic and semantic features. Different from an end-to-end neural model, feature selection and prediction are two steps in our method. An attentive n-gram network is utilized to select useful features, and logistic regression is applied to give prediction and understandable representation of writing style. Experiments show that our extracted features can improve the state-of-the-art methods on three benchmark datasets.

Keywords: authorship attribution, attention mechanism, syntactic feature, feature extraction

Procedia PDF Downloads 115
27253 Lexical Based Method for Opinion Detection on Tripadvisor Collection

Authors: Faiza Belbachir, Thibault Schienhinski

Abstract:

The massive development of online social networks allows users to post and share their opinions on various topics. With this huge volume of opinion, it is interesting to extract and interpret these information for different domains, e.g., product and service benchmarking, politic, system of recommendation. This is why opinion detection is one of the most important research tasks. It consists on differentiating between opinion data and factual data. The difficulty of this task is to determine an approach which returns opinionated document. Generally, there are two approaches used for opinion detection i.e. Lexical based approaches and Machine Learning based approaches. In Lexical based approaches, a dictionary of sentimental words is used, words are associated with weights. The opinion score of document is derived by the occurrence of words from this dictionary. In Machine learning approaches, usually a classifier is trained using a set of annotated document containing sentiment, and features such as n-grams of words, part-of-speech tags, and logical forms. Majority of these works are based on documents text to determine opinion score but dont take into account if these texts are really correct. Thus, it is interesting to exploit other information to improve opinion detection. In our work, we will develop a new way to consider the opinion score. We introduce the notion of trust score. We determine opinionated documents but also if these opinions are really trustable information in relation with topics. For that we use lexical SentiWordNet to calculate opinion and trust scores, we compute different features about users like (numbers of their comments, numbers of their useful comments, Average useful review). After that, we combine opinion score and trust score to obtain a final score. We applied our method to detect trust opinions in TRIPADVISOR collection. Our experimental results report that the combination between opinion score and trust score improves opinion detection.

Keywords: Tripadvisor, opinion detection, SentiWordNet, trust score

Procedia PDF Downloads 173
27252 Testing the Simplification Hypothesis in Constrained Language Use: An Entropy-Based Approach

Authors: Jiaxin Chen

Abstract:

Translations have been labeled as more simplified than non-translations, featuring less diversified and more frequent lexical items and simpler syntactic structures. Such simplified linguistic features have been identified in other bilingualism-influenced language varieties, including non-native and learner language use. Therefore, it has been proposed that translation could be studied within a broader framework of constrained language, and simplification is one of the universal features shared by constrained language varieties due to similar cognitive-physiological and social-interactive constraints. Yet contradicting findings have also been presented. To address this issue, this study intends to adopt Shannon’s entropy-based measures to quantify complexity in language use. Entropy measures the level of uncertainty or unpredictability in message content, and it has been adapted in linguistic studies to quantify linguistic variance, including morphological diversity and lexical richness. In this study, the complexity of lexical and syntactic choices will be captured by word-form entropy and pos-form entropy, and a comparison will be made between constrained and non-constrained language use to test the simplification hypothesis. The entropy-based method is employed because it captures both the frequency of linguistic choices and their evenness of distribution, which are unavailable when using traditional indices. Another advantage of the entropy-based measure is that it is reasonably stable across languages and thus allows for a reliable comparison among studies on different language pairs. In terms of the data for the present study, one established (CLOB) and two self-compiled corpora will be used to represent native written English and two constrained varieties (L2 written English and translated English), respectively. Each corpus consists of around 200,000 tokens. Genre (press) and text length (around 2,000 words per text) are comparable across corpora. More specifically, word-form entropy and pos-form entropy will be calculated as indicators of lexical and syntactical complexity, and ANOVA tests will be conducted to explore if there is any corpora effect. It is hypothesized that both L2 written English and translated English have lower entropy compared to non-constrained written English. The similarities and divergences between the two constrained varieties may provide indications of the constraints shared by and peculiar to each variety.

Keywords: constrained language use, entropy-based measures, lexical simplification, syntactical simplification

Procedia PDF Downloads 71
27251 An AI-generated Semantic Communication Platform in HCI Course

Authors: Yi Yang, Jiasong Sun

Abstract:

Almost every aspect of our daily lives is now intertwined with some degree of human-computer interaction (HCI). HCI courses draw on knowledge from disciplines as diverse as computer science, psychology, design principles, anthropology, and more. Our HCI courses, named the Media and Cognition course, are constantly updated to reflect state-of-the-art technological advancements such as virtual reality, augmented reality, and artificial intelligence-based interactions. For more than a decade, our course has used an interest-based approach to teaching, in which students proactively propose some research-based questions and collaborate with teachers, using course knowledge to explore potential solutions. Semantic communication plays a key role in facilitating understanding and interaction between users and computer systems, ultimately enhancing system usability and user experience. The advancements in AI-generated technology, which have gained significant attention from both academia and industry in recent years, are exemplified by language models like GPT-3 that generate human-like dialogues from given prompts. Our latest version of the Human-Computer Interaction course practices a semantic communication platform based on AI-generated techniques. The purpose of this semantic communication is twofold: to extract and transmit task-specific information while ensuring efficient end-to-end communication with minimal latency. An AI-generated semantic communication platform evaluates the retention of signal sources and converts low-retain ability visual signals into textual prompts. These data are transmitted through AI-generated techniques and reconstructed at the receiving end; on the other hand, visual signals with a high retain ability rate are compressed and transmitted according to their respective regions. The platform and associated research are a testament to our students' growing ability to independently investigate state-of-the-art technologies.

Keywords: human-computer interaction, media and cognition course, semantic communication, retainability, prompts

Procedia PDF Downloads 82
27250 Understanding the Interactive Nature in Auditory Recognition of Phonological/Grammatical/Semantic Errors at the Sentence Level: An Investigation Based upon Japanese EFL Learners’ Self-Evaluation and Actual Language Performance

Authors: Hirokatsu Kawashima

Abstract:

One important element of teaching/learning listening is intensive listening such as listening for precise sounds, words, grammatical, and semantic units. Several classroom-based investigations have been conducted to explore the usefulness of auditory recognition of phonological, grammatical and semantic errors in such a context. The current study reports the results of one such investigation, which targeted auditory recognition of phonological, grammatical, and semantic errors at the sentence level. 56 Japanese EFL learners participated in this investigation, in which their recognition performance of phonological, grammatical and semantic errors was measured on a 9-point scale by learners’ self-evaluation from the perspective of 1) two types of similar English sound (vowel and consonant minimal pair words), 2) two types of sentence word order (verb phrase-based and noun phrase-based word orders), and 3) two types of semantic consistency (verb-purpose and verb-place agreements), respectively, and their general listening proficiency was examined using standardized tests. A number of findings have been made about the interactive relationships between the three types of auditory error recognition and general listening proficiency. Analyses based on the OPLS (Orthogonal Projections to Latent Structure) regression model have disclosed, for example, that the three types of auditory error recognition are linked in a non-linear way: the highest explanatory power for general listening proficiency may be attained when quadratic interactions between auditory recognition of errors related to vowel minimal pair words and that of errors related to noun phrase-based word order are embraced (R2=.33, p=.01).

Keywords: auditory error recognition, intensive listening, interaction, investigation

Procedia PDF Downloads 493
27249 The Development of Chinese-English Homophonic Word Pairs Databases for English Teaching and Learning

Authors: Yuh-Jen Wu, Chun-Min Lin

Abstract:

Homophonic words are common in Mandarin Chinese which belongs to the tonal language family. Using homophonic cues to study foreign languages is one of the learning techniques of mnemonics that can aid the retention and retrieval of information in the human memory. When learning difficult foreign words, some learners transpose them with words in a language they are familiar with to build an association and strengthen working memory. These phonological clues are beneficial means for novice language learners. In the classroom, if mnemonic skills are used at the appropriate time in the instructional sequence, it may achieve their maximum effectiveness. For Chinese-speaking students, proper use of Chinese-English homophonic word pairs may help them learn difficult vocabulary. In this study, a database program is developed by employing Visual Basic. The database contains two corpora, one with Chinese lexical items and the other with English ones. The Chinese corpus contains 59,053 Chinese words that were collected by a web crawler. The pronunciations of this group of words are compared with words in an English corpus based on WordNet, a lexical database for the English language. Words in both databases with similar pronunciation chunks and batches are detected. A total of approximately 1,000 Chinese lexical items are located in the preliminary comparison. These homophonic word pairs can serve as a valuable tool to assist Chinese-speaking students in learning and memorizing new English vocabulary.

Keywords: Chinese, corpus, English, homophonic words, vocabulary

Procedia PDF Downloads 157
27248 Examining the Effects of Increasing Lexical Retrieval Attempts in Tablet-Based Naming Therapy for Aphasia

Authors: Jeanne Gallee, Sofia Vallila-Rohter

Abstract:

Technology-based applications are increasingly being utilized in aphasia rehabilitation as a means of increasing intensity of treatment and improving accessibility to treatment. These interactive therapies, often available on tablets, lead individuals to complete language and cognitive rehabilitation tasks that draw upon skills such as the ability to name items, recognize semantic features, count syllables, rhyme, and categorize objects. Tasks involve visual and auditory stimulus cues and provide feedback about the accuracy of a person’s response. Research has begun to examine the efficacy of tablet-based therapies for aphasia, yet much remains unknown about how individuals interact with these therapy applications. Thus, the current study aims to examine the efficacy of a tablet-based therapy program for anomia, further examining how strategy training might influence the way that individuals with aphasia engage with and benefit from therapy. Individuals with aphasia are enrolled in one of two treatment paradigms: traditional therapy or strategy therapy. For ten weeks, all participants receive 2 hours of weekly in-house therapy using Constant Therapy, a tablet-based therapy application. Participants are provided with iPads and are additionally encouraged to work on therapy tasks for one hour a day at home (home logins). For those enrolled in traditional therapy, in-house sessions involve completing therapy tasks while a clinician researcher is present. For those enrolled in the strategy training group, in-house sessions focus on limiting cue use in order to maximize lexical retrieval attempts and naming opportunities. The strategy paradigm is based on the principle that retrieval attempts may foster long-term naming gains. Data have been collected from 7 participants with aphasia (3 in the traditional therapy group, 4 in the strategy training group). We examine cue use, latency of responses and accuracy through the course of therapy, comparing results across group and setting (in-house sessions vs. home logins).

Keywords: aphasia, speech-language pathology, traumatic brain injury, language

Procedia PDF Downloads 181
27247 Multimodal Discourse, Logic of the Analysis of Transmedia Strategies

Authors: Bianca Suárez Puerta

Abstract:

Multimodal discourse refers to a method of study the media continuum between reality, screens as a device, audience, author, and media as a production from the audience. For this study we used semantic differential, a method proposed in the sixties by Osgood, Suci and Tannenbaum, starts from the assumption that under each particular way of perceiving the world, in each singular idea, there is a common cultural meaning that organizes experiences. In relation to these shared symbolic dimension, this method has had significant results, as it focuses on breaking down the meaning of certain significant acts into series of statements that place the subjects in front of some concepts. In Colombia, in 2016, a tool was designed to measure the meaning of a multimodal production, specially the acts of sense of transmedia productions that managed to receive funds from the Ministry of ICT of Colombia, and also, to analyze predictable patterns that can be found in calls and funds aimed at the production of culture in Colombia, in the context of the peace agreement, as a request for expressions from a hegemonic place, seeking to impose a worldview.

Keywords: semantic differential, semiotics, transmedia, critical analysis of discourse

Procedia PDF Downloads 190
27246 A Semantic Registry to Support Brazilian Aeronautical Web Services Operations

Authors: Luís Antonio de Almeida Rodriguez, José Maria Parente de Oliveira, Ednelson Oliveira

Abstract:

In the last two decades, the world’s aviation authorities have made several attempts to create consensus about a global and accepted approach for applying semantics to web services registry descriptions. This problem has led communities to face a fat and disorganized infrastructure to describe aeronautical web services. It is usual for developers to implement ad-hoc connections among consumers and providers and manually create non-standardized service compositions, which need some particular approach to compose and semantically discover a desired web service. Current practices are not precise and tend to focus on lightweight specifications of some parts of the OWL-S and embed them into syntactic descriptions (SOAP artifacts and OWL language). It is necessary to have the ability to manage the use of both technologies. This paper presents an implementation of the ontology OWL-S that describes a Brazilian Aeronautical Web Service Registry, which makes it able to publish, advertise, make multi-criteria semantic discovery aligned with the ideas of the System Wide Information Management (SWIM) Program, and invoke web services within the Air Traffic Management context. The proposal’s best finding is a generic approach to describe semantic web services. The paper also presents a set of functional requirements to guide the ontology development and to compare them to the results to validate the implementation of the OWL-S Ontology.

Keywords: aeronautical web services, OWL-S, semantic web services discovery, ontologies

Procedia PDF Downloads 67
27245 The Impact of Trait and Mathematical Anxiety on Oscillatory Brain Activity during Lexical and Numerical Error-Recognition Tasks

Authors: Alexander N. Savostyanov, Tatyana A. Dolgorukova, Elena A. Esipenko, Mikhail S. Zaleshin, Margherita Malanchini, Anna V. Budakova, Alexander E. Saprygin, Yulia V. Kovas

Abstract:

The present study compared spectral-power indexes and cortical topography of brain activity in a sample characterized by different levels of trait and mathematical anxiety. 52 healthy Russian-speakers (age 17-32; 30 males) participated in the study. Participants solved an error recognition task under 3 conditions: A lexical condition (simple sentences in Russian), and two numerical conditions (simple arithmetic and complicated algebraic problems). Trait and mathematical anxiety were measured using self-repot questionnaires. EEG activity was recorded simultaneously during task execution. Event-related spectral perturbations (ERSP) were used to analyze spectral-power changes in brain activity. Additionally, sLORETA was applied in order to localize the sources of brain activity. When exploring EEG activity recorded after tasks onset during lexical conditions, sLORETA revealed increased activation in frontal and left temporal cortical areas, mainly in the alpha/beta frequency ranges. When examining the EEG activity recorded after task onset during arithmetic and algebraic conditions, additional activation in delta/theta band in the right parietal cortex was observed. The ERSP plots reveled alpha/beta desynchronizations within a 500-3000 ms interval after task onset and slow-wave synchronization within an interval of 150-350 ms. Amplitudes of these intervals reflected the accuracy of error recognition, and were differently associated with the three (lexical, arithmetic and algebraic) conditions. The level of trait anxiety was positively correlated with the amplitude of alpha/beta desynchronization. The level of mathematical anxiety was negatively correlated with the amplitude of theta synchronization and of alpha/beta desynchronization. Overall, trait anxiety was related with an increase in brain activation during task execution, whereas mathematical anxiety was associated with increased inhibitory-related activity. We gratefully acknowledge the support from the №11.G34.31.0043 grant from the Government of the Russian Federation.

Keywords: anxiety, EEG, lexical and numerical error-recognition tasks, alpha/beta desynchronization

Procedia PDF Downloads 506
27244 Semantic Based Analysis in Complaint Management System with Analytics

Authors: Francis Alterado, Jennifer Enriquez

Abstract:

Semantic Based Analysis in Complaint Management System with Analytics is an enhanced tool of providing complaints by the clients as well as a mechanism for Palawan Polytechnic College to gather, process, and monitor status of these complaints. The study has a mobile application that serves as a remote facility of communication between the students and the school management on the issues encountered by the student and the solution of every complaint received. In processing the complaints, text mining and clustering algorithms were utilized. Every module of the systems was tested and based on the results; these are 100% free from error before integration was done. A system testing was also done by checking the expected functionality of the system which was 100% functional. The system was tested by 10 students by forwarding complaints to 10 departments. Based on results, the students were able to submit complaints, the system was able to process accordingly by identifying to which department the complaints are intended, and the concerned department was able to give feedback on the complaint received to the student. With this, the system gained 4.7 rating which means Excellent.

Keywords: technology adoption, emerging technology, issues challenges, algorithm, text mining, mobile technology

Procedia PDF Downloads 176
27243 Evaluation and Compression of Different Language Transformer Models for Semantic Textual Similarity Binary Task Using Minority Language Resources

Authors: Ma. Gracia Corazon Cayanan, Kai Yuen Cheong, Li Sha

Abstract:

Training a language model for a minority language has been a challenging task. The lack of available corpora to train and fine-tune state-of-the-art language models is still a challenge in the area of Natural Language Processing (NLP). Moreover, the need for high computational resources and bulk data limit the attainment of this task. In this paper, we presented the following contributions: (1) we introduce and used a translation pair set of Tagalog and English (TL-EN) in pre-training a language model to a minority language resource; (2) we fine-tuned and evaluated top-ranking and pre-trained semantic textual similarity binary task (STSB) models, to both TL-EN and STS dataset pairs. (3) then, we reduced the size of the model to offset the need for high computational resources. Based on our results, the models that were pre-trained to translation pairs and STS pairs can perform well for STSB task. Also, having it reduced to a smaller dimension has no negative effect on the performance but rather has a notable increase on the similarity scores. Moreover, models that were pre-trained to a similar dataset have a tremendous effect on the model’s performance scores.

Keywords: semantic matching, semantic textual similarity binary task, low resource minority language, fine-tuning, dimension reduction, transformer models

Procedia PDF Downloads 182
27242 English Pashto Contact: Morphological Adaptation of Bilingual Compound Words in Pashto

Authors: Imran Ullah Imran

Abstract:

Language contact is a familiar concept in the present global world. Across the globe, languages get mixed up at different levels. Borrowing, code-switching are some of the means through which languages interact. This study examines Pashto-English contact at word and syllable levels. By recording the speech of 30 Pashto native speakers, selected via 'social network' sampling, the study located a number of Pashto-English compound words, which is a unique contact of its kind. In data analysis, tokens were categorized on the basis of their pattern and morphological structure. The study shows that Pashto-English Bilingual Compound words (BCWs) are very prevalent in the Pashto language. The study also found that the BCWs in Pashto are completely productive and have their own meanings. It also shows that the dominant pattern of hybrid words in Pashto is the conjugation of an independent English root word followed by a Pashto inflectional morpheme, which contributes to the core semantic content of the construction. The BCWs construction shows that how both the languages are closer to each other. Pashto-English contact results into bilingual compound and hybrid words, which forms a considerable number of tokens in the present-day spoken Pashto. On the basis of these findings, the study assumes that the same phenomenon may increase with the passage of time that would, in turn, result in the formation of more bilingual compound or hybrid words.

Keywords: code-mixing, bilingual compound words, pashto-english contact, hybrid words, inflectional lexical morpheme

Procedia PDF Downloads 223
27241 Contentious Issues Concerning the Methodology of Using the Lexical Approach in Teaching ESP

Authors: Elena Krutskikh, Elena Khvatova

Abstract:

In tertiary settings expanding students’ vocabulary and teaching discursive competence is seen as one of the chief goals of a professional development course. However, such a focus often is detrimental to students’ cognitive competences, such as analysis, synthesis, and creative processing of information, and deprives students of motivation for self-improvement and self-development of language skills. The presentation is going to argue that in an ESP course special attention should be paid to reading/listening which can promote understanding and using the language as a tool for solving significant real world problems, including professional ones. It is claimed that in the learning process it is necessary to maintain a balance between the content and the linguistic aspect of the educational process as language acquisition is inextricably linked with mental activity and the need to express oneself is a primary stimulus for using a language. A study conducted among undergraduates indicates that they place a premium on quality materials that motivate them and stimulate their further linguistic and professional development. Thus, more demands are placed on study materials that should contain new information for students and serve not only as a source of new vocabulary but also prepare them for real tasks related to professional activities.

Keywords: critical reading, english for professional development, english for specific purposes, high order thinking skills, lexical approach, vocabulary acquisition

Procedia PDF Downloads 147
27240 Characteristic Sentence Stems in Academic English Texts: Definition, Identification, and Extraction

Authors: Jingjie Li, Wenjie Hu

Abstract:

Phraseological units in academic English texts have been a central focus in recent corpus linguistic research. A wide variety of phraseological units have been explored, including collocations, chunks, lexical bundles, patterns, semantic sequences, etc. This paper describes a special category of clause-level phraseological units, namely, Characteristic Sentence Stems (CSSs), with a view to describing their defining criteria and extraction method. CSSs are contiguous lexico-grammatical sequences which contain a subject-predicate structure and which are frame expressions characteristic of academic writing. The extraction of CSSs consists of six steps: Part-of-speech tagging, n-gram segmentation, structure identification, significance of occurrence calculation, text range calculation, and overlapping sequence reduction. Significance of occurrence calculation is the crux of this study. It includes the computing of both the internal association and the boundary independence of a CSS and tests the occurring significance of the CSS from both inside and outside perspectives. A new normalization algorithm is also introduced into the calculation of LocalMaxs for reducing overlapping sequences. It is argued that many sentence stems are so recurrent in academic texts that the most typical of them have become the habitual ways of making meaning in academic writing. Therefore, studies of CSSs could have potential implications and reference value for academic discourse analysis, English for Academic Purposes (EAP) teaching and writing.

Keywords: characteristic sentence stem, extraction method, phraseological unit, the statistical measure

Procedia PDF Downloads 145
27239 Extracting Actions with Improved Part of Speech Tagging for Social Networking Texts

Authors: Yassine Jamoussi, Ameni Youssfi, Henda Ben Ghezala

Abstract:

With the growing interest in social networking, the interaction of social actors evolved to a source of knowledge in which it becomes possible to perform context aware-reasoning. The information extraction from social networking especially Twitter and Facebook is one of the problems in this area. To extract text from social networking, we need several lexical features and large scale word clustering. We attempt to expand existing tokenizer and to develop our own tagger in order to support the incorrect words currently in existence in Facebook and Twitter. Our goal in this work is to benefit from the lexical features developed for Twitter and online conversational text in previous works, and to develop an extraction model for constructing a huge knowledge based on actions

Keywords: social networking, information extraction, part-of-speech tagging, natural language processing

Procedia PDF Downloads 280
27238 Clarifier Dialogue Interface to resolve linguistic ambiguities in E-Learning Environment

Authors: Dalila Souilem, Salma Boumiza, Abdelkarim Abdelkader

Abstract:

The Clarifier Dialogue Interface (CDI) is a part of an online teaching system based on human-machine communication in learning situation. This interface used in the system during the learning action specifically in the evaluation step, to clarify ambiguities in the learner's response. The CDI can generate patterns allowing access to an information system, using the selectors associated with lexical units. To instantiate these patterns, the user request (especially learner’s response), must be analyzed and interpreted to deduce the canonical form, the semantic form and the subject of the sentence. For the efficiency of this interface at the interpretation level, a set of substitution operators is carried out in order to extend the possibilities of manipulation with a natural language. A second approach that will be presented in this paper focuses on the object languages with new prospects such as combination of natural language with techniques of handling information system in the area of online education. So all operators, the CDI and other interfaces associated to the domain expertise and teaching strategies will be unified using FRAME representation form.

Keywords: dialogue, e-learning, FRAME, information system, natural language

Procedia PDF Downloads 354
27237 Comparative between Different Methodological Procedures Used to Obtain Information on the First Lexical Development in Bilingual Basque-Spanish Children

Authors: Asier Romero Andonegi, Irati De Pablo Delgado

Abstract:

The objective of this study is to explore the different methodological procedures that are used to obtain information on the early linguistic development of children. To this end, two different methodological procedures were carried out on the same sample: on the one hand, the MacArthur-Bates Communicative Development Inventories, in its adaptations in Spanish and Basque; and on the other hand, longitudinal observation through professional software: ELAN and CHAT. The sample consists of 8 Basque children/ages 16 to 30 months with different mother tongue (L1). The results show the usefulness of inventories in obtaining information on the development of early communication and language skills, but also their limitations mostly focused on the interpretive overvaluation of their children’s lexical development.

Keywords: early language development, language evaluation, lexicon, MacArthur-Bates communicative development inventories

Procedia PDF Downloads 135
27236 An Intensional Conceptualization Model for Ontology-Based Semantic Integration

Authors: Fateh Adhnouss, Husam El-Asfour, Kenneth McIsaac, AbdulMutalib Wahaishi, Idris El-Feghia

Abstract:

Conceptualization is an essential component of semantic ontology-based approaches. There have been several approaches that rely on extensional structure and extensional reduction structure in order to construct conceptualization. In this paper, several limitations are highlighted relating to their applicability to the construction of conceptualizations in dynamic and open environments. These limitations arise from a number of strong assumptions that do not apply to such environments. An intensional structure is strongly argued to be a natural and adequate modeling approach. This paper presents a conceptualization structure based on property relations and propositions theory (PRP) to the model ontology that is suitable for open environments. The model extends the First-Order Logic (FOL) notation and defines the formal representation that enables interoperability between software systems and supports semantic integration for software systems in open, dynamic environments.

Keywords: conceptualization, ontology, extensional structure, intensional structure

Procedia PDF Downloads 89
27235 Turkish University Level EFL Learners’ Collocational Knowledge at Receptive and Productive Levels

Authors: Nazife Duygu Bagci

Abstract:

Collocations are an important part of vocabulary knowledge, and it is a subject that has recently attracted attention, while still in need of more research. The aim of this study is to answer three research questions related to the collocational knowledge of Turkish university level EFL learners at different proficiency levels of English. The first research question aims to compare the pre-intermediate (PIN) and the advanced (ADV) level learners’ collocational knowledge at receptive and productive levels. The second one is to analyze the performance of the PIN and the ADV students in two main collocation categories; lexical and grammatical. Lastly, the performance of both groups are focused on to find the collocation type (among verb-noun, adjective- noun, adjective-preposition, noun-preposition collocation types) they show the best performance in. Two offline tests were used to answer these questions. The results show that there is a significant difference between the PIN and the ADV groups at both receptive and productive levels. It can be concluded that proficiency is an important criterion in collocational knowledge, and learners do not necessarily know the collocates of the vocabulary items that they know. Although there is no significant difference between the PIN group’s performance in lexical and grammatical collocations, the ADV group showed a better performance in lexical collocations. Lastly, the PIN group at receptive and the ADV group at both receptive and productive levels showed the best performance in verb-noun collocations, which is in line with the previous research focusing on different collocation types.

Keywords: collocational knowledge, EFL, language proficiency, testing

Procedia PDF Downloads 369
27234 Story of Per-: The Radial Network of One Lithuanian Prefix

Authors: Samanta Kietytė

Abstract:

The object of this study is the verbal derivatives stemming from the Lithuanian prefix per-. The prefix under examination can be classified as prepositional, having descended from the preposition per, thereby sharing the same prototypical meaning – denoting movement OVER. These frequently co-occur within sentences (1). The aim of this paper is to conduct a semantic analysis of the prefix per- and to propose a possible radial network of its meanings. In essence, the aim is to identify the interrelationships existing between its meanings. 1) Jis peršoko per tvorą/ 3SG.NOM.M jump.PST.3 over fence.ACC.SG. /ʻHe jumped over the fenceʼ. The foundation of this work lies in the methodological and theoretical framework of cognitive linguistics. The prototypical meaning of prefixes consistently embodies spatial dimensions that can be described through image schemas. This entails the identification of the trajectory, the landmark, and the relation between them in the situation described by the prefixed verb. The meanings of linguistic units are not perceived as arbitrary, but rather, they are interconnected through semantic motivation. According to this perspective, a singular meaning within linguistic units is considered as prototypical, while additional meanings are descended (not necessarily directly) from it. For example, one of the per- meanings TRANSFER (2) is derived from the prototypical meaning OVER. 2) Prašau persiųsti vadovo laišką man./ Ask.PRS.1 forward.INF manager.GEN.SG email.ACC.SG 1.SG.DAT/ ʻPlease forward the manager‘s email to meʼ. Certain semantic relations are explained by the conceptual metaphor and metonymy theory. For instances, when prefixed verb has a meaning WIN (3) it is related to the prototypical meaning. In this case, the prefixed verb describes situations of winning in various ways. In the prototypical meaning, the trajector moves higher than the landmark, and winning is metaphorically perceived as being higher. 3) Sūnus peraugo tėvą./ Son.NOM.SG outgrow.PST.3 father.ACC.SG/ ʻThe son has outgrown the fatherʼ. The data utilized for this study was collected from the 2014 grammatically annotated text "Lithuanian Web (LithuanianWaC v2)", consisting of 63,645,700 words. Given that the corpus is grammatically lemmatized, the list of the 793 items was obtained using the wordlist function and specifying that verbs starting with per were searched. The list included not only prefixed verbs but also other verbs whose roots have the same letter sequences as prefixes. Also, words with misspellings, without diacritical marks, and words listed for lemmatization errors were rejected, and a total of 475 derivatives were left for further analysis. The semantic analysis revealed that there are 12 distinct meanings of the prefix per-. The spatial meanings were extracted by determining what a trajector is, what a landmark is, and what the relation between them is. The connection between non-spatial meanings and spatial ones occurs through semantic motivation established by identifying elements that correspond to the trajector and landmark. The analysis reveals that there are no strict boundaries among these meanings, instead showing a continuum that encompasses a central core and a peripheral association with their internal structure, i.e., some derivatives are more prototypical of a particular meaning than others.

Keywords: word-formation, cognitive semantics, metaphor, radial networks, prototype theory, prefix

Procedia PDF Downloads 46
27233 Argument Representation in Non-Spatial Motion Bahasa Melayu Based Conceptual Structure Theory

Authors: Nurul Jamilah Binti Rosly

Abstract:

The typology of motion must be understood as a change from one location to another. But from a conceptual point of view, motion can also occur in non-spatial contexts associated with human and social factors. Therefore, from the conceptual point of view, the concept of non-spatial motion involves the movement of time, ownership, identity, state, and existence. Accordingly, this study will focus on the lexical as shared, accept, be, store, and exist as the study material. The data in this study were extracted from the Database of Languages and Literature Corpus Database, Malaysia, which was analyzed using semantics and syntax concepts using Conceptual Structure Theory - Ray Jackendoff (2002). Semantic representations are represented in the form of conceptual structures in argument functions that include functions [events], [situations], [objects], [paths] and [places]. The findings show that the mapping of these arguments comprises three main stages, namely mapping the argument structure, mapping the tree, and mapping the role of thematic items. Accordingly, this study will show the representation of non- spatial Malay language areas.

Keywords: arguments, concepts, constituencies, events, situations, thematics

Procedia PDF Downloads 111
27232 The Presence of Anglicisms in Italian Fashion Magazines and Fashion Blogs

Authors: Vivian Orsi

Abstract:

The present research investigates the lexicon of a fashion magazine, whose universe is very receptive to lexical loans, especially those from English, called Anglicisms. Specifically, we intend to discuss the presence of English items and expressions in the Vogue Italia fashion magazine. Besides, we aim to study the anglicisms used in an Italian fashion blog called The Blonde Salad. Within the discussion of fashion blogs and their contributions to scientific studies, we adopt the theories of Lexicology / Lexicography to define Anglicism (BIDERMAN, 2001), and the observation of its prestige in the Italian Language (ROGATO, 2008; BISETTO, 2003). According to the theoretical basis mentioned, we intend to make a brief analysis of the Anglicisms collected from posts of the first year of existence of such fashion blog, emphasizing also the keywords that have the role to encapsulate the content of the text, allowing the reader to retrieve information from the post of the blog. About the use of English in Italian magazines and blogs, we can affirm that it seems to represent sophistication, assuming the value of prerequisite to participate in the fashion centers of the world. Besides, we believe, as Barthes says (1990, p. 215), that “Fashion does not evolve, it changes: its lexicon is new each year, like that of a language which always keeps the same system but suddenly and regularly ‘changes’ the currency of its words”. Fashion is a mode of communication: it is present in man's interaction with the world, which means that such lexical universe is represented according to the particularities of each culture.

Keywords: anglicism, lexicology, magazines, blogs, fashion

Procedia PDF Downloads 308
27231 Affective Transparency in Compound Word Processing

Authors: Jordan Gallant

Abstract:

In the compound word processing literature, much attention has been paid to the relationship between a compound’s denotational meaning and that of its morphological whole-word constituents, which is referred to as ‘semantic transparency’. However, the parallel relationship between a compound’s connotation and that of its constituents has not been addressed at all. For instance, while a compound like ‘painkiller’ might be semantically transparent, it is not ‘affectively transparent’. That is, both constituents have primarily negative connotations, while the whole compound has a positive one. This paper investigates the role of affective transparency on compound processing using two methodologies commonly employed in this field: a lexical decision task and a typing task. The critical stimuli used were 112 English bi-constituent compounds that differed in terms of the effective transparency of their constituents. Of these, 36 stimuli contained constituents with similar connotations to the compound (e.g., ‘dreamland’), 36 contained constituents with more positive connotations (e.g. ‘bedpan’), and 36 contained constituents with more negative connotations (e.g. ‘painkiller’). Connotation of whole-word constituents and compounds were operationalized via valence ratings taken from an off-line ratings database. In Experiment 1, compound stimuli and matched non-word controls were presented visually to participants, who were then asked to indicate whether it was a real word in English. Response times and accuracy were recorded. In Experiment 2, participants typed compound stimuli presented to them visually. Individual keystroke response times and typing accuracy were recorded. The results of both experiments provided positive evidence that compound processing is influenced by effective transparency. In Experiment 1, compounds in which both constituents had more negative connotations than the compound itself were responded to significantly more slowly than compounds in which the constituents had similar or more positive connotations. Typed responses from Experiment 2 showed that inter-keystroke intervals at the morphological constituent boundary were significantly longer when the connotation of the head constituent was either more positive or more negative than that of the compound. The interpretation of this finding is discussed in the context of previous compound typing research. Taken together, these findings suggest that affective transparency plays a role in the recognition, storage, and production of English compound words. This study provides a promising first step in a new direction for research on compound words.

Keywords: compound processing, semantic transparency, typed production, valence

Procedia PDF Downloads 102
27230 Factorization of Computations in Bayesian Networks: Interpretation of Factors

Authors: Linda Smail, Zineb Azouz

Abstract:

Given a Bayesian network relative to a set I of discrete random variables, we are interested in computing the probability distribution P(S) where S is a subset of I. The general idea is to write the expression of P(S) in the form of a product of factors where each factor is easy to compute. More importantly, it will be very useful to give an interpretation of each of the factors in terms of conditional probabilities. This paper considers a semantic interpretation of the factors involved in computing marginal probabilities in Bayesian networks. Establishing such a semantic interpretations is indeed interesting and relevant in the case of large Bayesian networks.

Keywords: Bayesian networks, D-Separation, level two Bayesian networks, factorization of computation

Procedia PDF Downloads 503
27229 On Supporting a Meta-Design Approach in Socio-Technical Ontology Engineering

Authors: Mesnan Silalahi, Dana Indra Sensuse, Indra Budi

Abstract:

Many research have revealed the fact of the complexity of ontology building process that there is a need to have a new approach which addresses the socio-technical aspects in the collaboration to reach a consensus. Meta-design approach is considered applicable as a method in the methodological model in a socio-technical ontology engineering. Principles in the meta-design framework is applied in the construction phases on the ontology. A portal is developed to support the meta-design principles requirements. To validate the methodological model semantic web applications were developed and integrated in the portal and also used as a way to show the usefulness of the ontology. The knowledge based system will be filled with data of Indonesian medicinal plants. By showing the usefulness of the developed ontology in a web semantic application, we motivate all stakeholders to participate in the development of knowledge based system of medicinal plants in Indonesia.

Keywords: socio-technical, metadesign, ontology engineering methodology, semantic web application

Procedia PDF Downloads 420
27228 The Nature of Borrowings into Arabic during Different Historical Periods

Authors: Maria L. Swanson

Abstract:

Language is a system which constantly changes and reflects social and cultural transformations of a speech community. If it is phonetic system, morphological patterns and syntactic arrangements undergo little charge and are not easily transferable from one language to another, the lexicon has a high degree of flexibility. Borrowings in Arabic have always been an interesting and important subject of study to various fields of linguistics, history and culturology, and there is quite number of works devoted to this subject (al-Khalīl, Sībawīḥ, Jeffery, Belkin, al-Maghribii, Holes, Stetkevich, el-Mawlūdī, between many others). At the same time, the history of borrowing has never been described as a process starting from its originating and up to the present time. Most of the researches study lexical and morphological adaptation of borrowed words for specific or several historical periods or delineate this process on the whole. Meanwhile, we have described the whole history of borrowings in Arabic with the brief depicting of lexical and morphological specifics for each historical period using quantitative method through dividing Arabic borrowings into several groups, basing on the specific of their adaptation of new vocabulary which is tightly related to the global transformations in the Arabic history. We explain reasons for borrowings of specific lexical layers for each historical period together with the description of its morphological specifics. We also use qualitative approach through performing statistics about the share of loan vocabulary in Arabic during different periods and the percentage of borrowings from donor languages. The history of a character and amount of borrowings is a good resource for theoretical and practical lexicography and morphology studies. It is also beneficial for researchers in the field of global and specific national, political and social developments, and different types of contacts.

Keywords: anthropological linguistics, borrowings, historical linguistics, sociolinguistics

Procedia PDF Downloads 422