Search results for: semantic sentiment analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 27197

27017 Parallel Querying of Distributed Ontologies with Shared Vocabulary

Authors: Sharjeel Aslam, Vassil Vassilev, Karim Ouazzane

Abstract:

Ontologies and various semantic repositories have become a convenient approach for implementing model-driven architectures of distributed systems on the Web. SPARQL is the standard query language for such repositories. However, although SPARQL is a well-established standard for querying semantic repositories in RDF and OWL format, and commonly used APIs such as Jena for Java support it, these APIs do not incorporate a parallel query option. This article presents a complete framework consisting of an object algebra for parallel RDF and an index-based implementation of a parallel query engine capable of dealing with distributed RDF ontologies that share a common vocabulary. The framework has been implemented in Java and, to validate the algorithms, has been applied to the problem of organizing virtual exhibitions on the Web.
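
The idea of sending one query to several repositories that share a vocabulary can be illustrated with a minimal Python sketch. This is not the authors' Java engine or their object algebra, and it does not parse SPARQL: the repositories are in-memory lists of triples, the `ex:` names are invented for illustration, and a query is a single triple pattern in which `None` acts as a wildcard.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy data: two repositories that share the predicate "ex:exhibits".
REPO_A = [("ex:museum1", "ex:exhibits", "ex:vase"),
          ("ex:museum1", "ex:locatedIn", "ex:London")]
REPO_B = [("ex:museum2", "ex:exhibits", "ex:coin"),
          ("ex:museum2", "ex:exhibits", "ex:vase")]

def match(repo, pattern):
    """Return the triples of one repository that match a pattern; None is a wildcard."""
    return [t for t in repo
            if all(p is None or p == v for p, v in zip(pattern, t))]

def parallel_query(repos, pattern):
    """Run the same triple pattern against every repository concurrently, then merge."""
    with ThreadPoolExecutor(max_workers=len(repos)) as pool:
        partial_results = list(pool.map(lambda r: match(r, pattern), repos))
    merged = []
    for rows in partial_results:
        merged.extend(rows)
    return merged

# Which subjects exhibit something, across both repositories?
results = parallel_query([REPO_A, REPO_B], (None, "ex:exhibits", None))
```

Because both repositories use the same predicate name, the partial results can be merged without any vocabulary mapping, which is the property the shared vocabulary provides.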

Keywords: distributed ontologies, parallel querying, semantic indexing, shared vocabulary, SPARQL

Procedia PDF Downloads 170
27016 From Shallow Semantic Representation to Deeper One: Verb Decomposition Approach

Authors: Aliaksandr Huminski

Abstract:

Semantic Role Labeling (SRL), as a shallow semantic parsing approach, involves recognizing and labeling the arguments of a verb in a sentence. Verb participants are linked with specific semantic roles (Agent, Patient, Instrument, Location, etc.). Thus, SRL can answer key questions such as ‘Who’, ‘When’, ‘What’, and ‘Where’ in a text, and it is widely applied in dialog systems, question answering, named entity recognition, information retrieval, and other fields of NLP. However, SRL has the following flaw: two sentences with identical (or almost identical) meaning can have different semantic role structures. Consider two sentences: (1) John put butter on the bread. (2) John buttered the bread. The SRL for (1) and (2) will be significantly different. For the verb put in (1) it is [Agent + Patient + Goal], but for the verb butter in (2) it is [Agent + Goal]. This happens because of one of the most interesting and intriguing features of a verb: its ability to capture participants, as in the case of the verb butter, or their features, as in the case of the verb drink, where the participant’s feature of being liquid is shared with the verb. This capture looks like a total fusion of meaning and cannot be decomposed in a direct way (in contrast to compound verbs like babysit or breastfeed). From this perspective, SRL looks too shallow to represent semantic structure. If the key point of a semantic representation is the opportunity to use it for making inferences and finding hidden reasons, it is assumed by default that two different but semantically identical sentences must have the same semantic structure. Otherwise, we will draw different inferences from the same meaning. To overcome the above-mentioned flaw, the following approach is suggested.
Assume that: P is a participant of a relation; F is a feature of a participant; Vcp is a verb that captures a participant; Vcf is a verb that captures a feature of a participant; Vpr is a primitive verb, that is, a verb that does not capture any participant and represents only a relation. In other words, a primitive verb is a verb whose meaning does not include meanings from its surroundings. Then Vcp and Vcf can be decomposed as: Vcp = Vpr + P; Vcf = Vpr + F. If all Vcp and Vcf are represented this way, then the primitive verbs Vpr can be considered a canonical form for SRL. As a result, there will be no hidden participants captured by a verb, since all participants will be explicitly unfolded. An obvious example of Vpr is the verb go, which represents pure movement. In this case, the verb drink can be represented as man-made movement of liquid in a specific direction. Extracting and using primitive verbs for SRL creates a canonical representation that is unique for semantically identical sentences. It leads to the unification of semantic representation. In this case, the critical flaw related to SRL will be resolved.
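
The decomposition Vcp = Vpr + P can be made concrete with a small Python sketch using the two example sentences from the abstract. The one-entry lexicon and the function name are hypothetical, and treating put as the primitive verb behind butter is an assumption made only for illustration; the abstract itself names go as its example of a primitive verb.

```python
# Hypothetical lexicon: each capturing verb maps to
# (primitive verb, role of the captured participant, captured participant).
DECOMPOSITION = {
    "butter": ("put", "Patient", "butter"),  # Vcp = Vpr + P
}

def canonical_frame(verb, roles):
    """Unfold a participant captured by the verb into an explicit semantic role."""
    if verb not in DECOMPOSITION:
        return verb, dict(roles)  # treated as already primitive (Vpr)
    primitive, role, filler = DECOMPOSITION[verb]
    unfolded = dict(roles)
    unfolded[role] = filler  # make the hidden participant explicit
    return primitive, unfolded

# (1) John put butter on the bread.  -> [Agent + Patient + Goal]
s1 = canonical_frame("put", {"Agent": "John", "Patient": "butter", "Goal": "bread"})
# (2) John buttered the bread.       -> [Agent + Goal] before decomposition
s2 = canonical_frame("butter", {"Agent": "John", "Goal": "bread"})
```

After unfolding, both sentences map to the same primitive verb and the same role structure, which is the canonical form the abstract argues for.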

Keywords: decomposition, labeling, primitive verbs, semantic roles

Procedia PDF Downloads 343
27015 Lexical-Semantic Deficits in Sinhala Speaking Persons with Post Stroke Aphasia: Evidence from Single Word Auditory Comprehension Task

Authors: D. W. M. S. Samarathunga, Isuru Dharmarathne

Abstract:

In aphasia, various levels of symbolic language processing (semantics) are affected. It has been shown that Persons with Aphasia (PWA) often experience more problems comprehending some categories of words than others. The study aimed to determine the lexical-semantic deficits seen in Auditory Comprehension (AC) and to describe them across six selected word categories. Thirteen (n = 13) persons diagnosed with post-stroke aphasia (PSA) were recruited to perform an AC task. Foods, objects, clothes, vehicles, body parts, and animals were selected as the six categories. As the test stimuli, black-and-white line drawings were adapted from a picture set developed for semantic studies by Snodgrass and Vanderwart. A pilot study was conducted with five (n = 5) healthy, non-brain-damaged Sinhala-speaking adults to determine the familiarity and applicability of the test material. In the main study, participants were scored based on accuracy and the number of errors shown. The results indicate trends of lexical-semantic deficits similar to those identified in the literature, confirming ‘animals’ to be the easiest category to comprehend. A Mann-Whitney U test was performed to determine the association between the selected variables and the participants’ performance on the AC task. No statistically significant association was found between the errors and the type of aphasia, reflecting patterns similar to those described in the aphasia literature for other languages. The current study indicates the presence of selectivity of lexical-semantic deficits in AC, and a hierarchy was developed based on how difficult the categories are for Sinhala-speaking PWA to comprehend, which might be clinically beneficial when improving the language skills of Sinhala-speaking persons with post-stroke aphasia. However, further studies on aphasia should be conducted with larger samples over a longer period to study deficits in Sinhala and other Sri Lankan languages (Tamil and Malay).

Keywords: aphasia, auditory comprehension, selective lexical-semantic deficits, semantic categories

Procedia PDF Downloads 229
27014 Academic Literacy: Semantic-Discursive Resource and the Relationship with the Constitution of Genre for the Development of Writing

Authors: Lucia Rottava

Abstract:

The present study focuses on academic literacy and addresses the impact of semantic-discursive resources on the constitution of genres that are produced in such a context. The research considers the development of writing in the academic context in Portuguese. Research that addresses academic literacy and the characteristics of the texts produced in this context is rare, particularly research focused on the development of writing that considers three variables: the constitution of the writer, the perception of the reader/interlocutor, and the organization of the informational text flow. The research aims to map the semantic-discursive resources of the written register in texts of several genres produced by students in the first semester of the undergraduate course in Letters. The hypothesis raised is that writing in the academic environment is not a recurrent literacy practice for these learners and can be explained by the ontogenetic and phylogenetic nature of language development. Qualitative in nature, the present research takes as empirical data texts produced in a half-yearly course of Reading and Textual Production; these data result from four different writing proposals, for a total of 600 texts. The corpus is analyzed based on semantic-discursive resources, seeking to contemplate relevant aspects of language (grammar, discourse and social context) that reveal the choices made in the reader/writer interrelationship and the organizational flow of the text. Among the semantic-discursive resources, the analysis covers three: (a) appraisal and negotiation to understand the attitudes negotiated (roles of the participants of the discourse and their relationship with the other); (b) ideation to explain the construction of the experience (activities performed and participants); and (c) periodicity to outline the flow of information in the organization of the text according to the genre it instantiates.
The results indicate the organizational difficulties of the flow of the text information. Cartography contributes to the understanding of the way writers use language in an effort to present themselves, evaluate someone else’s work, and communicate with readers.

Keywords: academic writing, Portuguese mother tongue, semantic-discursive resources, systemic functional linguistics

Procedia PDF Downloads 103
27013 Voice of Customer: Mining Customers' Reviews on On-Line Car Community

Authors: Kim Dongwon, Yu Songjin

Abstract:

This study identifies the business value of VOC (Voice of Customer). Specifically, we intend to demonstrate how much the negative and positive sentiment of VOC influences car sales market share in the United States. We extract seven emotions (sadness, shame, anger, fear, frustration, delight, and satisfaction) from the VOC data, 23,204 opinions that had been posted on a car-related on-line community from 2007 to 2009 (a part of a data collection spanning 2007 to 2015), and intend to clarify the correlation between negative and positive sentiment keywords and their contribution to market share. In order to develop a lexicon for each category of negative and positive sentiment, we took advantage of the corpus program AntConc 3.4.1.w and the on-line sentiment resource SentiWordNet, and identified the part-of-speech (POS) information of words in the customers' opinions by using a part-of-speech tagging function provided by TextAnalysisOnline. For the purpose of the present study, a total of 45,741 customers' opinions on 28 car manufacturing companies had been collected, including titles and status information. We conducted an experiment to examine whether the inclusion, frequency, and intensity of terms with negative and positive emotions in each category affect the adoption of customer opinions for vehicle organizations' market share. In the experiment, we statistically verified that there is a correlation between customer ideas containing negative and positive emotions and variation in market share. In particular, "Anger," one of the negative domains, significantly influences car sales market share. The domains "Delight" and "Satisfaction" increased in proportion to the growth of market share.

Keywords: data mining, opinion mining, sentiment analysis, VOC

Procedia PDF Downloads 194
27012 Exploring Tweeters’ Concerns and Opinions about FIFA Arab Cup 2021: An Investigation Study

Authors: Md. Rafiul Biswas, Uzair Shah, Mohammad Alkayal, Zubair Shah, Othman Althawadi, Kamila Swart

Abstract:

Background: Social media platforms play a significant role in the mediated consumption of sport, especially for sport mega-events. The characteristics of Twitter data (e.g., user mentions, retweets, likes, #hashtag) bring users together in one place and spread information widely and quickly. Analysis of Twitter data can reflect public attitudes, behavior, and sentiment toward a specific event on a larger scale than traditional surveys. Qatar is going to be the first Arab country to host the mega sports event FIFA World Cup 2022 (Q22). Qatar has hosted the FIFA Arab Cup 2021 (FAC21) to serve as preparation for the mega-event. Objectives: This study investigates public sentiments and experiences about FAC21 and provides insight to enhance the public experience for the upcoming Q22. Method: FAC21-related tweets posted between 01 October 2021 and 18 February 2022 were downloaded using the Twitter Academic Research API. Tweets were divided into three periods: before FAC21, T1 (01 Oct 2021 to 29 Nov 2021); during FAC21, T2 (30 Nov 2021 to 18 Dec 2021); and after FAC21, T3 (19 Dec 2021 to 18 Feb 2022). The collected tweets were preprocessed in several steps to prepare them for analysis: (1) duplicates and retweets were removed, (2) emojis, punctuation, and stop words were removed, and (3) tweets were normalized using word lemmatization. Then, rule-based classification was applied to remove irrelevant tweets. Next, the twitter-XLM-roBERTa-base model from Hugging Face was applied to identify the sentiment in the tweets. Further, state-of-the-art BERTopic modeling will be applied to identify trending topics over the different periods. Results: We downloaded 8,669,875 tweets posted by 2,728,220 unique users in different languages. Of those, 819,813 unique English tweets were selected for this study. After splitting into three periods, 541,630, 138,876, and 139,307 tweets were from T1, T2, and T3, respectively. Most of the sentiments were neutral, around 60% in the different periods.
However, the rate of negative sentiment (23%) was high compared to positive sentiment (18%). The analysis indicates negative concerns about FAC21. Therefore, we will apply BERTopic to identify public concerns. This study will permit the investigation of people’s expectations before FAC21 (e.g., stadium, transportation, accommodation, visa, tickets, travel, and other facilities) and ascertain whether these were met. Moreover, it will highlight public expectations and concerns. The findings of this study can assist the event organizers in enhancing implementation plans for Q22. Furthermore, this study can support policymakers in aligning strategies and plans to leverage outstanding outcomes.
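
The preprocessing steps and the period split described in the Method can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's pipeline: the stop-word list is a tiny invented sample, emoji removal is approximated by dropping all non-ASCII characters (which suits English-only text), retweets are detected by a leading "RT @", and lemmatization, rule-based filtering, and the sentiment model are left out. The example tweets are made up.

```python
import string
from datetime import date

STOP_WORDS = {"the", "a", "an", "is", "in", "at", "of", "to", "and"}  # tiny illustrative list

def clean(text):
    """Drop non-ASCII symbols (crude emoji removal), punctuation, and stop words."""
    text = text.encode("ascii", "ignore").decode()
    text = text.translate(str.maketrans("", "", string.punctuation)).lower()
    return " ".join(w for w in text.split() if w not in STOP_WORDS)

def preprocess(tweets):
    """Remove retweets and duplicates, then clean the remaining tweet texts."""
    seen, kept = set(), []
    for day, text in tweets:
        if text.startswith("RT @"):
            continue  # retweet
        cleaned = clean(text)
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            kept.append((day, cleaned))
    return kept

def period(day):
    """Assign T1/T2/T3 using the date ranges given in the abstract."""
    if day <= date(2021, 11, 29):
        return "T1"
    if day <= date(2021, 12, 18):
        return "T2"
    return "T3"

tweets = [
    (date(2021, 11, 1), "The stadium is amazing! \u26bd"),
    (date(2021, 11, 2), "The stadium is amazing!"),
    (date(2021, 12, 5), "RT @fan: great match"),
    (date(2022, 1, 10), "Tickets to the final?"),
]
kept = preprocess(tweets)
periods = [period(day) for day, _ in kept]
```

Deduplicating on the cleaned text rather than the raw text means that two tweets differing only in emojis or punctuation count as one, which is a design choice of this sketch rather than something the abstract specifies.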

Keywords: FIFA Arab Cup, FIFA, Twitter, machine learning

Procedia PDF Downloads 71
27011 Sentiment Analysis of Ensemble-Based Classifiers for E-Mail Data

Authors: Muthukumarasamy Govindarajan

Abstract:

Detection of unwanted, unsolicited mails called spam from email is an interesting area of research. It is necessary to evaluate the performance of any new spam classifier using standard data sets. Recently, ensemble-based classifiers have gained popularity in this domain. In this research work, an efficient email filtering approach based on ensemble methods is addressed for developing an accurate and sensitive spam classifier. The proposed approach employs Naive Bayes (NB), Support Vector Machine (SVM) and Genetic Algorithm (GA) as base classifiers along with different ensemble methods. The experimental results show that the ensemble classifier was performing with accuracy greater than individual classifiers, and also hybrid model results are found to be better than the combined models for the e-mail dataset. The proposed ensemble-based classifiers turn out to be good in terms of classification accuracy, which is considered to be an important criterion for building a robust spam classifier.

Keywords: accuracy, arcing, bagging, genetic algorithm, Naive Bayes, sentiment mining, support vector machine

Procedia PDF Downloads 115
27010 The Oral Production of University EFL Students: An Analysis of Tasks, Format, and Quality in Foreign Language Development

Authors: Vera Lucia Teixeira da Silva, Sandra Regina Buttros Gattolin de Paula

Abstract:

The present study focuses on academic literacy and addresses the impact of semantic-discursive resources on the constitution of genres that are produced in such a context. The research considers the development of writing in the academic context in Portuguese. Research that addresses academic literacy and the characteristics of the texts produced in this context is rare, particularly research focused on the development of writing that considers three variables: the constitution of the writer, the perception of the reader/interlocutor, and the organization of the informational text flow. The research aims to map the semantic-discursive resources of the written register in texts of several genres produced by students in the first semester of the undergraduate course in Letters. The hypothesis raised is that writing in the academic environment is not a recurrent literacy practice for these learners and can be explained by the ontogenetic and phylogenetic nature of language development. Qualitative in nature, the present research takes as empirical data texts produced in a half-yearly course of Reading and Textual Production; these data result from four different writing proposals, for a total of 600 texts. The corpus is analyzed based on semantic-discursive resources, seeking to contemplate relevant aspects of language (grammar, discourse and social context) that reveal the choices made in the reader/writer interrelationship and the organizational flow of the text. Among the semantic-discursive resources, the analysis covers three: (a) appraisal and negotiation to understand the attitudes negotiated (roles of the participants of the discourse and their relationship with the other); (b) ideation to explain the construction of the experience (activities performed and participants); and (c) periodicity to outline the flow of information in the organization of the text according to the genre it instantiates.
The results indicate the organizational difficulties of the flow of the text information. Cartography contributes to the understanding of the way writers use language in an effort to present themselves, evaluate someone else’s work, and communicate with readers.

Keywords: academic writing, Portuguese mother tongue, semantic-discursive resources, academic context

Procedia PDF Downloads 94
27009 The Culture of Journal Writing among Manobo Senior High School Students

Authors: Jessevel Montes

Abstract:

This study explored the culture of journal writing among Senior High School Manobo students. The purpose of this qualitative morpho-semantic and syntactic study was to discover the morphological, semantic, and syntactic features of the written output through the morphological, semantic, and syntactic categories present in the students' journal writings. Also, beliefs and practices embedded in the norms, values, and ideologies were identified. The study was conducted among the Manobo students in the Senior High Schools of Central Mindanao, particularly in the Division of North Cotabato. Findings revealed that, morphologically, the features that flourished are the following: subject-verb concordance, tenses, pronouns, prepositions, articles, and the use of adjectives. Semantically, the features are the following: word choice, idiomatic expression, borrowing, and vernacular. Syntactically, the features are the types of sentences according to structure and function, and the dominance of code switching and run-on sentences. Lastly, as to the beliefs and practices embedded in the norms, values, and ideologies of their journal writing, the major themes are: valuing education, family, and friends as treasure; preservation of culture; and emancipation from the bondage of poverty. This study has shed light on the writing capabilities and weaknesses of the Manobo students when it comes to the English language. Further, such insight into language learning problems is useful to teachers because it provides information on common trouble spots in language learning, which can be used in the preparation of effective teaching materials.

Keywords: applied linguistics, culture, morpho-semantic and syntactic analysis, Manobo Senior High School, Philippines

Procedia PDF Downloads 92
27008 Semantic Processing in Chinese: Category Effects, Task Effects and Age Effects

Authors: Yi-Hsiu Lai

Abstract:

The present study aimed to elucidate the nature of semantic processing in Chinese. Language and cognition related to the issue of aging are examined from the perspective of picture naming and category fluency tasks. Twenty Chinese-speaking adults (ranging from 25 to 45 years old) and twenty Chinese-speaking seniors (ranging from 65 to 75 years old) in Taiwan participated in this study. Each of them individually completed two tasks: a picture naming task and a category fluency task. The instruments for the naming task were sixty black-and-white pictures: thirty-five object and twenty-five action pictures. The category fluency task also consisted of two semantic categories: objects (or nouns) and actions (or verbs). Participants were asked to report as many items within a category as possible in one minute. Scores of action fluency and of object fluency were a summation of correct responses in these two categories. Category effects (actions vs. objects) and age effects were examined in these tasks. Objects were further divided into two major types: living objects and non-living objects. Actions were also categorized into two major types: action verbs and process verbs. Reaction time to each picture/question was additionally calculated and analyzed. Results of the category fluency task indicated that the content of information in Chinese seniors had comparatively deteriorated, so that they produced a smaller number of semantic-lexical items. A significant group difference was also found in the results of reaction time. The category effect was significant for both Chinese adults and seniors in the semantic fluency task. The findings of the present study helped characterize the nature of semantic processing in Chinese-speaking adults and seniors and contributed to the issue of language and aging.

Keywords: semantic processing, aging, Chinese, category effects

Procedia PDF Downloads 336
27007 Semantic Search Engine Based on Query Expansion with Google Ranking and Similarity Measures

Authors: Ahmad Shahin, Fadi Chakik, Walid Moudani

Abstract:

Our study elaborates a potential solution for a search engine that uses semantic technology to retrieve information and display it meaningfully. Semantic search engines are not widely used on the web, as the majority are still in beta stage or under construction. Current semantic search applications face many problems; the major one is analyzing and calculating the meaning of a query in order to retrieve relevant information. Another problem is the ontology-based index and its updates. Ranking results according to the meaning of concepts and their relation to the query is another challenge. In this paper, we offer a light meta-engine (QESM) that uses Google search, and therefore Google’s index, and adapts the returned results by adding multi-query expansion. The goal was to find a reliable ranking algorithm that involves semantics and uses concepts and meanings to rank results. First, the engine finds synonyms of each query term entered by the user based on a lexical database. Then, query expansion is applied to generate different semantically analogous sentences. These are generated randomly by combining the found synonyms and the original query terms. Our model suggests the use of semantic similarity measures between two sentences. In practice, we used this method to calculate the semantic similarity between each query and the description of each page’s content generated by Google. The generated sentences are sent to the Google engine one by one, and the results are ranked again all together with the adapted ranking method (QESM). Finally, our system places the Google pages with higher similarities at the top of the results. We conducted experiments with 6 different queries. We observed that most results ranked with QESM differed in position from Google’s originally returned pages. With the queries we tested, QESM frequently achieves better accuracy than Google. In the worst cases, it behaves like Google.
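
The expand-then-rerank pipeline can be sketched in Python as follows. Everything here is a stand-in chosen for illustration and should not be read as QESM itself: the synonym table replaces the lexical database, the snippets replace Google's page descriptions, word-overlap (Jaccard) similarity replaces the semantic similarity measure, and the query variants are enumerated exhaustively rather than sampled randomly as the abstract describes.

```python
from itertools import product

# Hypothetical synonym table standing in for the lexical database.
SYNONYMS = {"car": ["automobile"], "cheap": ["inexpensive"]}

def expand(query):
    """Generate every query variant by swapping terms for their synonyms."""
    options = [[w] + SYNONYMS.get(w, []) for w in query.lower().split()]
    return [" ".join(combo) for combo in product(*options)]

def jaccard(a, b):
    """Word-overlap similarity; a simple placeholder for a semantic similarity measure."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def rerank(query, snippets):
    """Score each snippet by its best similarity to any expanded query, highest first."""
    variants = expand(query)
    scored = [(max(jaccard(v, s) for v in variants), s) for s in snippets]
    return [s for _, s in sorted(scored, key=lambda x: x[0], reverse=True)]

snippets = ["inexpensive automobile deals", "history of the bicycle", "cheap flights"]
ranked = rerank("cheap car", snippets)
```

In this toy run the snippet that shares no literal word with the original query still ranks first, because one of the expanded variants matches it; that is the effect query expansion is meant to achieve.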

Keywords: semantic search engine, Google indexing, query expansion, similarity measures

Procedia PDF Downloads 402
27006 Exploring Public Opinions Toward the Use of Generative Artificial Intelligence Chatbot in Higher Education: An Insight from Topic Modelling and Sentiment Analysis

Authors: Samer Muthana Sarsam, Abdul Samad Shibghatullah, Chit Su Mon, Abd Aziz Alias, Hosam Al-Samarraie

Abstract:

Generative Artificial Intelligence chatbots (GAI chatbots) have emerged as promising tools in various domains, including higher education. However, their specific role within the educational context and the level of legal support for their implementation remain unclear. Therefore, this study aims to investigate the role of Bard, a newly developed GAI chatbot, in higher education. To achieve this objective, English tweets were collected from Twitter's free streaming Application Programming Interface (API). The Latent Dirichlet Allocation (LDA) algorithm was applied to extract latent topics from the collected tweets. User sentiments, including disgust, surprise, sadness, anger, fear, joy, anticipation, and trust, as well as positive and negative sentiments, were extracted using the NRC Affect Intensity Lexicon and SentiStrength tools. This study explored the benefits, challenges, and future implications of integrating GAI chatbots in higher education. The findings shed light on the potential power of such tools, exemplified by Bard, in enhancing the learning process and providing support to students throughout their educational journey.

Keywords: generative artificial intelligence chatbots, bard, higher education, topic modelling, sentiment analysis

Procedia PDF Downloads 52
27005 An Ontology for Semantic Enrichment of RFID Systems

Authors: Haitham S. Hamza, Mohamed Maher, Shourok Alaa, Aya Khattab, Hadeal Ismail, Kamilia Hosny

Abstract:

Radio Frequency Identification (RFID) has become a key technology in the emerging concept of the Internet of Things (IoT). Naturally, business applications would require the deployment of various RFID systems that are developed by different vendors and use various data formats. This heterogeneity poses a real challenge in developing large-scale IoT systems with RFID, as integration is becoming very complex and challenging. Semantic integration is a key approach to dealing with this challenge. To do so, an ontology for RFID systems needs to be developed in order to semantically annotate RFID systems and hence facilitate their integration. Accordingly, in this paper, we propose an ontology for RFID systems. The proposed ontology can be used to semantically enrich RFID systems and hence improve their usage and reasoning. The usage of the proposed ontology is explained through a simple scenario in the health care domain.

Keywords: RFID, semantic technology, ontology, SPARQL query language, heterogeneity

Procedia PDF Downloads 443
27004 New Ways of Vocabulary Enlargement

Authors: S. Pesina, T. Solonchak

Abstract:

Lexical invariants, being a sort of stereotype within the frames of ordinary consciousness, are created by the members of a language community as a result of a uniform division of reality. The invariant meaning is formed in a person’s mind gradually in the course of different actualizations of secondary meanings in various contexts. We understand the lexical invariant as an abstract language essence containing a set of semantic components. In one of its configurations, it is the basis of all or a number of the meanings making up the semantic structure of the word.

Keywords: lexical invariant, invariant theories, polysemantic word, cognitive linguistics

Procedia PDF Downloads 288
27003 An Online Adaptive Thresholding Method to Classify Google Trends Data Anomalies for Investor Sentiment Analysis

Authors: Duygu Dere, Mert Ergeneci, Kaan Gokcesu

Abstract:

Google Trends data has gained increasing popularity in the applications of behavioral finance, decision science, and risk management. Because of Google’s wide range of use, the Trends statistics provide significant information about investor sentiment and intention, which can be used as decisive factors in the corporate and risk management fields. However, an anomaly (a significant increase or decrease) in a certain query cannot be detected by state-of-the-art computational applications because of the random baseline noise of the Trends data, which is modelled as additive white Gaussian noise (AWGN). Since the baseline noise power changes gradually over time, an adaptive thresholding method is required to track and learn the baseline noise for correct classification. To this end, we introduce an online method to classify meaningful deviations in Google Trends data. Through extensive experiments, we demonstrate that our method can successfully classify various anomalies for a wide variety of data.
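
The general idea of tracking a slowly changing noise baseline online and flagging points that exceed an adaptive threshold can be sketched in Python. This is not the authors' method, whose keywords point to convex optimization and soft minimum thresholding; it is a simple exponentially weighted estimate of mean and variance, and the parameter names `alpha`, `k`, and `warmup` as well as the sample series are assumptions made for illustration.

```python
def detect_anomalies(series, alpha=0.1, k=3.0, warmup=5):
    """Flag points deviating more than k estimated standard deviations from a running mean."""
    mean, var = series[0], 0.0
    flags = [False]
    for i, x in enumerate(series[1:], start=1):
        deviation = x - mean
        # The threshold adapts because var is re-estimated at every normal point.
        is_anomaly = i >= warmup and abs(deviation) > k * var ** 0.5
        flags.append(is_anomaly)
        if not is_anomaly:
            # Update the baseline only with normal points so that
            # spikes do not inflate the noise estimate.
            mean += alpha * deviation
            var = (1 - alpha) * (var + alpha * deviation ** 2)
    return flags

# Hypothetical search-interest values with one spike.
series = [10, 11, 9, 10, 11, 10, 9, 30, 10, 11]
flags = detect_anomalies(series)
anomalous_indices = [i for i, flagged in enumerate(flags) if flagged]
```

A fixed threshold would need retuning whenever the noise power drifts; re-estimating the variance at each step lets the threshold follow that drift, at the cost of a short warm-up period during which nothing is flagged.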

Keywords: adaptive data processing, behavioral finance, convex optimization, online learning, soft minimum thresholding

Procedia PDF Downloads 136
27002 A Spatial Hypergraph Based Semi-Supervised Band Selection Method for Hyperspectral Imagery Semantic Interpretation

Authors: Akrem Sellami, Imed Riadh Farah

Abstract:

Hyperspectral imagery (HSI) typically provides a wealth of information captured in a wide range of the electromagnetic spectrum for each pixel in the image. Hence, a pixel in HSI is a high-dimensional vector of intensities with a large spectral range and a high spectral resolution. Therefore, the semantic interpretation is a challenging task of HSI analysis. We focused in this paper on object classification as HSI semantic interpretation. However, HSI classification still faces some issues, among which are the following: The spatial variability of spectral signatures, the high number of spectral bands, and the high cost of true sample labeling. Therefore, the high number of spectral bands and the low number of training samples pose the problem of the curse of dimensionality. In order to resolve this problem, we propose to introduce the process of dimensionality reduction trying to improve the classification of HSI. The presented approach is a semi-supervised band selection method based on spatial hypergraph embedding model to represent higher order relationships with different weights of the spatial neighbors corresponding to the centroid of pixel. This semi-supervised band selection has been developed to select useful bands for object classification. The presented approach is evaluated on AVIRIS and ROSIS HSIs and compared to other dimensionality reduction methods. The experimental results demonstrate the efficacy of our approach compared to many existing dimensionality reduction methods for HSI classification.

Keywords: dimensionality reduction, hyperspectral image, semantic interpretation, spatial hypergraph

Procedia PDF Downloads 287
27001 Pali-Sanskrit Terms and Their Uses in Reflecting Political Society of Thailand

Authors: Kowit Pimpuang

Abstract:

Through analysis of Pali-Sanskrit (PL-SKT) terms and their uses in reflecting the political society of Thailand, the objectives of this study were to explore PL-SKT word formation and the semantic changes employed in the political society of Thailand, and to explore the political reflection of Thai society through their uses. The conceptual framework of this study consists of (1) the use of PL-SKT word formation, namely primary derivative (Kitaka), secondary derivative (Tathita), compound (Samasa), and prefix (Upasagga); (2) semantic changes, namely widening, narrowing, and transferring of meaning; and (3) the political reflection of Thai society. A qualitative method was employed in this study, and data were collected from Thai newspapers. It was found that the four kinds of word formation (primary derivative, secondary derivative, compound, and prefix, led by compound) were used in forming the new political terms concerned, through the three semantic changes of widening, narrowing, and transferring, in order to make their meaning clear. Furthermore, PL-SKT terms were employed in reflecting Thai politics shaped by democratic conflicts through bureaucracy, plutocracy, businessocracy, and juristocracy, respectively. Later, political business groups and their corruption problems emerged in the political society of Thailand.

Keywords: Pali, Sanskrit, reflection, politics, Thailand

Procedia PDF Downloads 253
27000 Graph-Based Semantical Extractive Text Analysis

Authors: Mina Samizadeh

Abstract:

In the past few decades, there has been an explosion in the amount of available data produced from various sources on different topics. The availability of this enormous volume of data requires us to adopt effective computational tools to explore it. This has led to intense and growing interest in the research community in developing computational methods for processing text data. One line of study focuses on condensing text so that we can reach a higher level of understanding in a shorter time. The two important tasks for doing this are keyword extraction and text summarization. In keyword extraction, we are interested in finding the most important words in a text, which familiarizes us with its general topic. In text summarization, we are interested in producing a short text that includes the important information in the document. The TextRank algorithm, an unsupervised learning method that extends PageRank (the algorithm underlying the Google search engine for searching and ranking pages), has shown its efficacy in large-scale text mining, especially for text summarization and keyword extraction. This algorithm can automatically extract the important parts of a text (keywords or sentences) and return them as the result. However, it neglects the semantic similarity between the different parts. In this work, we improve the results of the TextRank algorithm by incorporating the semantic similarity between parts of the text. Aside from keyword extraction and text summarization, we develop a topic clustering algorithm based on our framework, which can be used on its own or as part of generating the summary to overcome coverage problems.
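
The ranking idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it runs a PageRank-style iteration over a sentence graph whose edge weights come from a pluggable `similarity` function. The Jaccard word overlap used here is only a stand-in assumption for a real semantic similarity measure, and the function names, damping factor and iteration count are illustrative choices.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def similarity(a, b):
    """Jaccard word overlap; a placeholder for a semantic similarity measure."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank over the similarity graph."""
    n = len(sentences)
    # Symmetric weight matrix with no self-loops.
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional
            # to the weight of the edge j -> i.
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


sentences = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the mouse hid under the mat",
    "stock prices fell sharply today",
]
scores = textrank(sentences)
```

In this toy input, the last sentence shares no words with the others, so it receives only the baseline score and ranks lowest. Swapping `similarity` for an embedding-based measure would be one way to bring semantic similarity into the graph.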

Keywords: keyword extraction, n-gram extraction, text summarization, topic clustering, semantic analysis

Procedia PDF Downloads 46
26999 Exploring Syntactic and Semantic Features for Text-Based Authorship Attribution

Authors: Haiyan Wu, Ying Liu, Shaoyun Shi

Abstract:

Authorship attribution is to extract features to identify authors of anonymous documents. Many previous works on authorship attribution focus on statistical style features (e.g., sentence/word length), content features (e.g., frequent words, n-grams). Modeling these features by regression or some transparent machine learning methods gives a portrait of the authors' writing style. But these methods do not capture the syntactic (e.g., dependency relationship) or semantic (e.g., topics) information. In recent years, some researchers model syntactic trees or latent semantic information by neural networks. However, few works take them together. Besides, predictions by neural networks are difficult to explain, which is vital in authorship attribution tasks. In this paper, we not only utilize the statistical style and content features but also take advantage of both syntactic and semantic features. Different from an end-to-end neural model, feature selection and prediction are two steps in our method. An attentive n-gram network is utilized to select useful features, and logistic regression is applied to give prediction and understandable representation of writing style. Experiments show that our extracted features can improve the state-of-the-art methods on three benchmark datasets.

Keywords: authorship attribution, attention mechanism, syntactic feature, feature extraction

Procedia PDF Downloads 106
26998 Cerrado and Vereda: A Survey of Portuguese Lexicon for Brazilian Biomes

Authors: Daniel Marra

Abstract:

This paper analyses, from a semantic-diachronic viewpoint, the changes of meaning that two lexical items of Brazilian Portuguese have gone through. Cerrado and Vereda currently designate the second largest Brazilian biome and one of its most important subsystems. Nevertheless, these two words have long individual histories that can be traced back to their Latin etymons. Therefore, the purpose of this work is to highlight the process by which meaning instantiated itself in the formation of these words and to discuss how semantic change subsequently took hold in them. As this paper shows, the aforementioned words were created in different past synchronies and have undergone changes of meaning by metaphor and metonymy. Besides, it is argued here that semantic change takes place due to external causes, such as generalization and specialization of meaning. It happens when a specialized use of a lexical item, restricted to a particular linguistic group, is adopted by other groups and has its meaning generalized by them. In these processes, the etymological idea of the word is generally lost, and the word gains, in the new group, a less specific meaning in relation to its etymology, sometimes with no relation to the original idea. As a final point, it is claimed that both the creation of a lexical item and its change of meaning involve pragmatic goals, such as the need language users have to express a new meaning related to a certain reality in the empirical world.

Keywords: Brazilian biomes, metaphor and metonymy, Portuguese lexicon, semantic change

Procedia PDF Downloads 97
26997 An AI-generated Semantic Communication Platform in HCI Course

Authors: Yi Yang, Jiasong Sun

Abstract:

Almost every aspect of our daily lives is now intertwined with some degree of human-computer interaction (HCI). HCI courses draw on knowledge from disciplines as diverse as computer science, psychology, design principles, anthropology, and more. Our HCI course, named the Media and Cognition course, is constantly updated to reflect state-of-the-art technological advancements such as virtual reality, augmented reality, and artificial intelligence-based interactions. For more than a decade, our course has used an interest-based approach to teaching, in which students proactively propose research-based questions and collaborate with teachers, using course knowledge to explore potential solutions. Semantic communication plays a key role in facilitating understanding and interaction between users and computer systems, ultimately enhancing system usability and user experience. The advancements in AI-generated technology, which have gained significant attention from both academia and industry in recent years, are exemplified by language models like GPT-3 that generate human-like dialogues from given prompts. The latest version of our Human-Computer Interaction course puts into practice a semantic communication platform based on AI-generated techniques. The purpose of this semantic communication is twofold: to extract and transmit task-specific information while ensuring efficient end-to-end communication with minimal latency. The AI-generated semantic communication platform evaluates the retainability of signal sources and converts low-retainability visual signals into textual prompts. These data are transmitted through AI-generated techniques and reconstructed at the receiving end; visual signals with a high retainability rate, on the other hand, are compressed and transmitted according to their respective regions. The platform and associated research are a testament to our students' growing ability to independently investigate state-of-the-art technologies.

Keywords: human-computer interaction, media and cognition course, semantic communication, retainability, prompts

Procedia PDF Downloads 79
26996 Understanding the Interactive Nature in Auditory Recognition of Phonological/Grammatical/Semantic Errors at the Sentence Level: An Investigation Based upon Japanese EFL Learners’ Self-Evaluation and Actual Language Performance

Authors: Hirokatsu Kawashima

Abstract:

One important element of teaching/learning listening is intensive listening such as listening for precise sounds, words, grammatical, and semantic units. Several classroom-based investigations have been conducted to explore the usefulness of auditory recognition of phonological, grammatical and semantic errors in such a context. The current study reports the results of one such investigation, which targeted auditory recognition of phonological, grammatical, and semantic errors at the sentence level. 56 Japanese EFL learners participated in this investigation, in which their recognition performance of phonological, grammatical and semantic errors was measured on a 9-point scale by learners’ self-evaluation from the perspective of 1) two types of similar English sound (vowel and consonant minimal pair words), 2) two types of sentence word order (verb phrase-based and noun phrase-based word orders), and 3) two types of semantic consistency (verb-purpose and verb-place agreements), respectively, and their general listening proficiency was examined using standardized tests. A number of findings have been made about the interactive relationships between the three types of auditory error recognition and general listening proficiency. Analyses based on the OPLS (Orthogonal Projections to Latent Structure) regression model have disclosed, for example, that the three types of auditory error recognition are linked in a non-linear way: the highest explanatory power for general listening proficiency may be attained when quadratic interactions between auditory recognition of errors related to vowel minimal pair words and that of errors related to noun phrase-based word order are embraced (R2=.33, p=.01).

Keywords: auditory error recognition, intensive listening, interaction, investigation

Procedia PDF Downloads 489
26995 Capturing Public Voices: The Role of Social Media in Heritage Management

Authors: Mahda Foroughi, Bruno de Anderade, Ana Pereira Roders

Abstract:

Social media platforms have been increasingly used by locals and tourists to express their opinions about buildings, cities, and built heritage in particular. Most recently, scholars have been using social media to conduct innovative research on built heritage and heritage management. Still, the application of artificial intelligence (AI) methods to analyze social media data for heritage management is seldom explored. This paper investigates the potential of short texts (sentences and hashtags) shared through social media as a data source, and of artificial intelligence methods for data analysis, for revealing the cultural significance (values and attributes) of built heritage. The city of Yazd, Iran, was taken as a case study, with a particular focus on windcatchers, key attributes conveying outstanding universal values, as inscribed on the UNESCO World Heritage List. This paper has three successive phases: 1) the state of the art at the intersection of public participation in heritage management and social media research; 2) the methodology of data collection and data analysis for coding people's voices from Instagram and Twitter into values of windcatchers over the last ten years; 3) preliminary findings on the comparison between the opinions of locals and tourists, sentiment analysis, and its association with the values and attributes of windcatchers. Results indicate that the age value is recognized as the most important value by all interest groups, while the political value is the least acknowledged. Besides, negative sentiments (e.g., critiques) are scarcely reflected in social media. Results confirm the potential of social media for heritage management in terms of (de)coding and measuring the cultural significance of built heritage for windcatchers in Yazd. The methodology developed in this paper can be applied to other attributes in Yazd and also to other case studies.

Keywords: social media, artificial intelligence, public participation, cultural significance, heritage, sentiment analysis

Procedia PDF Downloads 88
26994 Multimodal Discourse, Logic of the Analysis of Transmedia Strategies

Authors: Bianca Suárez Puerta

Abstract:

Multimodal discourse refers to a method of studying the media continuum between reality, screens as a device, audience, author, and media as a production from the audience. For this study we used the semantic differential, a method proposed in the sixties by Osgood, Suci and Tannenbaum, which starts from the assumption that under each particular way of perceiving the world, in each singular idea, there is a common cultural meaning that organizes experiences. In relation to this shared symbolic dimension, the method has had significant results, as it focuses on breaking down the meaning of certain significant acts into series of statements that place the subjects in front of some concepts. In Colombia, in 2016, a tool was designed to measure the meaning of a multimodal production, especially the acts of sense of transmedia productions that managed to receive funds from the Ministry of ICT of Colombia, and also to analyze predictable patterns that can be found in calls and funds aimed at the production of culture in Colombia, in the context of the peace agreement, as a request for expressions from a hegemonic place, seeking to impose a worldview.

Keywords: semantic differential, semiotics, transmedia, critical analysis of discourse

Procedia PDF Downloads 184
26993 Aspects of Semantics of Standard British English and Nigerian English: A Contrastive Study

Authors: Chris Adetuyi, Adeola Adeniran

Abstract:

The concept of meaning is a complex one in language study when cultural features are added. This is unavoidable because language cannot be completely separated from culture; language and culture complement each other. When there are two varieties of a language in a society, i.e. two varieties functioning side by side in a speech community, there is a tendency to view one of the varieties against the other. There is, therefore, a need to make a comparative linguistic study of the varieties of such languages. In this paper, a semantic contrastive study is made between Standard British English (SBE) and Nigerian English (NE). The semantic study is limited to the following aspects of semantics: semantic extension (kinship terms, metaphors); semantic shift (the lexical items considered are ‘drop’, ‘befriend’, ‘dowry’ and ‘escort’); acronyms (NEPA, JAMB, NTA); linguistic borrowing or loan words (Seriki, Agbada, Eba, Dodo, Iroko); and coinages (long leg, bush meat, bottom power and juju). In the study of these aspects of the semantics of SBE and NE lexical terms, conservative statements are made, and problem areas and a hierarchy of difficulties are highlighted with a view to bringing out areas of difference. The study will also serve as a guide for further contrastive studies in other areas of language.

Keywords: aspect, British, English, Nigeria, semantics

Procedia PDF Downloads 322
26992 A Semantic Registry to Support Brazilian Aeronautical Web Services Operations

Authors: Luís Antonio de Almeida Rodriguez, José Maria Parente de Oliveira, Ednelson Oliveira

Abstract:

In the last two decades, the world’s aviation authorities have made several attempts to create consensus about a global and accepted approach for applying semantics to web services registry descriptions. This problem has led communities to face a fat and disorganized infrastructure to describe aeronautical web services. It is usual for developers to implement ad-hoc connections among consumers and providers and manually create non-standardized service compositions, which need some particular approach to compose and semantically discover a desired web service. Current practices are not precise and tend to focus on lightweight specifications of some parts of the OWL-S and embed them into syntactic descriptions (SOAP artifacts and OWL language). It is necessary to have the ability to manage the use of both technologies. This paper presents an implementation of the ontology OWL-S that describes a Brazilian Aeronautical Web Service Registry, which makes it able to publish, advertise, make multi-criteria semantic discovery aligned with the ideas of the System Wide Information Management (SWIM) Program, and invoke web services within the Air Traffic Management context. The proposal’s best finding is a generic approach to describe semantic web services. The paper also presents a set of functional requirements to guide the ontology development and to compare them to the results to validate the implementation of the OWL-S Ontology.

Keywords: aeronautical web services, OWL-S, semantic web services discovery, ontologies

Procedia PDF Downloads 61
26991 Semantic Based Analysis in Complaint Management System with Analytics

Authors: Francis Alterado, Jennifer Enriquez

Abstract:

Semantic Based Analysis in Complaint Management System with Analytics is an enhanced tool for clients to submit complaints, as well as a mechanism for Palawan Polytechnic College to gather, process, and monitor the status of these complaints. The study includes a mobile application that serves as a remote channel of communication between the students and the school management on the issues encountered by students and the resolution of every complaint received. In processing the complaints, text mining and clustering algorithms were utilized. Every module of the system was tested and, based on the results, was 100% free from error before integration was done. System testing was also carried out by checking the expected functionality of the system, which was 100% functional. The system was tested by 10 students, who forwarded complaints to 10 departments. Based on the results, the students were able to submit complaints, the system was able to process them by identifying the department for which each complaint was intended, and the department concerned was able to give the student feedback on the complaint received. The system received a rating of 4.7, which corresponds to Excellent.

Keywords: technology adoption, emerging technology, issues challenges, algorithm, text mining, mobile technology

Procedia PDF Downloads 174
26990 Volatility Index, Fear Sentiment and Cross-Section of Stock Returns: Indian Evidence

Authors: Pratap Chandra Pati, Prabina Rajib, Parama Barai

Abstract:

Traditional finance theory neglects the role of the sentiment factor in asset pricing. However, the behavioral approach to asset pricing, based on the noise trader model and limits to arbitrage, includes investor sentiment as a priced risk factor in the asset pricing model. Investor sentiment has a stronger effect on stocks that are vulnerable to speculation, hard to value and risky to arbitrage. These include small stocks, high volatility stocks, growth stocks, distressed stocks, young stocks and non-dividend-paying stocks. Since the introduction of the Chicago Board Options Exchange (CBOE) volatility index (VIX) in 1993, it has been used as a measure of future volatility in the stock market and also as a measure of investor sentiment. The CBOE VIX index, in particular, is often referred to as the ‘investors’ fear gauge’ by public media and prior literature. Upward spikes in the volatility index are associated with bouts of market turmoil and uncertainty. High levels of the volatility index indicate fear, anxiety and pessimistic expectations of investors about the stock market. On the contrary, low levels of the volatility index reflect a confident and optimistic attitude among investors. Based on the above discussion, we investigate whether market-wide fear levels, measured by the volatility index, are a priced factor in the standard asset pricing model for the Indian stock market. First, we investigate the performance and validity of the Fama and French three-factor model and the Carhart four-factor model in the Indian stock market. Second, we explore whether the India volatility index, as a proxy for fearful market-based sentiment, affects the cross-section of stock returns after controlling for well-established risk factors such as market excess return, size, book-to-market, and momentum. Asset pricing tests are performed using monthly data on CNX 500 index constituent stocks listed on the National Stock Exchange of India Limited (NSE) over the sample period that extends from January 2008 to March 2017.
To examine whether the India volatility index (India VIX, IVIX), as an indicator of fear sentiment, is a priced risk factor, the change in India VIX is included as an explanatory variable in the Fama-French three-factor model as well as the Carhart four-factor model. For the empirical testing, we use three different sets of test portfolios as the dependent variable in the asset pricing regressions. The first portfolio set is the 4x4 sort on size and B/M ratio. The second portfolio set is the 4x4 sort on size and the sensitivity beta of the change in IVIX. The third portfolio set is the 2x3x2 independent triple sort on size, B/M and the sensitivity beta of the change in IVIX. We find evidence that size, value and momentum factors continue to exist in the Indian stock market. However, the VIX index does not constitute a priced risk factor in the cross-section of returns. The inseparability of volatility and jump risk in the VIX is a possible explanation of the findings of this study.
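
The augmented regression described above can be written out as follows. The abstract does not give the equation, so this is a sketch in assumed notation: $R_{p,t}$ is the return on test portfolio $p$ in month $t$, $R_{f,t}$ the risk-free rate, $R_{m,t}$ the market return, $SMB_t$, $HML_t$ and $WML_t$ the size, value and momentum factors, and $\Delta IVIX_t$ the change in India VIX.

```latex
R_{p,t} - R_{f,t} = \alpha_p
  + \beta_p \,(R_{m,t} - R_{f,t})
  + s_p \, SMB_t
  + h_p \, HML_t
  + w_p \, WML_t
  + \delta_p \, \Delta IVIX_t
  + \varepsilon_{p,t}
```

Dropping the $w_p\,WML_t$ term gives the augmented three-factor version. Under this reading, the question of whether fear sentiment is priced comes down to whether the exposure $\delta_p$ carries a significant risk premium across the test portfolios.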

Keywords: India VIX, Fama-French model, Carhart four-factor model, asset pricing

Procedia PDF Downloads 227
26989 Evaluation and Compression of Different Language Transformer Models for Semantic Textual Similarity Binary Task Using Minority Language Resources

Authors: Ma. Gracia Corazon Cayanan, Kai Yuen Cheong, Li Sha

Abstract:

Training a language model for a minority language has been a challenging task. The lack of available corpora to train and fine-tune state-of-the-art language models is still a challenge in the area of Natural Language Processing (NLP). Moreover, the need for high computational resources and bulk data limits the attainment of this task. In this paper, we present the following contributions: (1) we introduced and used a translation pair set of Tagalog and English (TL-EN) in pre-training a language model on a minority language resource; (2) we fine-tuned and evaluated top-ranking pre-trained semantic textual similarity binary task (STSB) models on both TL-EN and STS dataset pairs; (3) we then reduced the size of the model to offset the need for high computational resources. Based on our results, the models that were pre-trained on translation pairs and STS pairs can perform well on the STSB task. Also, reducing the model to a smaller dimension has no negative effect on performance but rather yields a notable increase in the similarity scores. Moreover, pre-training on a similar dataset has a tremendous effect on the model’s performance scores.

Keywords: semantic matching, semantic textual similarity binary task, low resource minority language, fine-tuning, dimension reduction, transformer models

Procedia PDF Downloads 180
26988 Emotion Mining and Attribute Selection for Actionable Recommendations to Improve Customer Satisfaction

Authors: Jaishree Ranganathan, Poonam Rajurkar, Angelina A. Tzacheva, Zbigniew W. Ras

Abstract:

In today’s world, business often depends on customer feedback and reviews. Sentiment analysis helps identify and extract information about the sentiment or emotion of a topic or document. Attribute selection is a challenging problem, especially with large datasets in actionable pattern mining algorithms. Action Rule Mining is one of the methods to discover actionable patterns from data. Action Rules are rules that help describe specific actions to be taken, in the form of conditions that help achieve the desired outcome. The rules help to change from an undesirable or negative state to a more desirable or positive state. In this paper, we present a lexicon-based weighted scheme approach to identify emotions from customer feedback data in the area of manufacturing business. Also, we use rough sets and explore the attribute selection method for large-scale datasets. Then we apply actionable pattern mining to extract possible emotion change recommendations. Recommendations of this kind help business analysts improve their customer service, which leads to customer satisfaction and increased sales revenue.
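
A lexicon-based weighted scheme of the kind mentioned above can be sketched as follows. This is an illustration only, not the authors' method: the `LEXICON` entries, their weights and the function names are hypothetical, and a real system would use a curated emotion lexicon and handle negation and intensifiers.

```python
import re
from collections import defaultdict

# Hypothetical emotion lexicon: word -> (emotion, weight). Illustrative only.
LEXICON = {
    "delighted": ("joy", 1.0),
    "happy": ("joy", 0.8),
    "late": ("anger", 0.5),
    "broken": ("anger", 0.9),
    "worried": ("fear", 0.7),
}


def emotion_scores(text):
    """Sum the lexicon weights of each emotion over the words in the text."""
    scores = defaultdict(float)
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in LEXICON:
            emotion, weight = LEXICON[word]
            scores[emotion] += weight
    return dict(scores)


def dominant_emotion(text):
    """Return the highest-scoring emotion, or 'neutral' if no word matches."""
    scores = emotion_scores(text)
    if not scores:
        return "neutral"
    return max(scores, key=scores.get)


label = dominant_emotion("The part arrived late and broken")
```

Here both matched words map to the same emotion, so their weights add up and that emotion becomes the label. The resulting label could then serve as the decision attribute that action rules try to move from a negative to a positive state.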

Keywords: actionable pattern discovery, attribute selection, business data, data mining, emotion

Procedia PDF Downloads 172