Search results for: urdu sentiment analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 27890

Search results for: urdu sentiment analysis

27830 Bidirectional Encoder Representations from Transformers Sentiment Analysis Applied to Three Presidential Pre-Candidates in Costa Rica

Authors: Félix David Suárez Bonilla

Abstract:

A sentiment analysis service to detect polarity (positive, neural, and negative), based on transfer learning, was built using a Spanish version of BERT and applied to tweets written in Spanish. The dataset that was used consisted of 11975 reviews, which were extracted from Google Play using the google-play-scrapper package. The BETO trained model used: the AdamW optimizer, a batch size of 16, a learning rate of 2x10⁻⁵ and 10 epochs. The system was tested using tweets of three presidential pre-candidates from Costa Rica. The system was finally validated using human labeled examples, achieving an accuracy of 83.3%.

Keywords: NLP, transfer learning, BERT, sentiment analysis, social media, opinion mining

Procedia PDF Downloads 174
27829 A Newspapers Expectations Indicator from Web Scraping

Authors: Pilar Rey del Castillo

Abstract:

This document describes the building of an average indicator of the general sentiments about the future exposed in the newspapers in Spain. The raw data are collected through the scraping of the Digital Periodical and Newspaper Library website. Basic tools of natural language processing are later applied to the collected information to evaluate the sentiment strength of each word in the texts using a polarized dictionary. The last step consists of summarizing these sentiments to produce daily indices. The results are a first insight into the applicability of these techniques to produce periodic sentiment indicators.

Keywords: natural language processing, periodic indicator, sentiment analysis, web scraping

Procedia PDF Downloads 132
27828 Sentiment Classification of Documents

Authors: Swarnadip Ghosh

Abstract:

Sentiment Analysis is the process of detecting the contextual polarity of text. In other words, it determines whether a piece of writing is positive, negative or neutral.Sentiment analysis of documents holds great importance in today's world, when numerous information is stored in databases and in the world wide web. An efficient algorithm to illicit such information, would be beneficial for social, economic as well as medical purposes. In this project, we have developed an algorithm to classify a document into positive or negative. Using our algorithm, we obtained a feature set from the data, and classified the documents based on this feature set. It is important to note that, in the classification, we have not used the independence assumption, which is considered by many procedures like the Naive Bayes. This makes the algorithm more general in scope. Moreover, because of the sparsity and high dimensionality of such data, we did not use empirical distribution for estimation, but developed a method by finding degree of close clustering of the data points. We have applied our algorithm on a movie review data set obtained from IMDb and obtained satisfactory results.

Keywords: sentiment, Run's Test, cross validation, higher dimensional pmf estimation

Procedia PDF Downloads 402
27827 Kitchenary Metaphors In Hindi-urdu: A Cognitive Analysis

Authors: Bairam Khan, Premlata Vaishnava

Abstract:

The ability to conceptualize one entity in terms of another allows us to communicate through metaphors. This central feature of human cognition has evolved with the development of language, and the processing of metaphors is without any conscious appraisal and is quite effortless. South Asians, like other speech communities, have been using the kitchenary [culinary] metaphor in a very simple yet interesting way and are known for bringing into new and unique constellations wherever they are. This composite feature of our language is used to communicate in a precise and compact manner and maneuvers the expression. The present study explores the role of kitchenary metaphors in the making and shaping of idioms by applying Cognitive Metaphor Theories. Drawing on examples from a corpus of adverts, print, and electronic media, the study looks at the metaphorical language used by real people in real situations. The overarching theme throughout the course is that kitchenary metaphors are powerful tools of expression in Hindi-Urdu.

Keywords: cognitive metaphor theory, source domain, target domain, signifier- signified, kitchenary, ethnocultural elements of south asia and hindi- urdu language

Procedia PDF Downloads 77
27826 Forecast Dispersion, Investor Sentiment and the Cross Section of Stock Returns

Authors: Guoyu Lin

Abstract:

This paper explores the role investor sentiment plays in the relationship between analyst forecast dispersion and stock returns. With short sale constraints, stock prices are determined by the optimistic investors. During the high sentiment periods when investors suffer more from psychological bias, there are more optimistic investors. This is the first paper to document that following the high sentiment periods, stocks with the most analyst forecast dispersion are overpriced, earning significantly negative returns, while those with the least analyst forecast dispersion are not overpriced as the degree of belief dispersion is low. However, following the low sentiment periods, both are not overpriced. A portfolio which longs the least dispersed stocks and shorts the most dispersed stocks yields significantly positive returns only following the high sentiment periods. My findings can potentially reconcile the puzzling risk effect and mispricing effect in the literature. The risk (mispricing) effect suggests a positive (negative) relation between analyst forecast dispersion and future stock returns. Presumably, the magnitude of the mispricing effect depends on the proportion of irrational investors and their bias, which is positively related to investor sentiment. During the high sentiment period, the mispricing effect takes over and the overall effect is negative. During the low sentiment period, the percentage of irrational investors is mediate, and the mispricing effect and the risk effect counter each other, leading to insignificant relation.

Keywords: analyst forecast dispersion, short-sale constraints, investor sentiment, stock returns

Procedia PDF Downloads 142
27825 Artificial Intelligence Assisted Sentiment Analysis of Hotel Reviews Using Topic Modeling

Authors: Sushma Ghogale

Abstract:

With a surge in user-generated content or feedback or reviews on the internet, it has become possible and important to know consumers' opinions about products and services. This data is important for both potential customers and businesses providing the services. Data from social media is attracting significant attention and has become the most prominent channel of expressing an unregulated opinion. Prospective customers look for reviews from experienced customers before deciding to buy a product or service. Several websites provide a platform for users to post their feedback for the provider and potential customers. However, the biggest challenge in analyzing such data is in extracting latent features and providing term-level analysis of the data. This paper proposes an approach to use topic modeling to classify the reviews into topics and conduct sentiment analysis to mine the opinions. This approach can analyse and classify latent topics mentioned by reviewers on business sites or review sites, or social media using topic modeling to identify the importance of each topic. It is followed by sentiment analysis to assess the satisfaction level of each topic. This approach provides a classification of hotel reviews using multiple machine learning techniques and comparing different classifiers to mine the opinions of user reviews through sentiment analysis. This experiment concludes that Multinomial Naïve Bayes classifier produces higher accuracy than other classifiers.

Keywords: latent Dirichlet allocation, topic modeling, text classification, sentiment analysis

Procedia PDF Downloads 97
27824 Multi-Level Attentional Network for Aspect-Based Sentiment Analysis

Authors: Xinyuan Liu, Xiaojun Jing, Yuan He, Junsheng Mu

Abstract:

Aspect-based Sentiment Analysis (ABSA) has attracted much attention due to its capacity to determine the sentiment polarity of the certain aspect in a sentence. In previous works, great significance of the interaction between aspect and sentence has been exhibited in ABSA. In consequence, a Multi-Level Attentional Networks (MLAN) is proposed. MLAN consists of four parts: Embedding Layer, Encoding Layer, Multi-Level Attentional (MLA) Layers and Final Prediction Layer. Among these parts, MLA Layers including Aspect Level Attentional (ALA) Layer and Interactive Attentional (ILA) Layer is the innovation of MLAN, whose function is to focus on the important information and obtain multiple levels’ attentional weighted representation of aspect and sentence. In the experiments, MLAN is compared with classical TD-LSTM, MemNet, RAM, ATAE-LSTM, IAN, AOA, LCR-Rot and AEN-GloVe on SemEval 2014 Dataset. The experimental results show that MLAN outperforms those state-of-the-art models greatly. And in case study, the works of ALA Layer and ILA Layer have been proven to be effective and interpretable.

Keywords: deep learning, aspect-based sentiment analysis, attention, natural language processing

Procedia PDF Downloads 138
27823 A Case Study of Ontology-Based Sentiment Analysis for Fan Pages

Authors: C. -L. Huang, J. -H. Ho

Abstract:

Social media has become more and more important in our life. Many enterprises promote their services and products to fans via the social media. The positive or negative sentiment of feedbacks from fans is very important for enterprises to improve their products, services, and promotion activities. The purpose of this paper is to understand the sentiment of the fan’s responses by analyzing the responses posted by fans on Facebook. The entity and aspect of fan’s responses were analyzed based on a predefined ontology. The ontology for cell phone sentiment analysis consists of aspect categories on the top level as follows: overall, shape, hardware, brand, price, and service. Each category consists of several sub-categories. All aspects for a fan’s response were found based on the ontology, and their corresponding sentimental terms were found using lexicon-based approach. The sentimental scores for aspects of fan responses were obtained by summarizing the sentimental terms in responses. The frequency of 'like' was also weighted in the sentimental score calculation. Three famous cell phone fan pages on Facebook were selected as demonstration cases to evaluate performances of the proposed methodology. Human judgment by several domain experts was also built for performance comparison. The performances of proposed approach were as good as those of human judgment on precision, recall and F1-measure.

Keywords: opinion mining, ontology, sentiment analysis, text mining

Procedia PDF Downloads 232
27822 Investigating the Effectiveness of Multilingual NLP Models for Sentiment Analysis

Authors: Othmane Touri, Sanaa El Filali, El Habib Benlahmar

Abstract:

Natural Language Processing (NLP) has gained significant attention lately. It has proved its ability to analyze and extract insights from unstructured text data in various languages. It is found that one of the most popular NLP applications is sentiment analysis which aims to identify the sentiment expressed in a piece of text, such as positive, negative, or neutral, in multiple languages. While there are several multilingual NLP models available for sentiment analysis, there is a need to investigate their effectiveness in different contexts and applications. In this study, we aim to investigate the effectiveness of different multilingual NLP models for sentiment analysis on a dataset of online product reviews in multiple languages. The performance of several NLP models, including Google Cloud Natural Language API, Microsoft Azure Cognitive Services, Amazon Comprehend, Stanford CoreNLP, spaCy, and Hugging Face Transformers are being compared. The models based on several metrics, including accuracy, precision, recall, and F1 score, are being evaluated and compared to their performance across different categories of product reviews. In order to run the study, preprocessing of the dataset has been performed by cleaning and tokenizing the text data in multiple languages. Then training and testing each model has been applied using a cross-validation approach where randomly dividing the dataset into training and testing sets and repeating the process multiple times has been used. A grid search approach to optimize the hyperparameters of each model and select the best-performing model for each category of product reviews and language has been applied. The findings of this study provide insights into the effectiveness of different multilingual NLP models for Multilingual Sentiment Analysis and their suitability for different languages and applications. The strengths and limitations of each model were identified, and recommendations for selecting the most performant model based on the specific requirements of a project were provided. This study contributes to the advancement of research methods in multilingual NLP and provides a practical guide for researchers and practitioners in the field.

Keywords: NLP, multilingual, sentiment analysis, texts

Procedia PDF Downloads 102
27821 Gender Difference in the Use of Request Strategies by Urdu/Punjabi Native Speakers

Authors: Muzaffar Hussain

Abstract:

Requests strategies are considered as a part of the speech acts, which are frequently used in everyday communication. Each language provides speech acts to the speakers; therefore, the selection of appropriate form seems more culture-specific rather than language. The present paper investigates the gender-based difference in the use of request strategies by native speakers of Urdu/Punjabi male and female who are learning English as a second language. The data for the present study were collected from 68 graduate students, who are learning English as an L2 in Pakistan. They were given an online close-ended questionnaire, based on Discourse Completion Test (DCT). After analyzing the data, it was found that the L1 male Urdu/Punjabi speakers were inclined to use more direct request strategies while the female Urdu/Punjabi speakers used indirect request strategies. This paper also found that in some situations female participants used more direct strategies than male participants. The present study concludes that the use of request strategies is influenced by culture, social status, and power distribution in a society.

Keywords: gender variation, request strategies, face-threatening, second language pragmatics, language competence

Procedia PDF Downloads 189
27820 Investigating Translations of Websites of Pakistani Public Offices

Authors: Sufia Maroof

Abstract:

This empirical study investigated the web-translations of five Pakistani public offices (FPSC, FIA, HEC, USB, and Ministry of Finance) offering Urdu tab as an option to access information on their official websites. Triangulation of quantitative and qualitative research design informed the researcher of the semantic, lexical and syntactic caveats in these translations. The study hypothesized that majority of the Pakistani population is oblivious of the Supreme Court’s amendments in language policy concerning national and official language; hence, Urdu web-translations of the public departments have not been accessed effectively. Firstly, the researcher conducted an online survey, comprising of two sections, close ended and short answer based questions. Secondly, the researcher compiled corpus of the five selected websites in a tabular form to compare the data. Thirdly, the administrators of the departments had been contacted regarding the methods of translation and the expertise of the personnel involved. The corpus was assessed for TQA after examining the lexical, semantic, syntactical and technical alignment inaccuracies and imperfections. The study suggests the public offices to invest in their Urdu webs by either hiring expert translators or engaging expertise of a translation agency for this project to offer quality translation to public.

Keywords: machine translations, public offices, Urdu translations, websites

Procedia PDF Downloads 126
27819 StockTwits Sentiment Analysis on Stock Price Prediction

Authors: Min Chen, Rubi Gupta

Abstract:

Understanding and predicting stock market movements is a challenging problem. It is believed stock markets are partially driven by public sentiments, which leads to numerous research efforts to predict stock market trend using public sentiments expressed on social media such as Twitter but with limited success. Recently a microblogging website StockTwits is becoming increasingly popular for users to share their discussions and sentiments about stocks and financial market. In this project, we analyze the text content of StockTwits tweets and extract financial sentiment using text featurization and machine learning algorithms. StockTwits tweets are first pre-processed using techniques including stopword removal, special character removal, and case normalization to remove noise. Features are extracted from these preprocessed tweets through text featurization process using bags of words, N-gram models, TF-IDF (term frequency-inverse document frequency), and latent semantic analysis. Machine learning models are then trained to classify the tweets' sentiment as positive (bullish) or negative (bearish). The correlation between the aggregated daily sentiment and daily stock price movement is then investigated using Pearson’s correlation coefficient. Finally, the sentiment information is applied together with time series stock data to predict stock price movement. The experiments on five companies (Apple, Amazon, General Electric, Microsoft, and Target) in a duration of nine months demonstrate the effectiveness of our study in improving the prediction accuracy.

Keywords: machine learning, sentiment analysis, stock price prediction, tweet processing

Procedia PDF Downloads 156
27818 Preparation on Sentimental Analysis on Social Media Comments with Bidirectional Long Short-Term Memory Gated Recurrent Unit and Model Glove in Portuguese

Authors: Leonardo Alfredo Mendoza, Cristian Munoz, Marco Aurelio Pacheco, Manoela Kohler, Evelyn Batista, Rodrigo Moura

Abstract:

Natural Language Processing (NLP) techniques are increasingly more powerful to be able to interpret the feelings and reactions of a person to a product or service. Sentiment analysis has become a fundamental tool for this interpretation but has few applications in languages other than English. This paper presents a classification of sentiment analysis in Portuguese with a base of comments from social networks in Portuguese. A word embedding's representation was used with a 50-Dimension GloVe pre-trained model, generated through a corpus completely in Portuguese. To generate this classification, the bidirectional long short-term memory and bidirectional Gated Recurrent Unit (GRU) models are used, reaching results of 99.1%.

Keywords: natural processing language, sentiment analysis, bidirectional long short-term memory, BI-LSTM, gated recurrent unit, GRU

Procedia PDF Downloads 159
27817 Overview and Future Opportunities of Sarcasm Detection on Social Media Communications

Authors: Samaneh Nadali, Masrah Azrifah Azmi Murad, Nurfadhlina Mohammad Sharef

Abstract:

Sarcasm is a common phenomenon in social media which is a nuanced form of language for stating the opposite of what is implied. Due to the intentional ambiguity, analysis of sarcasm is a difficult task not only for a machine but even for a human. Although sarcasm detection has an important effect on sentiment, it is usually ignored in social media analysis because sarcasm analysis is too complicated. While there is a few systems exist which can detect sarcasm, almost no work has been carried out on a study and the review of the existing work in this area. This survey presents a nearly full image of sarcasm detection techniques and the related fields with brief details. The main contributions of this paper include the illustration of the recent trend of research in the sarcasm analysis and we highlight the gaps and propose a new framework that can be explored.

Keywords: sarcasm detection, sentiment analysis, social media, sarcasm analysis

Procedia PDF Downloads 457
27816 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis

Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan

Abstract:

Tourism is a booming industry with huge future potential for global wealth and employment. There are countless data generated over social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of Big Data Technology into the tourism industry will allow companies to conclude where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centers or hotels, etc., and the tourist can get a clear idea of places before visiting. The technical perspective of natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We have constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The text form of sentences was first classified through Vader and Roberta model to get the polarity of the reviews. In this paper, we have conducted study methods for feature extraction, such as Count Vectorization and TFIDF Vectorization, and implemented Convolutional Neural Network (CNN) classifier algorithm for the sentiment analysis to decide the tourist’s attitude towards the destinations is positive, negative, or simply neutral based on the review text that they posted online. The results demonstrated that from the CNN algorithm, after pre-processing and cleaning the dataset, we received an accuracy of 96.12% for the positive and negative sentiment analysis.

Keywords: counter vectorization, convolutional neural network, crawler, data technology, long short-term memory, web scraping, sentiment analysis

Procedia PDF Downloads 88
27815 The Potential of Sentiment Analysis to Categorize Social Media Comments Using German Libraries

Authors: Felix Boehnisch, Alexander Lutz

Abstract:

Based on the number of users and the amount of content posted daily, Facebook is considered the largest social network in the world. This content includes images or text posts from companies but also private persons, which are also commented on by other users. However, it can sometimes be difficult for companies to keep track of all the posts and the reactions to them, especially when there are several posts a day that contain hundreds to thousands of comments. To facilitate this, the following paper deals with the possible applications of sentiment analysis to social media comments in order to be able to support the work in social media marketing. In a first step, post comments were divided into positive and negative by a subjective rating, then the same comments were checked for their polarity value by the two german python libraries TextBlobDE and SentiWS and also grouped into positive, negative, or even neutral. As a control, the subjective classifications were compared with the machine-generated ones by a confusion matrix, and relevant quality criteria were determined. The accuracy of both libraries was not really meaningful, with 60% to 66%. However, many words or sentences were not evaluated at all, so there seems to be room for optimization to possibly get more accurate results. In future studies, the use of these specific German libraries can be optimized to gain better insights by either applying them to stricter cleaned data or by adding a sentiment value to emojis, which have been removed from the comments in advance, as they are not contained in the libraries.

Keywords: Facebook, German libraries, polarity, sentiment analysis, social media comments

Procedia PDF Downloads 182
27814 Automatic Lead Qualification with Opinion Mining in Customer Relationship Management Projects

Authors: Victor Radich, Tania Basso, Regina Moraes

Abstract:

Lead qualification is one of the main procedures in Customer Relationship Management (CRM) projects. Its main goal is to identify potential consumers who have the ideal characteristics to establish a profitable and long-term relationship with a certain organization. Social networks can be an important source of data for identifying and qualifying leads since interest in specific products or services can be identified from the users’ expressed feelings of (dis)satisfaction. In this context, this work proposes the use of machine learning techniques and sentiment analysis as an extra step in the lead qualification process in order to improve it. In addition to machine learning models, sentiment analysis or opinion mining can be used to understand the evaluation that the user makes of a particular service, product, or brand. The results obtained so far have shown that it is possible to extract data from social networks and combine the techniques for a more complete classification.

Keywords: lead qualification, sentiment analysis, opinion mining, machine learning, CRM, lead scoring

Procedia PDF Downloads 85
27813 Tweets to Touchdowns: Predicting National Football League Achievement from Social Media Optimism

Authors: Rohan Erasala, Ian McCulloh

Abstract:

The NFL Draft is a chance for every NFL team to select their next superstar. As a result, teams heavily invest in scouting, and millions of fans partake in the online discourse surrounding the draft. This paper investigates the potential correlations between positive sentiment in individual draft selection threads from the subreddit r/NFL and if this data can be used to make successful player recommendations. It is hypothesized that there will be limited correlations and nonviable recommendations made from these threads. The hypothesis is tested using sentiment analysis of draft thread comments and analyzing correlation and precision at k of top scores. The results indicate weak correlations between the percentage of positive comments in a draft selection thread and a player’s approximate value, but potentially viable recommendations from looking at players whose draft selection threads have the highest percentage of positive comments.

Keywords: national football league, NFL, NFL Draft, sentiment analysis, Reddit, social media, NLP

Procedia PDF Downloads 85
27812 The Fefe Indices: The Direction of Donal Trump’s Tweets Effect on the Stock Market

Authors: Sergio Andres Rojas, Julian Benavides Franco, Juan Tomas Sayago

Abstract:

An increasing amount of research demonstrates how market mood affects financial markets, but their primary goal is to demonstrate how Trump's tweets impacted US interest rate volatility. Following that lead, this work evaluates the effect that Trump's tweets had during his presidency on local and international stock markets, considering not just volatility but the direction of the movement. Three indexes for Trump's tweets were created relating his activity with movements in the S&P500 using natural language analysis and machine learning algorithms. The indexes consider Trump's tweet activity and the positive or negative market sentiment they might inspire. The first explores the relationship between tweets generating negative movements in the S&P500; the second explores positive movements, while the third explores the difference between up and down movements. A pseudo-investment strategy using the indexes produced statistically significant above-average abnormal returns. The findings also showed that the pseudo strategy generated a higher return in the local market if applied to intraday data. However, only a negative market sentiment caused this effect on daily data. These results suggest that the market reacted primarily to a negative idea reflected in the negative index. In the international market, it is not possible to identify a pervasive effect. A rolling window regression model was also performed. The result shows that the impact on the local and international markets is heterogeneous, time-changing, and differentiated for the market sentiment. However, the negative sentiment was more prone to have a significant correlation most of the time.

Keywords: market sentiment, Twitter market sentiment, machine learning, natural dialect analysis

Procedia PDF Downloads 63
27811 Real Time Classification of Political Tendency of Twitter Spanish Users based on Sentiment Analysis

Authors: Marc Solé, Francesc Giné, Magda Valls, Nina Bijedic

Abstract:

What people say on social media has turned into a rich source of information to understand social behavior. Specifically, the growing use of Twitter social media for political communication has arisen high opportunities to know the opinion of large numbers of politically active individuals in real time and predict the global political tendencies of a specific country. It has led to an increasing body of research on this topic. The majority of these studies have been focused on polarized political contexts characterized by only two alternatives. Unlike them, this paper tackles the challenge of forecasting Spanish political trends, characterized by multiple political parties, by means of analyzing the Twitters Users political tendency. According to this, a new strategy, named Tweets Analysis Strategy (TAS), is proposed. This is based on analyzing the users tweets by means of discovering its sentiment (positive, negative or neutral) and classifying them according to the political party they support. From this individual political tendency, the global political prediction for each political party is calculated. In order to do this, two different strategies for analyzing the sentiment analysis are proposed: one is based on Positive and Negative words Matching (PNM) and the second one is based on a Neural Networks Strategy (NNS). The complete TAS strategy has been performed in a Big-Data environment. The experimental results presented in this paper reveal that NNS strategy performs much better than PNM strategy to analyze the tweet sentiment. In addition, this research analyzes the viability of the TAS strategy to obtain the global trend in a political context make up by multiple parties with an error lower than 23%.

Keywords: political tendency, prediction, sentiment analysis, Twitter

Procedia PDF Downloads 238
27810 Fuzzy Sentiment Analysis of Customer Product Reviews

Authors: Samaneh Nadali, Masrah Azrifah Azmi Murad

Abstract:

As a result of the growth of the web, people are able to express their views and opinions. They can now post reviews of products at merchant sites and express their views on almost anything in internet forums, discussion groups, and blogs. Therefore, the number of product reviews has grown rapidly. The large numbers of reviews make it difficult for manufacturers or businesses to automatically classify them into different semantic orientations (positive, negative, and neutral). For sentiment classification, most existing methods utilize a list of opinion words whereas this paper proposes a fuzzy approach for evaluating sentiments expressed in customer product reviews, to predict the strength levels (e.g. very weak, weak, moderate, strong and very strong) of customer product reviews by combinations of adjective, adverb and verb. The proposed fuzzy approach has been tested on eight benchmark datasets and obtained 74% accuracy, which leads to help the organization with a more clear understanding of customer's behavior in support of business planning process.

Keywords: fuzzy logic, customer product review, sentiment analysis

Procedia PDF Downloads 363
27809 Information Disclosure And Financial Sentiment Index Using a Machine Learning Approach

Authors: Alev Atak

Abstract:

In this paper, we aim to create a financial sentiment index by investigating the company’s voluntary information disclosures. We retrieve structured content from BIST 100 companies’ financial reports for the period 1998-2018 and extract relevant financial information for sentiment analysis through Natural Language Processing. We measure strategy-related disclosures and their cross-sectional variation and classify report content into generic sections using synonym lists divided into four main categories according to their liquidity risk profile, risk positions, intra-annual information, and exposure to risk. We use Word Error Rate and Cosin Similarity for comparing and measuring text similarity and derivation in sets of texts. In addition to performing text extraction, we will provide a range of text analysis options, such as the readability metrics, word counts using pre-determined lists (e.g., forward-looking, uncertainty, tone, etc.), and comparison with reference corpus (word, parts of speech and semantic level). Therefore, we create an adequate analytical tool and a financial dictionary to depict the importance of granular financial disclosure for investors to identify correctly the risk-taking behavior and hence make the aggregated effects traceable.

Keywords: financial sentiment, machine learning, information disclosure, risk

Procedia PDF Downloads 94
27808 Sunspot Cycles: Illuminating Humanity's Mysteries

Authors: Aghamusa Azizov

Abstract:

This study investigates the correlation between solar activity and sentiment in news media coverage, using a large-scale dataset of solar activity since 1750 and over 15 million articles from "The New York Times" dating from 1851 onwards. Employing Pearson's correlation coefficient and multiple Natural Language Processing (NLP) tools—TextBlob, Vader, and DistillBERT—the research examines the extent to which fluctuations in solar phenomena are reflected in the sentiment of historical news narratives. The findings reveal that the correlation between solar activity and media sentiment is generally negligible, suggesting a weak influence of solar patterns on the portrayal of events in news media. Notably, a moderate positive correlation was observed between the sentiments derived from TextBlob and Vader, indicating consistency across NLP tools. The analysis provides insights into the historical impact of solar activity on human affairs and highlights the importance of using multiple analytical methods to understand complex relationships in large datasets. The study contributes to the broader understanding of how extraterrestrial factors may intersect with media-reported events and underlines the intricate nature of interdisciplinary research in the data science and historical domains.

Keywords: solar activity correlation, media sentiment analysis, natural language processing, historical event patterns

Procedia PDF Downloads 77
27807 Network and Sentiment Analysis of U.S. Congressional Tweets

Authors: Chaitanya Kanakamedala, Hansa Pradhan, Carter Gilbert

Abstract:

Social media platforms, such as Twitter, are excellent datasets for understanding human interactions and sentiments. This report explores social dynamics among US Congressional members through a network analysis applied to a dataset of tweets spanning 2008 to 2017 from the ’US Congressional Tweets Dataset’. In this report, we preform network analysis where connections between users (edges) are established based on a similarity threshold: two tweets are connected if the tweets they post are similar. By utilizing the Natural Language Toolkit (NLTK) and NetworkX, we quantified tweet similarity and constructed a graph comprising various interconnected components. Each component represents a cluster of users with closely aligned content. We then preform sentiment analysis on each cluster to explore the prevalent emotions and opinions within these groups. Our findings reveal that despite the initial expectation of distinct ideological divisions typically aligning with party lines, the analysis exposed a high degree of topical convergence across tweets from different political affiliations. The analysis preformed in this report not only highlights the potential of social media as a tool for political communication but also suggests a complex layer of interaction that transcends traditional partisan boundaries, reflecting a complicated landscape of politics in the digital age.

Keywords: natural language processing, sentiment analysis, centrality analysis, topic modeling

Procedia PDF Downloads 33
27806 Evidence of a Negativity Bias in the Keywords of Scientific Papers

Authors: Kseniia Zviagintseva, Brett Buttliere

Abstract:

Science is fundamentally a problem-solving enterprise, and scientists pay more attention to the negative things, that cause them dissonance and negative affective state of uncertainty or contradiction. While this is agreed upon by philosophers of science, there are few empirical demonstrations. Here we examine the keywords from those papers published by PLoS in 2014 and show with several sentiment analyzers that negative keywords are studied more than positive keywords. Our dataset is the 927,406 keywords of 32,870 scientific articles in all fields published in 2014 by the journal PLOS ONE (collected from Altmetric.com). Counting how often the 47,415 unique keywords are used, we can examine whether those negative topics are studied more than positive. In order to find the sentiment of the keywords, we utilized two sentiment analysis tools, Hu and Liu (2004) and SentiStrength (2014). The results below are for Hu and Liu as these are the less convincing results. The average keyword was utilized 19.56 times, with half of the keywords being utilized only 1 time and the maximum number of uses being 18,589 times. The keywords identified as negative were utilized 37.39 times, on average, with the positive keywords being utilized 14.72 times and the neutral keywords - 19.29, on average. This difference is only marginally significant, with an F value of 2.82, with a p of .05, but one must keep in mind that more than half of the keywords are utilized only 1 time, artificially increasing the variance and driving the effect size down. To examine more closely, we looked at those top 25 most utilized keywords that have a sentiment. Among the top 25, there are only two positive words, ‘care’ and ‘dynamics’, in position numbers 5 and 13 respectively, with all the rest being identified as negative. ‘Diseases’ is the most studied keyword with 8,790 uses, with ‘cancer’ and ‘infectious’ being the second and fourth most utilized sentiment-laden keywords. The sentiment analysis is not perfect though, as the words ‘diseases’ and ‘disease’ are split by taking 1st and 3rd positions. Combining them, they remain as the most common sentiment-laden keyword, being utilized 13,236 times. More than just splitting the words, the sentiment analyzer logs ‘regression’ and ‘rat’ as negative, and these should probably be considered false positives. Despite these potential problems, the effect is apparent, as even the positive keywords like ‘care’ could or should be considered negative, since this word is most commonly utilized as a part of ‘health care’, ‘critical care’ or ‘quality of care’ and generally associated with how to improve it. All in all, the results suggest that negative concepts are studied more, also providing support for the notion that science is most generally a problem-solving enterprise. The results also provide evidence that negativity and contradiction are related to greater productivity and positive outcomes.

Keywords: bibliometrics, keywords analysis, negativity bias, positive and negative words, scientific papers, scientometrics

Procedia PDF Downloads 186
27805 Survey on Arabic Sentiment Analysis in Twitter

Authors: Sarah O. Alhumoud, Mawaheb I. Altuwaijri, Tarfa M. Albuhairi, Wejdan M. Alohaideb

Abstract:

Large-scale data stream analysis has become one of the important business and research priorities lately. Social networks like Twitter and other micro-blogging platforms hold an enormous amount of data that is large in volume, velocity and variety. Extracting valuable information and trends out of these data would aid in a better understanding and decision-making. Multiple analysis techniques are deployed for English content. Moreover, one of the languages that produce a large amount of data over social networks and is least analyzed is the Arabic language. The proposed paper is a survey on the research efforts to analyze the Arabic content in Twitter focusing on the tools and methods used to extract the sentiments for the Arabic content on Twitter.

Keywords: big data, social networks, sentiment analysis, twitter

Procedia PDF Downloads 576
27804 Beyond Text: Unveiling the Emotional Landscape in Academic Writing

Authors: Songyun Chen

Abstract:

Recent scholarly attention to sentiment analysis has provided researchers with a deeper understanding of how emotions are conveyed in writing and leveraged by academic authors as a persuasive tool. Using the National Research Council (NRC) Sentiment Lexicons (version 1.0) created by the National Research Council Canada, this study examined specific emotions in research articles (RAs) across four disciplines, including literature, education, biology, and computer & information science based on four datasets totaling over three million tokens, aiming to reveal how the emotions are conveyed by authors in academic writing. The results showed that four emotions—trust, anticipation, joy, and surprise—were observed in all four disciplines, while sadness emotion was spotted solely in literature. With the emotion of trust being overwhelmingly prominent, the rest emotions varied significantly across disciplines. The findings contribute to our understanding of emotion strategy applied in academic writing and genre characteristics of RAs.

Keywords: sentiment analysis, specific emotions, emotional landscape, research articles, academic writing

Procedia PDF Downloads 28
27803 AI-Based Techniques for Online Social Media Network Sentiment Analysis: A Methodical Review

Authors: A. M. John-Otumu, M. M. Rahman, O. C. Nwokonkwo, M. C. Onuoha

Abstract:

Online social media networks have long served as a primary arena for group conversations, gossip, text-based information sharing and distribution. The use of natural language processing techniques for text classification and unbiased decision-making has not been far-fetched. Proper classification of this textual information in a given context has also been very difficult. As a result, we decided to conduct a systematic review of previous literature on sentiment classification and AI-based techniques that have been used in order to gain a better understanding of the process of designing and developing a robust and more accurate sentiment classifier that can correctly classify social media textual information of a given context between hate speech and inverted compliments with a high level of accuracy by assessing different artificial intelligence techniques. We evaluated over 250 articles from digital sources like ScienceDirect, ACM, Google Scholar, and IEEE Xplore and whittled down the number of research to 31. Findings revealed that Deep learning approaches such as CNN, RNN, BERT, and LSTM outperformed various machine learning techniques in terms of performance accuracy. A large dataset is also necessary for developing a robust sentiment classifier and can be obtained from places like Twitter, movie reviews, Kaggle, SST, and SemEval Task4. Hybrid Deep Learning techniques like CNN+LSTM, CNN+GRU, CNN+BERT outperformed single Deep Learning techniques and machine learning techniques. Python programming language outperformed Java programming language in terms of sentiment analyzer development due to its simplicity and AI-based library functionalities. Based on some of the important findings from this study, we made a recommendation for future research.

Keywords: artificial intelligence, natural language processing, sentiment analysis, social network, text

Procedia PDF Downloads 115
27802 Automatic Lexicon Generation for Domain Specific Dataset for Mining Public Opinion on China Pakistan Economic Corridor

Authors: Tayyaba Azim, Bibi Amina

Abstract:

The increase in the popularity of opinion mining with the rapid growth in the availability of social networks has attracted a lot of opportunities for research in the various domains of Sentiment Analysis and Natural Language Processing (NLP) using Artificial Intelligence approaches. The latest trend allows the public to actively use the internet for analyzing an individual’s opinion and explore the effectiveness of published facts. The main theme of this research is to account the public opinion on the most crucial and extensively discussed development projects, China Pakistan Economic Corridor (CPEC), considered as a game changer due to its promise of bringing economic prosperity to the region. So far, to the best of our knowledge, the theme of CPEC has not been analyzed for sentiment determination through the ML approach. This research aims to demonstrate the use of ML approaches to spontaneously analyze the public sentiment on Twitter tweets particularly about CPEC. Support Vector Machine SVM is used for classification task classifying tweets into positive, negative and neutral classes. Word2vec and TF-IDF features are used with the SVM model, a comparison of the trained model on manually labelled tweets and automatically generated lexicon is performed. The contributions of this work are: Development of a sentiment analysis system for public tweets on CPEC subject, construction of an automatic generation of the lexicon of public tweets on CPEC, different themes are identified among tweets and sentiments are assigned to each theme. It is worth noting that the applications of web mining that empower e-democracy by improving political transparency and public participation in decision making via social media have not been explored and practised in Pakistan region on CPEC yet.

Keywords: machine learning, natural language processing, sentiment analysis, support vector machine, Word2vec

Procedia PDF Downloads 148
27801 Sarcasm Recognition System Using Hybrid Tone-Word Spotting Audio Mining Technique

Authors: Sandhya Baskaran, Hari Kumar Nagabushanam

Abstract:

Sarcasm sentiment recognition is an area of natural language processing that is being probed into in the recent times. Even with the advancements in NLP, typical translations of words, sentences in its context fail to provide the exact information on a sentiment or emotion of a user. For example, if something bad happens, the statement ‘That's just what I need, great! Terrific!’ is expressed in a sarcastic tone which could be misread as a positive sign by any text-based analyzer. In this paper, we are presenting a unique real time ‘word with its tone’ spotting technique which would provide the sentiment analysis for a tone or pitch of a voice in combination with the words being expressed. This hybrid approach increases the probability for identification of special sentiment like sarcasm much closer to the real world than by mining text or speech individually. The system uses a tone analyzer such as YIN-FFT which extracts pitch segment-wise that would be used in parallel with a speech recognition system. The clustered data is classified for sentiments and sarcasm score for each of it determined. Our Simulations demonstrates the improvement in f-measure of around 12% compared to existing detection techniques with increased precision and recall.

Keywords: sarcasm recognition, tone-word spotting, natural language processing, pitch analyzer

Procedia PDF Downloads 293