Search results for: Tweets
73 Short Text Classification for Saudi Tweets
Authors: Asma A. Alsufyani, Maram A. Alharthi, Maha J. Althobaiti, Manal S. Alharthi, Huda Rizq
Abstract:
Twitter is one of the most popular microblogging sites that allows users to publish short text messages called 'tweets'. Increasing the number of accounts to follow (followings) increases the number of tweets that will be displayed from different topics in an unclassified manner in the timeline of the user. Therefore, it can be a vital solution for many Twitter users to have their tweets in a timeline classified into general categories to save the user’s time and to provide easy and quick access to tweets based on topics. In this paper, we developed a classifier for timeline tweets trained on a dataset consisting of 3600 tweets in total, which were collected from Saudi Twitter and annotated manually. We experimented with the well-known Bag-of-Words approach to text classification, and we used support vector machines (SVM) in the training process. The trained classifier performed well on a test dataset, with an average F1-measure equal to 92.3%. The classifier has been integrated into an application, which practically proved the classifier’s ability to classify timeline tweets of the user.Keywords: corpus creation, feature extraction, machine learning, short text classification, social media, support vector machine, Twitter
Procedia PDF Downloads 154
72 An Enhanced Support Vector Machine Based Approach for Sentiment Classification of Arabic Tweets of Different Dialects
Authors: Gehad S. Kaseb, Mona F. Ahmed
Abstract:
Arabic Sentiment Analysis (SA) is an active research field with many open problems, yet few studies apply SA to Arabic dialects. This paper proposes a set of pre-processing steps and a modified methodology to improve accuracy using standard Support Vector Machine (SVM) classification. The paper works on two datasets, the Arabic Sentiment Tweets Dataset (ASTD) and the Extended Arabic Tweets Sentiment Dataset (Extended-AATSD), both publicly available for academic use. The results show that the classification accuracy approaches 86%.
Keywords: Arabic, classification, sentiment analysis, tweets
Procedia PDF Downloads 147
71 The Paralinguistic Function of Emojis in Twitter Communication
Authors: Yasmin Tantawi, Mary Beth Rosson
Abstract:
In response to the dearth of information about emoji use for different purposes in different settings, this paper investigates the paralinguistic function of emojis within Twitter communication in the United States. To conduct this investigation, the Twitter feeds from 16 population centers spread throughout the United States were collected from the Twitter public API. One hundred tweets were collected from each population center, totaling to 1,600 tweets. Tweets containing emojis were next extracted using the “emot” Python package; these were then analyzed via the IBM Watson API Natural Language Understanding module to identify the topics discussed. A manual content analysis was then conducted to ascertain the paralinguistic and emotional features of the emojis used in these tweets. We present our characterization of emoji usage in Twitter and discuss implications for the design of Twitter and other text-based communication tools.Keywords: computer-mediated communication, content analysis, paralinguistics, sociology
Procedia PDF Downloads 160
70 Emotions in Health Tweets: Analysis of American Government Official Accounts
Authors: García López
Abstract:
The Government Departments of Health have the task of informing and educating citizens about public health issues. For this, they use channels like Twitter, key in the search for health information and the propagation of content. The tweets, important in the virality of the content, may contain emotions that influence the contagion and exchange of knowledge. The goal of this study is to perform an analysis of the emotional projection of health information shared on Twitter by official American accounts: the disease control account CDCgov, National Institutes of Health, NIH, the government agency HHSGov, and the professional organization PublicHealth. For this, we used Tone Analyzer, an International Business Machines Corporation (IBM) tool specialized in emotion detection in text, corresponding to the categorical model of emotion representation. For 15 days, all tweets from these accounts were analyzed with the emotional analysis tool in text. The results showed that their tweets contain an important emotional load, a determining factor in the success of their communications. This exposes that official accounts also use subjective language and contain emotions. The predominance of emotion joy over sadness and the strong presence of emotions in their tweets stimulate the virality of content, a key in the work of informing that government health departments have.Keywords: emotions in tweets, emotion detection in the text, health information on Twitter, American health official accounts, emotions on Twitter, emotions and content
Procedia PDF Downloads 142
69 Network and Sentiment Analysis of U.S. Congressional Tweets
Authors: Chaitanya Kanakamedala, Hansa Pradhan, Carter Gilbert
Abstract:
Social media platforms such as Twitter are rich data sources for understanding human interactions and sentiments. This report explores social dynamics among US Congressional members through a network analysis applied to a dataset of tweets spanning 2008 to 2017 from the 'US Congressional Tweets Dataset'. We perform a network analysis in which connections between users (edges) are established based on a similarity threshold: two users are connected if the tweets they post are similar. Using the Natural Language Toolkit (NLTK) and NetworkX, we quantified tweet similarity and constructed a graph comprising various interconnected components. Each component represents a cluster of users with closely aligned content. We then perform sentiment analysis on each cluster to explore the prevalent emotions and opinions within these groups. Our findings reveal that, despite the initial expectation of distinct ideological divisions aligning with party lines, there is a high degree of topical convergence across tweets from different political affiliations. The analysis performed in this report not only highlights the potential of social media as a tool for political communication but also suggests a complex layer of interaction that transcends traditional partisan boundaries, reflecting a complicated landscape of politics in the digital age.
Keywords: natural language processing, sentiment analysis, centrality analysis, topic modeling
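The similarity-threshold construction described in this abstract can be sketched as follows. This is a minimal illustration only: it substitutes Jaccard overlap of lowercase tokens for the NLTK-based similarity and a small union-find for NetworkX, since the abstract does not specify either step in detail. The function names, the 0.5 threshold, and the sample users are hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_components(user_texts: dict, threshold: float) -> list:
    """Link users whose token sets meet the threshold; return the clusters."""
    tokens = {user: set(text.lower().split()) for user, text in user_texts.items()}
    parent = {user: user for user in tokens}

    def find(u):
        # Walk to the representative of u's cluster, compressing the path.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    # An edge exists between two users when their similarity meets the threshold.
    for u, v in combinations(tokens, 2):
        if jaccard(tokens[u], tokens[v]) >= threshold:
            parent[find(u)] = find(v)

    # Each connected component is one cluster of users.
    clusters = {}
    for user in tokens:
        clusters.setdefault(find(user), set()).add(user)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical toy data, not taken from the dataset named in the abstract.
sample = {
    "rep_a": "we must pass the health care bill",
    "rep_b": "pass the health care bill now",
    "rep_c": "proud to support our farmers today",
}
clusters = similarity_components(sample, threshold=0.5)
```

In this toy input the first two users share five of eight distinct tokens, so they fall into one component, while the third user forms a component alone. A per-cluster sentiment step would then operate on each returned list.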
Procedia PDF Downloads 33
68 Chinese “Wolf Warrior” Diplomacy And Foreign Public Opinion
Authors: Chaohong Pan
Abstract:
Through public diplomacy on social media, governments have attempted to influence foreign public opinion. What is the impact of digital public diplomacy? Public diplomacy research often relies on content analysis to study the strategies employed by communicators but has rarely examined its actual impact on the audience. In addition, we do not know if giving a communicator an explicit label, as Twitter does with “government account”, would change the effects of the messages. Can the government label reduce the persuasiveness of public diplomacy messages by sending a warning signal? Using a 2 × 2 survey experiment, the present paper contributes to the study of public diplomacy by randomly exposing American participants to four types of tweets from Chinese diplomats. The stimulus materials vary in terms of the tweets’ content (“positive-China” vs. “negative-US”) and Twitter government labels (with vs. without the labels). I find that positive tweets about China have a significant positive effect on Americans’ attitudes toward China, whereas negative tweets about the US have little effect on their opinions. Furthermore, positive-China tweets are effective only on China-related issues, which indicates that Chinese diplomats’ tweets have limited effects on shaping a foreign audience’s attitudes toward their own country. Lastly, I find that labels largely have no impact on a diplomatic tweet’s effect. These results contribute to our understanding of the effects of public diplomacy in the digital age.
Keywords: public diplomacy, china, foreign public opinion, twitter
Procedia PDF Downloads 191
67 Social Media Mining with R. Twitter Analyses
Authors: Diana Codat
Abstract:
Tweets' analysis is part of text mining. Each document is a written text. It's possible to apply the usual text search techniques, in particular by switching to the bag-of-words representation. But the tweets induce peculiarities. Some may enrich the analysis. Thus, their length is calibrated (at least as far as public messages are concerned), special characters make it possible to identify authors (@) and themes (#), the tweet and retweet mechanisms make it possible to follow the diffusion of the information. Conversely, other characteristics may disrupt the analyzes. Because space is limited, authors often use abbreviations, emoticons to express feelings, and they do not pay much attention to spelling. All this creates noise that can complicate the task. The tweets carry a lot of potentially interesting information. Their exploitation is one of the main axes of the analysis of the social networks. We show how to access Twitter-related messages. We will initiate a study of the properties of the tweets, and we will follow up on the exploitation of the content of the messages. We will work under R with the package 'twitteR'. The study of tweets is a strong focus of analysis of social networks because Twitter has become an important vector of communication. This example shows that it is easy to initiate an analysis from data extracted directly online. The data preparation phase is of great importance.Keywords: data mining, language R, social networks, Twitter
Procedia PDF Downloads 184
66 Syndromic Surveillance Framework Using Tweets Data Analytics
Authors: David Ming Liu, Benjamin Hirsch, Bashir Aden
Abstract:
Syndromic surveillance aims to detect or predict disease outbreaks through the analysis of medical data sources. Using social media data such as tweets for syndromic surveillance is becoming increasingly popular, aided by open platforms for data collection and by the microblogging text and mobile geographic location features of such data. In this paper, a Syndromic Surveillance Framework with a machine learning kernel is presented, based on tweet data analytics. Influenza is used as the test disease, and three cities of the United Arab Emirates (Abu Dhabi, Al Ain and Dubai) are used as the trial areas. Hospital case data provided by the Health Authority of Abu Dhabi (HAAD) are used for correlation purposes. In our model, a Latent Dirichlet allocation (LDA) engine is adapted to perform supervised learning classification, and N-fold cross-validation confusion matrices are given as the simulation results, with an overall system recall of 85.595% achieved.
Keywords: Syndromic surveillance, Tweets, Machine Learning, data mining, Latent Dirichlet allocation (LDA), Influenza
Procedia PDF Downloads 115
65 An Investigation of Sentiment and Themes from Twitter for Brexit in 2016
Authors: Anas Alsuhaibani
Abstract:
Observing debate and discussion over social media has been found to be a promising tool to investigate different types of opinion. On 23 June 2016, Brexit voters in the UK decided to depart from the EU, with 51.9% voting to leave. On Twitter, there had been a massive debate in this context, and the hashtag Brexit was allocated as number six of the most tweeted hashtags across the globe in 2016. The study aimed to investigate the sentiment and themes expressed in a sample of tweets during a political event (Brexit) in 2016. A sentiment and thematic analysis was conducted on 1304 randomly selected tweets tagged with the hashtag Brexit in Twitter for the period from 10 June 2016 to 7 July 2016. The data were coded manually into two code frames, sentiment and thematic, and the reliability of coding was assessed for both codes. The sentiment analysis of the selected sample found that 45.63% of tweets conveyed negative emotions while there were only 10.43% conveyed positive emotions. It also surprisingly resulted that 29.37% were factual tweets, where the tweeter expressed no sentiment and the tweet conveyed a fact. For the thematic analysis, the economic theme dominated by 23.41%, and almost half of its discussion was related to business within the UK and the UK and global stock markets. The study reported that the current UK government and relation to campaign themes were the most negative themes. Both sentiment and thematic analyses found that tweets with more than one opinion or theme were rare, 8.29% and 6.13%, respectively.Keywords: Brexit, political opinion mining, social media, twitter
Procedia PDF Downloads 213
64 Collision Theory Based Sentiment Detection Using Discourse Analysis in Hadoop
Authors: Anuta Mukherjee, Saswati Mukherjee
Abstract:
Data is growing every day. Social networking sites such as Twitter are becoming an integral part of our daily lives and contribute substantially to this growth. Twitter is a rich source for sentiment detection and mining, since people often express honest opinions through tweets. However, although sentiment analysis of text is a well-researched topic, applying it to Twitter data poses additional challenges, since tweets are unstructured, contain abbreviations, and do not follow strict grammatical rules. We have employed collision theory to perform sentiment analysis on Twitter data. We have also incorporated discourse analysis into the collision theory based model to detect sentiment from tweets more accurately. In addition, we have used the retweet field to assign weights to certain tweets and obtained the overall weightage of a topic provided in the form of a query. Hadoop has been exploited for speed. Our experiments show effective results.
Keywords: sentiment analysis, twitter, collision theory, discourse analysis
Procedia PDF Downloads 534
63 Automatic Lexicon Generation for Domain Specific Dataset for Mining Public Opinion on China Pakistan Economic Corridor
Authors: Tayyaba Azim, Bibi Amina
Abstract:
The growing popularity of opinion mining, together with the rapid growth of social networks, has opened many research opportunities in Sentiment Analysis and Natural Language Processing (NLP) using Artificial Intelligence approaches. The latest trend allows the public to actively use the internet for analyzing an individual’s opinion and exploring the effectiveness of published facts. The main theme of this research is to account for public opinion on one of the most crucial and extensively discussed development projects, the China Pakistan Economic Corridor (CPEC), considered a game changer due to its promise of bringing economic prosperity to the region. To the best of our knowledge, the theme of CPEC has not yet been analyzed for sentiment determination through a machine learning (ML) approach. This research aims to demonstrate the use of ML approaches to automatically analyze public sentiment in tweets about CPEC. A Support Vector Machine (SVM) is used to classify tweets into positive, negative and neutral classes. Word2vec and TF-IDF features are used with the SVM model, and a comparison of the model trained on manually labelled tweets and on the automatically generated lexicon is performed. The contributions of this work are: the development of a sentiment analysis system for public tweets on the subject of CPEC; the automatic generation of a lexicon from public tweets on CPEC; and the identification of different themes among the tweets, with sentiments assigned to each theme. It is worth noting that applications of web mining that empower e-democracy by improving political transparency and public participation in decision making via social media have not yet been explored or practised in Pakistan with respect to CPEC.
Keywords: machine learning, natural language processing, sentiment analysis, support vector machine, Word2vec
Procedia PDF Downloads 148
62 Composite Kernels for Public Emotion Recognition from Twitter
Authors: Chien-Hung Chen, Yan-Chun Hsing, Yung-Chun Chang
Abstract:
The Internet has grown into a powerful medium for information dispersion and social interaction that leads to a rapid growth of social media which allows users to easily post their emotions and perspectives regarding certain topics online. Our research aims at using natural language processing and text mining techniques to explore the public emotions expressed on Twitter by analyzing the sentiment behind tweets. In this paper, we propose a composite kernel method that integrates tree kernel with the linear kernel to simultaneously exploit both the tree representation and the distributed emotion keyword representation to analyze the syntactic and content information in tweets. The experiment results demonstrate that our method can effectively detect public emotion of tweets while outperforming the other compared methods.Keywords: emotion recognition, natural language processing, composite kernel, sentiment analysis, text mining
Procedia PDF Downloads 217
61 Twitter Sentiment Analysis during the Lockdown on New-Zealand
Authors: Smah Almotiri
Abstract:
Sentiment analysis is one of the most common fields of natural language processing (NLP). The feeling implied in a text can be successfully mined for various events using sentiment analysis. Twitter is viewed as a reliable data source for sentiment analysis studies, since people used social media to receive and exchange different types of data on a broad scale during the COVID-19 epidemic. The processing of such data may aid in making critical decisions on how to keep the situation under control. The aim of this research is to look at how sentiment differed in a single geographic region during the lockdown at two different times. A total of 1162 tweets related to the COVID-19 pandemic lockdown were analyzed, collected using keyword hashtags (lockdown, COVID-19). The first sample of tweets was from March 23, 2020, until April 23, 2020, and the second sample, for the following year, was from March 1, 2020, until April 4, 2020. Natural language processing (NLP), which is a form of artificial intelligence, was used to calculate the sentiment value of all of the tweets using the AFINN lexicon sentiment analysis method. The findings revealed that sentiment at both times during the region's lockdown was positive in the samples of this study, which are specific to the geographical area of New Zealand. This research suggests applying machine learning sentiment methods such as Crystal Feel and extending the sample by collecting more tweets over a longer period of time.
Keywords: sentiment analysis, Twitter analysis, lockdown, Covid-19, AFINN, NodeJS
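Lexicon-based scoring of the kind this abstract attributes to AFINN can be sketched as below. This is an illustration under stated assumptions, not the study's code: the study's keywords mention NodeJS, whereas this sketch uses Python, and the tiny lexicon is made up for the example (its values only imitate AFINN's integer range of -5 to +5 and are not the real AFINN entries). The sample tweets and all function names are hypothetical.

```python
import re

# Toy lexicon with invented scores in AFINN's -5..+5 range.
# These are NOT the real AFINN values; they exist only for illustration.
TOY_LEXICON = {
    "safe": 2,
    "grateful": 3,
    "proud": 2,
    "worried": -2,
    "lonely": -2,
    "terrible": -3,
}


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of every word in the text; unknown words score 0."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


def label(score):
    """Map a numeric score to a coarse polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def mean_score(tweets):
    """Average sentiment score over a sample of tweets (0.0 for an empty sample)."""
    return sum(sentiment_score(t) for t in tweets) / len(tweets) if tweets else 0.0


# Hypothetical samples standing in for the two collection periods.
first_sample = ["Grateful and proud of our team #lockdown", "Feeling lonely in #lockdown"]
second_sample = ["Stay safe everyone", "Worried about another lockdown"]

first_mean = mean_score(first_sample)
second_mean = mean_score(second_sample)
```

Comparing `label(first_mean)` with `label(second_mean)` mirrors the abstract's comparison of overall sentiment between two periods, although a real analysis would use the full lexicon and the collected tweets.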
Procedia PDF Downloads 190
60 The Fefe Indices: The Direction of Donald Trump’s Tweets Effect on the Stock Market
Authors: Sergio Andres Rojas, Julian Benavides Franco, Juan Tomas Sayago
Abstract:
A growing body of research demonstrates how market mood affects financial markets, but the primary goal of earlier work on Trump's tweets has been to show how they impacted US interest rate volatility. Following that lead, this work evaluates the effect that Trump's tweets had during his presidency on local and international stock markets, considering not just volatility but the direction of the movement. Three indexes for Trump's tweets were created, relating his activity to movements in the S&P500 using natural language analysis and machine learning algorithms. The indexes consider Trump's tweet activity and the positive or negative market sentiment it might inspire. The first explores the relationship between tweets and negative movements in the S&P500; the second explores positive movements, while the third explores the difference between up and down movements. A pseudo-investment strategy using the indexes produced statistically significant above-average abnormal returns. The findings also showed that the pseudo strategy generated a higher return in the local market if applied to intraday data. However, only a negative market sentiment caused this effect on daily data. These results suggest that the market reacted primarily to a negative idea reflected in the negative index. In the international market, it is not possible to identify a pervasive effect. A rolling window regression model was also estimated. The result shows that the impact on the local and international markets is heterogeneous, time-varying, and differentiated by market sentiment. However, the negative sentiment was more prone to have a significant correlation most of the time.
Keywords: market sentiment, Twitter market sentiment, machine learning, natural language analysis
Procedia PDF Downloads 62
59 Unravelling the Interplay: Chinese Government Tweets, Anti-US Propaganda Cartoons and Social Media Dynamics in US-China Relations
Authors: Mitchell Gallagher
Abstract:
This investigation explores the relationship between Chinese government ministers' tweets and publicized anti-US propaganda political cartoons by Chinese state media. Defining "anti-US" tweets as expressions with negative impressions about the United States, its policies, or cultural values, the study considers their context-dependent nature. Analyzing social media's growing role, this research probes the Chinese government's attitudes toward the United States. While China traditionally adhered to a non-interference stance, instances of verbal and visual retorts occurred, driven by efforts to enhance soft power and counter unfavorable portrayals. To navigate global challenges, China embraced proactive image construction, utilizing political cartoons as a messaging tool. As Sino-American political relations continue deteriorating, it has become increasingly commonplace for Chinese officials to circulate anti-US messages and negative impressions of the United States via tweets. The present study is committed to inspecting the nature and frequency of political cartoons casting the United States in an unfavorable light, with the aim of gaining a comprehensive understanding the degree to which the Chinese government and state-affiliated media are aligned in their corresponding messaging.Keywords: China, political cartoons, propaganda, twitter, social media
Procedia PDF Downloads 71
58 Historical Hashtags: An Investigation of the #CometLanding Tweets
Authors: Noor Farizah Ibrahim, Christopher Durugbo
Abstract:
This study aims to investigate how the Twittersphere reacted during the historic robotic landing on a comet. The event concerns Philae, a robotic lander from the European Space Agency (ESA), which successfully made the first-ever rendezvous and touchdown of its kind on a comet nucleus on November 12, 2014. In order to understand how Twitter is used in practice to spread messages about historical events, we analysed one week of tweet feeds containing the #CometLanding hashtag. We studied the trends of tweets, the diffusion of the information and the characteristics of the social network created. The results indicated that Twitter as a platform enables online communities to engage with and spread the historical event through the social media network (e.g. tweets, retweets, mentions and replies). In addition, it was found that comprehensible hashtags could influence users to follow the same tweet stream, in contrast to laborious hashtags that users in online communities found difficult to understand.
Keywords: diffusion of information, hashtag, social media, Twitter
Procedia PDF Downloads 325
57 Semantic Network Analysis of the Saudi Women Driving Decree
Authors: Dania Aljouhi
Abstract:
September 26th, 2017, is a historic date for all women in Saudi Arabia. On that day, Saudi Arabia announced the decree on allowing Saudi women to drive. With the advent of vision 2030 and its goal to empower women and increase their participation in Saudi society, we see how Saudis’ Twitter users deliberate the 2017 decree from different social, cultural, religious, economic and political factors. This topic bridges social media 'Twitter,' gender and social-cultural studies to offer insights into how Saudis’ tweets reflect a broader discourse on Saudi women in the age of social media. The present study aims to explore the meanings and themes that emerge by Saudis’ Twitter users in response to the 2017 royal decree on women driving. The sample used in the current study involves (n= 1000) tweets that were collected from Sep 2017 to March 2019 to account for the Saudis’ tweets before and after implementing the decree. The paper uses semantic and thematic network analysis methods to examine the Saudis’ Twitter discourse on the women driving issue. The paper argues that Twitter as a platform has mediated the discourse of women driving among the Saudi community and facilitated social changes. Finally, framing theory (Goffman, 1974) and Networked framing (Meraz & Papacharissi 2013) are both used to explain the tweets on the decree of allowing Saudi women to drive based on # Saudi women-driving-cars.Keywords: Saudi Arabia, women, Twitter, semantic network analysis, framing
Procedia PDF Downloads 154
56 Interpretation of the Russia-Ukraine 2022 War via N-Gram Analysis
Authors: Elcin Timur Cakmak, Ayse Oguzlar
Abstract:
This study analyses tweets sent by Twitter users about the Russia-Ukraine war using bigram and trigram methods. On February 24, 2022, Russian President Vladimir Putin declared a military operation against Ukraine, and all eyes were turned to this war. Many people living in Russia and Ukraine reacted to and protested against the war, and expressed deep concern because they felt the safety of their families and their futures were at stake. Most people, especially those living in Russia and Ukraine, express their views on the war in different ways, the most popular of which is social media. Many people prefer to convey their feelings using Twitter, one of the most frequently used social media tools. Since the beginning of the war, thousands of tweets about it have been posted on Twitter from many countries of the world. These tweets are extracted for analysis through the Twitter API and analysed with the Python programming language. The aim of the study is to find the word sequences in these tweets by the n-gram method, which is widely used in computational linguistics and natural language processing. The tweet language used in the study is English. The data set consists of data obtained from Twitter between February 24, 2022, and April 24, 2022. The tweets obtained from Twitter using the #ukraine, #russia, #war, #putin, #zelensky hashtags together were captured as raw data, and the remaining tweets were included in the analysis stage after they were cleaned through the preprocessing stage. In the data analysis part, sentiments are identified to show what messages people send about the war on Twitter. Negative messages make up the majority of all the tweets, at 63.6%. Furthermore, the most frequently used bigram and trigram word groups are found. The most frequently used word groups are “he, is”, “I, do”, “I, am” for bigrams and “I, do, not”, “I, am, not”, “I, can, not” for trigrams. In the machine learning phase, the accuracy of classifications is measured by the Classification and Regression Trees (CART) and Naïve Bayes (NB) algorithms. The algorithms are used separately for bigrams and trigrams. For bigrams, the highest accuracy and F-measure values were obtained by the NB algorithm and the highest precision and recall values by the CART algorithm. For trigrams, the highest accuracy, precision, and F-measure values were achieved by the CART algorithm, and the highest recall by NB.
Keywords: classification algorithms, machine learning, sentiment analysis, Twitter
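Counting bigrams and trigrams, as this abstract describes, can be sketched with the Python standard library alone. The tokenizer, the function names, and the three sample tweets are assumptions made for illustration; the abstract does not state which tooling the authors used beyond Python.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return list(zip(*(tokens[i:] for i in range(n))))


def top_ngrams(tweets, n, k):
    """Count n-grams within each tweet and return the k most frequent."""
    counts = Counter()
    for tweet in tweets:
        # Simple lowercase word tokenizer; n-grams never span two tweets.
        tokens = re.findall(r"[a-z']+", tweet.lower())
        counts.update(ngrams(tokens, n))
    return counts.most_common(k)


# Hypothetical tweets, not drawn from the study's data set.
tweets = [
    "I do not want this war",
    "I do not understand this war",
    "He is not listening",
]
top_bigrams = top_ngrams(tweets, 2, 3)
top_trigrams = top_ngrams(tweets, 3, 1)
```

With this toy input the single most frequent trigram is ("i", "do", "not"), which occurs twice. The resulting counts could then serve as features for classifiers such as the CART and Naïve Bayes models mentioned in the abstract.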
Procedia PDF Downloads 73
55 Exploring Tweeters’ Concerns and Opinions about FIFA Arab Cup 2021: An Investigation Study
Authors: Md. Rafiul Biswas, Uzair Shah, Mohammad Alkayal, Zubair Shah, Othman Althawadi, Kamila Swart
Abstract:
Background: Social media platforms play a significant role in the mediated consumption of sport, especially so for sport mega-events. The characteristics of Twitter data (e.g., user mentions, retweets, likes, #hashtag) bring users together on common ground and spread information widely and quickly. Analysis of Twitter data can reflect public attitudes, behavior, and sentiment toward a specific event on a larger scale than traditional surveys. Qatar is going to be the first Arab country to host the mega sports event FIFA World Cup 2022 (Q22). Qatar hosted the FIFA Arab Cup 2021 (FAC21) as a preparation for the mega-event. Objectives: This study investigates public sentiments and experiences about FAC21 and provides insight to enhance the public experience for the upcoming Q22. Method: FAC21-related tweets were downloaded using the Twitter Academic Research API between 01 October 2021 and 18 February 2022. Tweets were divided into three periods: before the FAC21, T1 (01 Oct 2021 to 29 Nov 2021); during, T2 (30 Nov 2021 to 18 Dec 2021); and after, T3 (19 Dec 2021 to 18 Feb 2022). The collected tweets were preprocessed in several steps to prepare for analysis: (1) duplicates and retweets were removed, (2) emojis, punctuation, and stop words were removed, and (3) tweets were normalized using word lemmatization. Then, rule-based classification was applied to remove irrelevant tweets. Next, the twitter-XLM-roBERTa-base model from Hugging Face was applied to identify the sentiment in the tweets. Further, state-of-the-art BERTopic modeling will be applied to identify trending topics over different periods. Results: We downloaded 8,669,875 tweets posted by 2,728,220 unique users in different languages. Of those, 819,813 unique English tweets were selected in this study. After splitting into three periods, 541,630, 138,876, and 139,307 tweets were from T1, T2, and T3, respectively. Most of the sentiments were neutral, around 60% in each period. However, the rate of negative sentiment (23%) was high compared to positive sentiment (18%). The analysis indicates negative concerns about FAC21. Therefore, we will apply BERTopic to identify public concerns. This study will permit the investigation of people’s expectations before FAC21 (e.g., stadium, transportation, accommodation, visa, tickets, travel, and other facilities) and ascertain whether these were met. Moreover, it will highlight public expectations and concerns. The findings of this study can assist the event organizers in enhancing implementation plans for Q22. Furthermore, this study can support policymakers with aligning strategies and plans to leverage outstanding outcomes.
Keywords: FIFA Arab Cup, FIFA, Twitter, machine learning
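The first two preprocessing steps listed in this abstract (dropping duplicates and retweets, then removing emojis, punctuation, and stop words) can be sketched as follows. The sketch rests on several assumptions that the abstract does not confirm: retweets are detected by a leading "RT @", emojis are removed by discarding all non-ASCII characters (reasonable only for English tweets), and the stop-word list is a tiny illustrative one. Lemmatization (step 3) is omitted because it needs an external library. All names and sample tweets are hypothetical.

```python
import string

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "is", "to", "in", "and", "of", "for"}

# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean(text):
    """Lowercase, drop non-ASCII characters (emojis), punctuation, and stop words."""
    text = text.lower()
    text = text.encode("ascii", "ignore").decode("ascii")  # crude emoji removal
    text = text.translate(PUNCT_TABLE)
    return " ".join(word for word in text.split() if word not in STOP_WORDS)


def preprocess(tweets):
    """Step 1: skip retweets and exact duplicates. Step 2: clean what remains."""
    seen = set()
    cleaned = []
    for tweet in tweets:
        if tweet.startswith("RT @") or tweet in seen:
            continue
        seen.add(tweet)
        result = clean(tweet)
        if result:  # ignore tweets that become empty after cleaning
            cleaned.append(result)
    return cleaned


# Hypothetical raw tweets for illustration only.
raw_tweets = [
    "The stadium is amazing! 🏟️ #FAC21",
    "The stadium is amazing! 🏟️ #FAC21",
    "RT @fan: The stadium is amazing!",
    "Tickets for the final, please?",
]
cleaned_tweets = preprocess(raw_tweets)
```

Removing punctuation also strips the `#` and `@` markers, so hashtags survive only as plain words; whether that is desirable depends on the later relevance filter and sentiment model.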
Procedia PDF Downloads 100
54 StockTwits Sentiment Analysis on Stock Price Prediction
Authors: Min Chen, Rubi Gupta
Abstract:
Understanding and predicting stock market movements is a challenging problem. It is believed stock markets are partially driven by public sentiments, which leads to numerous research efforts to predict stock market trend using public sentiments expressed on social media such as Twitter but with limited success. Recently a microblogging website StockTwits is becoming increasingly popular for users to share their discussions and sentiments about stocks and financial market. In this project, we analyze the text content of StockTwits tweets and extract financial sentiment using text featurization and machine learning algorithms. StockTwits tweets are first pre-processed using techniques including stopword removal, special character removal, and case normalization to remove noise. Features are extracted from these preprocessed tweets through text featurization process using bags of words, N-gram models, TF-IDF (term frequency-inverse document frequency), and latent semantic analysis. Machine learning models are then trained to classify the tweets' sentiment as positive (bullish) or negative (bearish). The correlation between the aggregated daily sentiment and daily stock price movement is then investigated using Pearson’s correlation coefficient. Finally, the sentiment information is applied together with time series stock data to predict stock price movement. The experiments on five companies (Apple, Amazon, General Electric, Microsoft, and Target) in a duration of nine months demonstrate the effectiveness of our study in improving the prediction accuracy.Keywords: machine learning, sentiment analysis, stock price prediction, tweet processing
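The correlation step in this abstract, which relates aggregated daily sentiment to daily price movement through Pearson’s correlation coefficient, can be sketched as below. The function is a plain implementation of the standard formula; the two short series are invented for illustration and do not come from the study.

```python
import math


def pearson(xs, ys):
    """Pearson's r between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Unnormalized standard deviations; the 1/n factors cancel in the ratio.
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)


# Hypothetical daily values: mean bullish-minus-bearish sentiment and
# percentage price change for the same five trading days.
daily_sentiment = [0.2, -0.1, 0.4, 0.0, -0.3]
daily_return = [0.5, -0.2, 0.9, 0.1, -0.4]

r = pearson(daily_sentiment, daily_return)
```

A value of `r` close to +1 or -1 indicates a strong linear relationship between sentiment and price movement, and a value near 0 indicates little linear relationship. Correlation alone does not establish that sentiment predicts price.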
Procedia PDF Downloads 156
53 The Usage of Negative Emotive Words in Twitter
Authors: Martina Katalin Szabó, István Üveges
Abstract:
In this paper, the usage of negative emotive words is examined on the basis of a large Hungarian Twitter database via NLP methods. The data is analysed from a gender point of view, as well as for changes in language usage over time. The term negative emotive word refers to those words that, on their own, without context, have semantic content that can be associated with negative emotion, but in particular cases, they may function as intensifiers (e.g. rohadt jó ’damn good’) or as a sentiment expression with positive polarity despite their negative prior polarity (e.g. brutális, ahogy ez a férfi rajzol ’it’s awesome (lit. brutal) how this guy draws’). Based on the findings of several authors, the same phenomenon can be found in other languages, so it is probably a language-independent feature. For the present analysis, 67783 tweets were collected: 37818 tweets (19580 tweets written by females and 18238 tweets written by males) in 2016 and 48344 (18379 tweets written by females and 29965 tweets written by males) in 2021. The goal of the research was to build two datasets that are comparable with respect to semantic change as well as gender-specific features. An exhaustive lexicon of Hungarian negative emotive intensifiers was also compiled (containing 214 words). After basic preprocessing steps, tweets were processed by ‘magyarlanc’, a toolkit written in Java for the linguistic processing of Hungarian texts. Then, the frequency and collocation features of all these words in our corpus were automatically analyzed (via the analysis of parts-of-speech and sentiment values of the co-occurring words). Finally, the results of all four subcorpora were compared. Some of the main outcomes of our analyses are as follows. Cases in which a negative emotive intensifier modified a negative polarity word in the tweet (e.g., damn bad) were almost four times less frequent in the male corpus than in the female corpus.
At the same time, male authors more frequently used these intensifiers to modify a positive polarity or neutral word (e.g., damn good and damn big). Results also showed that, in contrast to female authors, male authors used these words much more frequently as positive polarity words in their own right (e.g., brutális, ahogy ez a férfi rajzol ’it’s awesome (lit. brutal) how this guy draws’). We also observed that male authors use significantly fewer types of emotive intensifiers than female authors, and the frequency proportion of the words is more balanced in the female corpus. As for changes in language usage over time, some notable differences in the frequency and collocation features of the words examined were identified: some of the words collocate with more positive words in the second subcorpus than in the first, which points to a semantic change of these words over time.
Keywords: gender differences, negative emotive words, semantic changes over time, twitter
Procedia PDF Downloads 205
52 Bidirectional Encoder Representations from Transformers Sentiment Analysis Applied to Three Presidential Pre-Candidates in Costa Rica
Authors: Félix David Suárez Bonilla
Abstract:
A sentiment analysis service to detect polarity (positive, neutral, and negative), based on transfer learning, was built using a Spanish version of BERT and applied to tweets written in Spanish. The dataset consisted of 11975 reviews, which were extracted from Google Play using the google-play-scrapper package. The BETO model was trained with the AdamW optimizer, a batch size of 16, a learning rate of 2×10⁻⁵, and 10 epochs. The system was tested using tweets of three presidential pre-candidates from Costa Rica. Finally, the system was validated using human-labeled examples, achieving an accuracy of 83.3%.
Keywords: NLP, transfer learning, BERT, sentiment analysis, social media, opinion mining
Procedia PDF Downloads 173
51 Topic Sentiments toward the COVID-19 Vaccine on Twitter
Authors: Melissa Vang, Raheyma Khan, Haihua Chen
Abstract:
The coronavirus disease 2019 (COVID‐19) pandemic has changed the lives of people all over the world. More people have turned to Twitter to engage online and discuss the COVID-19 vaccine. This study presents a text mining approach to identify people's attitudes towards the COVID-19 vaccine on Twitter. To this end, we collected 54,268 COVID-19 vaccine tweets from September 01, 2020, to November 01, 2020, and then used the BERT model for sentiment and topic analysis. The results show that people had more negative than positive attitudes about the vaccine, and countries with an increasing number of confirmed cases had a higher percentage of negative attitudes. Additionally, the topics discussed in positive and negative tweets are different. The tweet datasets can help information professionals inform the public about vaccine-related informational resources. Our findings may have implications for understanding people's cognitions and feelings about the vaccine.
Keywords: BERT, COVID-19 vaccine, sentiment analysis, topic modeling
Procedia PDF Downloads 149
50 Tracing Digital Traces of Phatic Communion in #Mooc
Authors: Judith Enriquez-Gibson
Abstract:
This paper meddles with the notion of phatic communion introduced 90 years ago by Malinowski, who was a Polish-born British anthropologist. It explores the phatic in Twitter within the contents of tweets related to moocs (massive open online courses) as a topic or trend. It is not about moocs though. It is about practices that could easily be hidden or neglected if we let big or massive topics take the lead or if we simply follow the computational or secret codes behind Twitter itself and third-party software analytics. It draws from media and cultural studies. Though at first it appears data-driven, as I submitted data collection and analytics into the hands of a third-party software, Twitonomy, the aim is to follow how phatic communion might be practised in a social media site, such as Twitter. Lurking becomes its research method to analyse mooc-related tweets. A total of 3,000 tweets were collected on 11 October 2013 (UK timezone). The emphasis of lurking is to engage with Twitter as a system of connectivity. One interesting finding is that a click is in fact a phatic practice. A click breaks the silence. A click on one of the mooc websites is actually a tweet. A tweet was posted on behalf of a user who simply chose to click without formulating the text and perhaps without knowing that it contains #mooc. Surely, this mechanism is not about reciprocity. To break the silence, users did not use words. They just clicked the ‘tweet button’ on a mooc website. A click performs and maintains connectivity – and Twitter as the medium in attendance in our everyday, available when needed to be of service. In conclusion, the phatic culture of breaking silence in Twitter does not have to submit to the power of code and analytics. It is a matter of human code.
Keywords: click, Twitter, phatic communion, social media data, mooc
Procedia PDF Downloads 411
49 Analysing Social Media Coverage of Political Speeches in Relation to Discourse and Context
Authors: Yaser Mohammed Altameemi
Abstract:
This research looks at how social media represents the Saudi Government decrees regarding the developmental projects of the Saudi 2030 vision. The paper analyses a television interview with Crown Prince Mohammed Bin Salman, who talks about the progress of the Saudi vision of 2030 and how the government acted in response to the COVID-19 pandemic. The interview took place on 28/4/2021. The paper analyses the tweets on Twitter that cover the interview in order to investigate the development of concepts and meanings regarding the Saudi people’s orientations towards the Saudi projects. The data include all related tweets from the day of the interview and the seven days following it. The collocation analysis suggests that the notion of nationalism is explicitly expressed by Twitter users. The main finding of this paper points to the importance of further analysis of the concordance lines. Nevertheless, the collocation network shows a clear emphasis on nationalism.
Keywords: social media, twitter, political interview, prince Mohammed Bin Salman, Saudi vision 2030
Procedia PDF Downloads 190
48 Sentiment Mapping through Social Media and Its Implications
Authors: G. C. Joshi, M. Paul, B. K. Kalita, V. Ranga, J. S. Rawat, P. S. Rawat
Abstract:
As part of the global village, every place has established connections through the strength and power of social media, which cuts across political boundaries. Social media is a digital platform where people across the world can interact, as it has the advantages of being universal, anonymous, and easily accessible, and of allowing indirect interaction and the gathering and sharing of information. The power of social media lies in the intensity of sharing extreme opinions or feelings, in contrast to personal interactions, and these can be easily mapped in the form of Sentiment Mapping. Easy access to social networking sites such as Facebook, Twitter and blogs has created unprecedented opportunities for citizens to voice their opinions loaded with dynamics of emotions. These further influence human thoughts, where social media plays a very active role. A recent incident of public importance was selected as a case study to map the sentiments of people through Twitter. Understanding those dynamics through the eyes of ordinary people can be challenging. With the help of the R programming language and GIS techniques, sentiment maps were produced. The emotions flowing worldwide in the form of tweets were extracted and analyzed. The number of tweets diminished by 91% from 25/08/2017 to 31/08/2017. A boom of sentiments emerged near the origin of the case, i.e., Delhi, Haryana and Punjab, and the capital showed maximum influence, resulting in a spillover effect near Delhi. After calculating the sentiment scores of the tweets, the prevailing sentiments were neutral (45.37%), negative (28.6%) and positive (21.6%). The result can be used to understand the spatial distribution of digital penetration in India, where the highest concentration lies in Mumbai and the lowest in North East India and Jammu and Kashmir.
Keywords: sentiment mapping, digital literacy, GIS, R statistical language, spatio-temporal
Procedia PDF Downloads 151
47 Factors Promoting French-English Tweets in France
Authors: Taoues Hadour
Abstract:
Twitter has become a popular means of communication used in a variety of fields, such as politics, journalism, and academia. This widely used online platform has an impact on the way people express themselves and is changing language usage worldwide at an unprecedented pace. The language used online reflects the linguistic battle that has been going on for several decades in French society. This study enables a deeper understanding of users' linguistic behavior online. The implications are important and allow for a rise in awareness of intercultural and cross-language exchanges. This project investigates the mixing of French and English among French users of Twitter using a topic analysis approach. This analysis draws on Gumperz's theory of conversational switching. In order to collect tweets at a large scale, the data was collected in R, within the RStudio integrated development environment, using the rtweet package to access and retrieve French tweets through Twitter’s REST and stream APIs (Application Programming Interfaces). The dataset was filtered manually, and certain recurring themes were observed. A total of nine topic categories were identified and analyzed in this study: entertainment, internet/social media, events/community, politics/news, sports, sex/pornography, innovation/technology, fashion/make up, and business. The study reveals that entertainment is the most frequent topic discussed on Twitter. Entertainment includes movies, music, games, and books. Anglicisms such as trailer, spoil, and live are identified in the data. Change in language usage is inevitable and is a natural result of linguistic interactions. The use of different languages online is just an example of what the real world would look like without linguistic regulations.
Social media reveals a multicultural and multilinguistic richness which can deepen and expand our understanding of contemporary human attitudes.
Keywords: code-switching, French, sociolinguistics, Twitter
Procedia PDF Downloads 137
46 Impact of Hashtags in Tweets Regarding COVID-19 on the Psyche of Pakistanis: A Critical Discourse Analytical Study
Authors: Muhammad Hamza
Abstract:
This study attempts to analyze the social media reports regarding Covid-19 that impacted the psyche of Pakistanis. The study is delimited to hashtags from tweets on a social media platform. It has been observed that Covid-19 affected the psychological condition of Pakistanis. Using the three-dimensional model presented by Fairclough, together with the data analytic software “FireAnt”, a social media and data analysis toolkit used to filter, identify, report and export data from social media accurately, a detailed and explicit exploration of the various hashtags used by users from different fields was conducted. The study used both quantitative and qualitative methods of analysis. It examined the perspectives of Pakistanis behind the use of various hashtags through the lens of Critical Discourse Analysis (CDA). CDA helped to reveal the connection between the psyche of the people and the Covid-19 pandemic. The study showed how different Pakistanis used social media and how Covid-19 impacted their psyche. After collecting and analyzing the hashtags from Twitter, it was concluded that the majority of people were negatively affected by social media reports, while some people used their hashtags positively and remained positive during Covid-19, and some people were found to be neutral.
Keywords: Covid, Covid-19, psyche, Covid Pakistan
Procedia PDF Downloads 58
45 An Approach for Pattern Recognition and Prediction of Information Diffusion Model on Twitter
Authors: Amartya Hatua, Trung Nguyen, Andrew Sung
Abstract:
In this paper, we study the information diffusion process on Twitter as a multivariate time series problem. Our model concerns three measures (volume, network influence, and sentiment of tweets) based on 10 features, and we collected 27 million tweets to build our information diffusion time series dataset for analysis. Then, different time series clustering techniques with Dynamic Time Warping (DTW) distance were used to identify different patterns of information diffusion. Finally, we built information diffusion prediction models for new hashtags, which comprise two phases: the first phase is recognizing the pattern using k-NN with DTW distance; the second phase is building the forecasting model using the traditional Autoregressive Integrated Moving Average (ARIMA) model and the non-linear recurrent neural network of Long Short-Term Memory (LSTM). Preliminary results of performance evaluation between different forecasting models show that LSTM with clustering information notably outperforms other models. Therefore, our approach can be applied in real-world applications to analyze and predict the information diffusion characteristics of selected topics or memes (hashtags) on Twitter.
Keywords: ARIMA, DTW, information diffusion, LSTM, RNN, time series clustering, time series forecasting, Twitter
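The pattern-recognition step (k-NN with DTW distance) can be illustrated with a small, self-contained Python sketch. It is a simplified, univariate version written for illustration only. The paper works with multivariate series, and the function names, pattern labels, and sample series below are all hypothetical.

```python
from collections import Counter


def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] matched again (stretch b)
                cost[i][j - 1],      # b[j-1] matched again (stretch a)
                cost[i - 1][j - 1],  # advance both series
            )
    return cost[n][m]


def knn_pattern(series, labelled, k=1):
    """Return the majority label among the k DTW-nearest labelled series."""
    ranked = sorted(labelled, key=lambda item: dtw_distance(series, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical hourly tweet volumes for already-clustered hashtags.
labelled = [
    ([0, 1, 5, 1, 0], "spike"),
    ([0, 0, 6, 0, 0], "spike"),
    ([1, 2, 3, 4, 5], "growth"),
]
new_hashtag = [0, 0, 1, 5, 1, 0]  # same shape as a spike, shifted in time
print(knn_pattern(new_hashtag, labelled, k=1))
```

DTW suits this task because two hashtags can follow the same diffusion shape at different speeds or with a time offset, which a point-by-point Euclidean distance would penalize.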
Procedia PDF Downloads 390
44 Exploring Public Opinions Toward the Use of Generative Artificial Intelligence Chatbot in Higher Education: An Insight from Topic Modelling and Sentiment Analysis
Authors: Samer Muthana Sarsam, Abdul Samad Shibghatullah, Chit Su Mon, Abd Aziz Alias, Hosam Al-Samarraie
Abstract:
Generative Artificial Intelligence chatbots (GAI chatbots) have emerged as promising tools in various domains, including higher education. However, their specific role within the educational context and the level of legal support for their implementation remain unclear. Therefore, this study aims to investigate the role of Bard, a newly developed GAI chatbot, in higher education. To achieve this objective, English tweets were collected from Twitter's free streaming Application Programming Interface (API). The Latent Dirichlet Allocation (LDA) algorithm was applied to extract latent topics from the collected tweets. User sentiments, including disgust, surprise, sadness, anger, fear, joy, anticipation, and trust, as well as positive and negative sentiments, were extracted using the NRC Affect Intensity Lexicon and SentiStrength tools. This study explored the benefits, challenges, and future implications of integrating GAI chatbots in higher education. The findings shed light on the potential power of such tools, exemplified by Bard, in enhancing the learning process and providing support to students throughout their educational journey.
Keywords: generative artificial intelligence chatbots, bard, higher education, topic modelling, sentiment analysis
Procedia PDF Downloads 83