Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 53

Search results for: tweets

53 Short Text Classification for Saudi Tweets

Authors: Asma A. Alsufyani, Maram A. Alharthi, Maha J. Althobaiti, Manal S. Alharthi, Huda Rizq


Twitter is one of the most popular microblogging sites that allows users to publish short text messages called 'tweets'. Increasing the number of accounts to follow (followings) increases the number of tweets that will be displayed from different topics in an unclassified manner in the timeline of the user. Therefore, it can be a vital solution for many Twitter users to have their tweets in a timeline classified into general categories to save the user’s time and to provide easy and quick access to tweets based on topics. In this paper, we developed a classifier for timeline tweets trained on a dataset consisting of 3600 tweets in total, which were collected from Saudi Twitter and annotated manually. We experimented with the well-known Bag-of-Words approach to text classification, and we used support vector machines (SVM) in the training process. The trained classifier performed well on a test dataset, with an average F1-measure equal to 92.3%. The classifier has been integrated into an application, which practically proved the classifier’s ability to classify timeline tweets of the user.
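
The abstract names a Bag-of-Words representation and an average F1-measure but gives no implementation details. The following is a minimal Python sketch of those two pieces only, assuming whitespace tokenization and macro-averaging over classes; the function names, the vocabulary and the labels are hypothetical, and the SVM training step is left out because it would need an external library.

```python
from collections import Counter


def bag_of_words(text, vocabulary):
    """Map a text to a vector of word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def f1_for_label(true_labels, predicted_labels, label):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(true_labels, predicted_labels):
    """Unweighted average of the per-class F1 scores (an assumed averaging scheme)."""
    labels = sorted(set(true_labels))
    scores = [f1_for_label(true_labels, predicted_labels, label) for label in labels]
    return sum(scores) / len(scores)
```

For example, `bag_of_words("Goal goal match", ["goal", "match", "vote"])` counts each vocabulary word in the tweet, and `macro_f1` can then be applied to the gold and predicted categories of a held-out test set.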

Keywords: corpus creation, feature extraction, machine learning, short text classification, social media, support vector machine, Twitter

Procedia PDF Downloads 47
52 An Enhanced Support Vector Machine Based Approach for Sentiment Classification of Arabic Tweets of Different Dialects

Authors: Gehad S. Kaseb, Mona F. Ahmed


Arabic Sentiment Analysis (SA) is one of the most common research fields with many open areas. Few studies apply SA to Arabic dialects. This paper proposes different pre-processing steps and a modified methodology to improve the accuracy using normal Support Vector Machine (SVM) classification. The paper works on two datasets, Arabic Sentiment Tweets Dataset (ASTD) and Extended Arabic Tweets Sentiment Dataset (Extended-AATSD), which are publicly available for academic use. The results show that the classification accuracy approaches 86%.

Keywords: Arabic, classification, sentiment analysis, tweets

Procedia PDF Downloads 56
51 The Paralinguistic Function of Emojis in Twitter Communication

Authors: Yasmin Tantawi, Mary Beth Rosson


In response to the dearth of information about emoji use for different purposes in different settings, this paper investigates the paralinguistic function of emojis within Twitter communication in the United States. To conduct this investigation, the Twitter feeds from 16 population centers spread throughout the United States were collected from the Twitter public API. One hundred tweets were collected from each population center, totaling 1,600 tweets. Tweets containing emojis were next extracted using the “emot” Python package; these were then analyzed via the IBM Watson API Natural Language Understanding module to identify the topics discussed. A manual content analysis was then conducted to ascertain the paralinguistic and emotional features of the emojis used in these tweets. We present our characterization of emoji usage in Twitter and discuss implications for the design of Twitter and other text-based communication tools.

Keywords: computer-mediated communication, content analysis, paralinguistics, sociology

Procedia PDF Downloads 95
50 Emotions in Health Tweets: Analysis of American Government Official Accounts

Authors: García López


Government departments of health have the task of informing and educating citizens about public health issues. To do so, they use channels such as Twitter, which plays a key role in the search for health information and the propagation of content. Tweets, which are important to the virality of content, may carry emotions that influence the contagion and exchange of knowledge. The goal of this study is to analyze the emotional projection of health information shared on Twitter by official American accounts: the disease control account CDCgov, the National Institutes of Health account NIH, the government agency account HHSGov, and the professional organization account PublicHealth. For this, we used Tone Analyzer, an International Business Machines Corporation (IBM) tool specialized in detecting emotion in text, which corresponds to the categorical model of emotion representation. For 15 days, all tweets from these accounts were analyzed with this tool. The results showed that their tweets carry a substantial emotional load, a determining factor in the success of their communications. This shows that official accounts also use subjective language and express emotions. The predominance of joy over sadness and the strong presence of emotions in their tweets stimulate the virality of content, which is key to the informational work of government health departments.

Keywords: emotions in tweets, emotion detection in the text, health information on Twitter, American health official accounts, emotions on Twitter, emotions and content

Procedia PDF Downloads 65
49 Chinese “Wolf Warrior” Diplomacy And Foreign Public Opinion

Authors: Chaohong Pan


Through public diplomacy on social media, governments have attempted to influence foreign public opinion. What is the impact of digital public diplomacy? Public diplomacy research often relies on content analysis to study the strategies employed by communicators but has rarely examined its actual impact on the audience. In addition, we do not know if giving a communicator an explicit label, as Twitter does with “government account”, would change the effects of the messages. Can the government label reduce the persuasiveness of public diplomacy messages by sending a warning signal? Using a 2 × 2 survey experiment, the present paper contributes to the study of public diplomacy by randomly exposing American participants to four types of tweets from Chinese diplomats. The stimulus materials vary in terms of the tweets’ content (“positive-China” vs. “negative-US”) and Twitter government labels (with vs. without the labels). I find that positive tweets about China have a significant positive effect on Americans’ attitudes toward China, whereas negative tweets about the US have little effect on their opinions. Furthermore, positive-China tweets are effective only on China-related issues, which indicates that Chinese diplomats’ tweets have limited effects on shaping a foreign audience’s attitudes toward their own country. Lastly, I find that labels largely have no impact on a diplomatic tweet’s effect. These results contribute to our understanding of the effects of public diplomacy in the digital age.

Keywords: public diplomacy, china, foreign public opinion, twitter

Procedia PDF Downloads 67
48 Social Media Mining with R. Twitter Analyses

Authors: Diana Codat


Tweet analysis is part of text mining. Each document is a written text, so the usual text search techniques can be applied, in particular by switching to the bag-of-words representation. But tweets have peculiarities. Some may enrich the analysis: their length is calibrated (at least as far as public messages are concerned), special characters make it possible to identify authors (@) and themes (#), and the tweet and retweet mechanisms make it possible to follow the diffusion of information. Conversely, other characteristics may disrupt the analyses. Because space is limited, authors often use abbreviations and emoticons to express feelings, and they do not pay much attention to spelling. All this creates noise that can complicate the task. Tweets carry a lot of potentially interesting information, and their exploitation is one of the main axes of social network analysis. We show how to access Twitter-related messages. We initiate a study of the properties of the tweets and follow up with the exploitation of the content of the messages. We work in R with the package 'twitteR'. The study of tweets is a strong focus of social network analysis because Twitter has become an important vector of communication. This example shows that it is easy to initiate an analysis from data extracted directly online. The data preparation phase is of great importance.
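
The role of the special characters described above can be illustrated with a short sketch. The paper itself works in R with 'twitteR'; the Python below is only an illustration, and the function name, the regular expressions and the "RT @" retweet prefix convention are assumptions rather than anything taken from the paper.

```python
import re

# Simplified patterns: a marker character followed by word characters.
# Real tweets have edge cases (e.g. e-mail addresses) that these ignore.
MENTION_PATTERN = re.compile(r"@(\w+)")
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def extract_entities(tweet):
    """Pull authors (@), themes (#) and a retweet flag out of raw tweet text."""
    return {
        "mentions": MENTION_PATTERN.findall(tweet),
        "hashtags": HASHTAG_PATTERN.findall(tweet),
        # Assumes the classic textual retweet convention "RT @user: ...".
        "is_retweet": tweet.startswith("RT @"),
    }
```

Applied to every message in a collection, such a function yields the author and theme fields from which diffusion and topic counts can be computed during data preparation.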

Keywords: data mining, language R, social networks, Twitter

Procedia PDF Downloads 95
47 Syndromic Surveillance Framework Using Tweets Data Analytics

Authors: David Ming Liu, Benjamin Hirsch, Bashir Aden


Syndromic surveillance aims to detect or predict disease outbreaks through the analysis of medical data sources. Using social media data such as tweets for syndromic surveillance is becoming more and more popular, aided by open platforms for collecting data and by the advantages of microblogging text and mobile geographic location features. In this paper, a Syndromic Surveillance Framework with a machine learning kernel is presented using tweets data analytics. Influenza and three cities of the United Arab Emirates (Abu Dhabi, Al Ain and Dubai) are used as the test disease and trial areas. Hospital case data provided by the Health Authority of Abu Dhabi (HAAD) are used for correlation purposes. In our model, a Latent Dirichlet allocation (LDA) engine is adapted to perform supervised classification, and N-fold cross-validation confusion matrices are given as the simulation results, with an overall system recall of 85.595% achieved.

Keywords: Syndromic surveillance, Tweets, Machine Learning, data mining, Latent Dirichlet allocation (LDA), Influenza

Procedia PDF Downloads 45
46 An Investigation of Sentiment and Themes from Twitter for Brexit in 2016

Authors: Anas Alsuhaibani


Observing debate and discussion over social media has been found to be a promising tool to investigate different types of opinion. On 23 June 2016, Brexit voters in the UK decided to depart from the EU, with 51.9% voting to leave. On Twitter, there had been a massive debate in this context, and the hashtag Brexit ranked sixth among the most tweeted hashtags across the globe in 2016. The study aimed to investigate the sentiment and themes expressed in a sample of tweets during a political event (Brexit) in 2016. A sentiment and thematic analysis was conducted on 1304 randomly selected tweets tagged with the hashtag Brexit on Twitter for the period from 10 June 2016 to 7 July 2016. The data were coded manually into two code frames, sentiment and thematic, and the reliability of coding was assessed for both codes. The sentiment analysis of the selected sample found that 45.63% of tweets conveyed negative emotions, while only 10.43% conveyed positive emotions. Surprisingly, 29.37% were factual tweets, where the tweeter expressed no sentiment and the tweet conveyed a fact. For the thematic analysis, the economic theme dominated with 23.41%, and almost half of its discussion was related to business within the UK and the UK and global stock markets. The study reported that the current UK government and relation to campaign themes were the most negative themes. Both sentiment and thematic analyses found that tweets with more than one opinion or theme were rare, 8.29% and 6.13%, respectively.

Keywords: Brexit, political opinion mining, social media, twitter

Procedia PDF Downloads 80
45 Collision Theory Based Sentiment Detection Using Discourse Analysis in Hadoop

Authors: Anuta Mukherjee, Saswati Mukherjee


Data is growing every day. Social networking sites such as Twitter are becoming an integral part of our daily lives, contributing to a large increase in the growth of data. Twitter is a rich source, especially for sentiment detection or mining, since people often express honest opinions through tweets. However, although sentiment analysis is a well-researched topic in text, this analysis using Twitter data poses additional challenges since these are unstructured data with abbreviations and without strict grammatical correctness. We have employed collision theory to achieve sentiment analysis in Twitter data. We have also incorporated discourse analysis in the collision theory based model to detect accurate sentiment from tweets. We have also used the retweet field to assign weights to certain tweets and obtained the overall weightage of a topic provided in the form of a query. Hadoop has been exploited for speed. Our experiments show effective results.

Keywords: sentiment analysis, twitter, collision theory, discourse analysis

Procedia PDF Downloads 411
44 Automatic Lexicon Generation for Domain Specific Dataset for Mining Public Opinion on China Pakistan Economic Corridor

Authors: Tayyaba Azim, Bibi Amina


The increase in the popularity of opinion mining with the rapid growth in the availability of social networks has attracted a lot of opportunities for research in the various domains of Sentiment Analysis and Natural Language Processing (NLP) using Artificial Intelligence approaches. The latest trend allows the public to actively use the internet for analyzing an individual’s opinion and explore the effectiveness of published facts. The main aim of this research is to account for public opinion on one of the most crucial and extensively discussed development projects, the China Pakistan Economic Corridor (CPEC), considered a game changer due to its promise of bringing economic prosperity to the region. So far, to the best of our knowledge, the theme of CPEC has not been analyzed for sentiment determination through the ML approach. This research aims to demonstrate the use of ML approaches to spontaneously analyze public sentiment in tweets, particularly about CPEC. A Support Vector Machine (SVM) is used for the classification task, classifying tweets into positive, negative and neutral classes. Word2vec and TF-IDF features are used with the SVM model, and a comparison of the model trained on manually labelled tweets and on the automatically generated lexicon is performed. The contributions of this work are: development of a sentiment analysis system for public tweets on the CPEC subject, automatic generation of a lexicon of public tweets on CPEC, and identification of different themes among tweets, with sentiments assigned to each theme. It is worth noting that the applications of web mining that empower e-democracy by improving political transparency and public participation in decision making via social media have not yet been explored and practised in the Pakistan region on CPEC.

Keywords: machine learning, natural language processing, sentiment analysis, support vector machine, Word2vec

Procedia PDF Downloads 69
43 Composite Kernels for Public Emotion Recognition from Twitter

Authors: Chien-Hung Chen, Yan-Chun Hsing, Yung-Chun Chang


The Internet has grown into a powerful medium for information dispersion and social interaction that leads to a rapid growth of social media which allows users to easily post their emotions and perspectives regarding certain topics online. Our research aims at using natural language processing and text mining techniques to explore the public emotions expressed on Twitter by analyzing the sentiment behind tweets. In this paper, we propose a composite kernel method that integrates tree kernel with the linear kernel to simultaneously exploit both the tree representation and the distributed emotion keyword representation to analyze the syntactic and content information in tweets. The experiment results demonstrate that our method can effectively detect public emotion of tweets while outperforming the other compared methods.

Keywords: emotion recognition, natural language processing, composite kernel, sentiment analysis, text mining

Procedia PDF Downloads 88
42 Twitter Sentiment Analysis during the Lockdown on New-Zealand

Authors: Smah Almotiri


One of the most common fields of natural language processing (NLP) is sentiment analysis. The inferred feeling in the text can be successfully mined for various events using sentiment analysis. Twitter is viewed as a reliable data point for sentiment analysis studies since people are using social media to receive and exchange different types of data on a broad scale during the COVID-19 epidemic. The processing of such data may aid in making critical decisions on how to keep the situation under control. The aim of this research is to look at how sentiment differed in a single geographic region during the lockdown at two different times. A total of 1162 tweets related to the COVID-19 pandemic lockdown were analyzed using keyword hashtags (lockdown, COVID-19). The first sample of tweets was from March 23, 2020, until April 23, 2020, and the second sample for the following year was from March 1, 2020, until April 4, 2020. Natural language processing (NLP), which is a form of artificial intelligence, was used for this research to calculate the sentiment value of all of the tweets by using the AFINN Lexicon sentiment analysis method. The findings revealed that sentiment at both times during the region's lockdown was positive in the samples of this study, which are unique to the specific geographical area of New Zealand. This research suggests applying machine learning sentiment methods such as Crystal Feel and extending the size of the tweet sample by using multiple tweets over a longer period of time.
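
Lexicon-based scoring of the kind AFINN supports can be sketched in a few lines. AFINN assigns each listed word an integer valence from -5 to +5, and a text's score is the sum over its words. The sketch below is in Python rather than the NodeJS listed in the keywords, and the tiny word list, its values and the function names are made up for illustration; they are not the real AFINN entries or the study's code.

```python
import re

# Hypothetical toy lexicon in the AFINN style (integer valence per word).
# These entries and values are illustrative only, not the real AFINN list.
TOY_LEXICON = {
    "good": 3, "happy": 3, "safe": 1,
    "bad": -3, "worried": -3, "sad": -2,
}


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Sum the valence of every lexicon word found in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


def sentiment_label(score):
    """Collapse a numeric score into positive, negative or neutral."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Scoring each tweet in the two samples and comparing the averages would give the kind of per-period comparison the abstract describes.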

Keywords: sentiment analysis, Twitter analysis, lockdown, Covid-19, AFINN, NodeJS

Procedia PDF Downloads 75
41 Historical Hashtags: An Investigation of the #CometLanding Tweets

Authors: Noor Farizah Ibrahim, Christopher Durugbo


This study aims to investigate how the Twittersphere reacted during the recent historical event of a robotic landing on a comet. The news is about Philae, a robotic lander from the European Space Agency (ESA), which successfully made the first-ever rendezvous and touchdown of its kind on a comet nucleus on November 12, 2014. In order to understand how Twitter is practically used in spreading messages on historical events, we conducted an analysis of one week of tweet feeds that contain the #CometLanding hashtag. We studied the trends of tweets, the diffusion of the information and the characteristics of the social network created. The results indicated that the use of Twitter as a platform enables online communities to engage and spread the historical event through the social media network (e.g. tweets, retweets, mentions and replies). In addition, it was found that comprehensible hashtags could influence users to follow the same tweet stream, compared to other laborious hashtags which were difficult for users in online communities to understand.

Keywords: diffusion of information, hashtag, social media, Twitter

Procedia PDF Downloads 250
40 Semantic Network Analysis of the Saudi Women Driving Decree

Authors: Dania Aljouhi


September 26th, 2017, is a historic date for all women in Saudi Arabia. On that day, Saudi Arabia announced the decree allowing Saudi women to drive. With the advent of Vision 2030 and its goal to empower women and increase their participation in Saudi society, we see how Saudi Twitter users deliberate the 2017 decree from different social, cultural, religious, economic and political perspectives. This topic bridges social media (Twitter), gender and social-cultural studies to offer insights into how Saudis’ tweets reflect a broader discourse on Saudi women in the age of social media. The present study aims to explore the meanings and themes that emerge among Saudi Twitter users in response to the 2017 royal decree on women driving. The sample used in the current study involves tweets (n = 1000) that were collected from Sep 2017 to March 2019 to account for Saudis’ tweets before and after the decree was implemented. The paper uses semantic and thematic network analysis methods to examine the Saudi Twitter discourse on the women driving issue. The paper argues that Twitter as a platform has mediated the discourse of women driving among the Saudi community and facilitated social changes. Finally, framing theory (Goffman, 1974) and Networked framing (Meraz & Papacharissi 2013) are both used to explain the tweets on the decree of allowing Saudi women to drive based on # Saudi women-driving-cars.

Keywords: Saudi Arabia, women, Twitter, semantic network analysis, framing

Procedia PDF Downloads 47
39 StockTwits Sentiment Analysis on Stock Price Prediction

Authors: Min Chen, Rubi Gupta


Understanding and predicting stock market movements is a challenging problem. It is believed that stock markets are partially driven by public sentiment, which has led to numerous research efforts to predict stock market trends using public sentiment expressed on social media such as Twitter, but with limited success. Recently, the microblogging website StockTwits has become increasingly popular for users to share their discussions and sentiments about stocks and the financial market. In this project, we analyze the text content of StockTwits tweets and extract financial sentiment using text featurization and machine learning algorithms. StockTwits tweets are first pre-processed using techniques including stopword removal, special character removal, and case normalization to remove noise. Features are extracted from these preprocessed tweets through a text featurization process using bags of words, N-gram models, TF-IDF (term frequency-inverse document frequency), and latent semantic analysis. Machine learning models are then trained to classify the tweets' sentiment as positive (bullish) or negative (bearish). The correlation between the aggregated daily sentiment and daily stock price movement is then investigated using Pearson’s correlation coefficient. Finally, the sentiment information is applied together with time series stock data to predict stock price movement. The experiments on five companies (Apple, Amazon, General Electric, Microsoft, and Target) over a period of nine months demonstrate the effectiveness of our study in improving the prediction accuracy.
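
The correlation step mentioned above can be made concrete with a small sketch of Pearson's correlation coefficient, written in plain Python. The function name and the two input series (one aggregated sentiment value and one price movement per day) are assumptions for illustration; the abstract does not say how the authors aggregated or computed them.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson's r between two equally long numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations and sums of squared deviations.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(spread_x * spread_y)
```

A value near +1 would mean that days with more bullish sentiment tend to coincide with larger price gains, a value near -1 the opposite, and a value near 0 no linear relationship.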

Keywords: machine learning, sentiment analysis, stock price prediction, tweet processing

Procedia PDF Downloads 57
38 The Usage of Negative Emotive Words in Twitter

Authors: Martina Katalin Szabó, István Üveges


In this paper, the usage of negative emotive words is examined on the basis of a large Hungarian Twitter database via NLP methods. The data is analysed from a gender point of view, as well as for changes in language usage over time. The term negative emotive word refers to those words that, on their own, without context, have semantic content that can be associated with negative emotion, but in particular cases, they may function as intensifiers (e.g. rohadt jó ’damn good’) or as a sentiment expression with positive polarity despite their negative prior polarity (e.g. brutális, ahogy ez a férfi rajzol ’it’s awesome (lit. brutal) how this guy draws’). Based on the findings of several authors, the same phenomenon can be found in other languages, so it is probably a language-independent feature. For this analysis, 67783 tweets were collected: 37818 tweets (19580 tweets written by females and 18238 tweets written by males) in 2016 and 48344 (18379 tweets written by females and 29965 tweets written by males) in 2021. The goal of the research was to build two datasets comparable from the viewpoint of semantic changes, as well as of gender specificities. An exhaustive lexicon of Hungarian negative emotive intensifiers was also compiled (containing 214 words). After basic preprocessing steps, tweets were processed by ‘magyarlanc’, a toolkit written in Java for the linguistic processing of Hungarian texts. Then, the frequency and collocation features of all these words in our corpus were automatically analyzed (via the analysis of parts-of-speech and sentiment values of the co-occurring words). Finally, the results of all four subcorpora were compared. Some of the main outcomes of our analyses are as follows. There are almost four times fewer cases in the male corpus than in the female corpus in which the negative emotive intensifier modified a negative polarity word in the tweet (e.g., damn bad). At the same time, male authors used these intensifiers more frequently to modify a positive polarity or a neutral word (e.g., damn good and damn big). Results also pointed out that, in contrast to female authors, male authors used these words much more frequently as a positive polarity word as well (e.g., brutális, ahogy ez a férfi rajzol ’it’s awesome (lit. brutal) how this guy draws’). We also observed that male authors use significantly fewer types of emotive intensifiers than female authors, and the frequency proportion of the words is more balanced in the female corpus. As for changes in language usage over time, some notable differences in the frequency and collocation features of the words examined were identified: some of the words collocate with more positive words in the 2nd subcorpora than in the 1st, which points to the semantic change of these words over time.

Keywords: gender differences, negative emotive words, semantic changes over time, twitter

Procedia PDF Downloads 105
37 Bidirectional Encoder Representations from Transformers Sentiment Analysis Applied to Three Presidential Pre-Candidates in Costa Rica

Authors: Félix David Suárez Bonilla


A sentiment analysis service to detect polarity (positive, neutral, and negative), based on transfer learning, was built using a Spanish version of BERT and applied to tweets written in Spanish. The dataset that was used consisted of 11975 reviews, which were extracted from Google Play using the google-play-scrapper package. The BETO model was trained using the AdamW optimizer, a batch size of 16, a learning rate of 2x10⁻⁵ and 10 epochs. The system was tested using tweets of three presidential pre-candidates from Costa Rica. The system was finally validated using human-labeled examples, achieving an accuracy of 83.3%.

Keywords: NLP, transfer learning, BERT, sentiment analysis, social media, opinion mining

Procedia PDF Downloads 44
36 Topic Sentiments toward the COVID-19 Vaccine on Twitter

Authors: Melissa Vang, Raheyma Khan, Haihua Chen


The coronavirus disease 2019 (COVID‐19) pandemic has changed the lives of people all over the world. More people have turned to Twitter to engage online and discuss the COVID-19 vaccine. This study aims to present a text mining approach to identify people's attitudes towards the COVID-19 vaccine on Twitter. To achieve this purpose, we collected 54,268 COVID-19 vaccine tweets from September 01, 2020, to November 01, 2020; the BERT model was then used for the sentiment and topic analysis. The results show that people had more negative than positive attitudes about the vaccine, and countries with an increasing number of confirmed cases had a higher percentage of negative attitudes. Additionally, the topics discussed in positive and negative tweets are different. The tweet datasets can be helpful to information professionals to inform the public about vaccine-related informational resources. Our findings may have implications for understanding people's cognitions and feelings about the vaccine.

Keywords: BERT, COVID-19 vaccine, sentiment analysis, topic modeling

Procedia PDF Downloads 59
35 Tracing Digital Traces of Phatic Communion in #Mooc

Authors: Judith Enriquez-Gibson


This paper meddles with the notion of phatic communion introduced 90 years ago by Malinowski, who was a Polish-born British anthropologist. It explores the phatic in Twitter within the contents of tweets related to moocs (massive open online courses) as a topic or trend. It is not about moocs though. It is about practices that could easily be hidden or neglected if we let big or massive topics take the lead or if we simply follow the computational or secret codes behind Twitter itself and third party software analytics. It draws from media and cultural studies. Though at first it appears data-driven, as I submitted data collection and analytics into the hands of a third party software, Twitonomy, the aim is to follow how phatic communion might be practised in a social media site, such as Twitter. Lurking becomes its research method to analyse mooc-related tweets. A total of 3,000 tweets were collected on 11 October 2013 (UK timezone). The emphasis of lurking is to engage with Twitter as a system of connectivity. One interesting finding is that a click is in fact a phatic practice. A click breaks the silence. A click on one of the mooc websites is actually a tweet. A tweet was posted on behalf of a user who simply chose to click without formulating the text and perhaps without knowing that it contains #mooc. Surely, this mechanism is not about reciprocity. To break the silence, users did not use words. They just clicked the ‘tweet button’ on a mooc website. A click performs and maintains connectivity – and Twitter as the medium in attendance in our everyday, available when needed to be of service. In conclusion, the phatic culture of breaking silence in Twitter does not have to submit to the power of code and analytics. It is a matter of human code.

Keywords: click, Twitter, phatic communion, social media data, mooc

Procedia PDF Downloads 304
34 Analysing Social Media Coverage of Political Speeches in Relation to Discourse and Context

Authors: Yaser Mohammed Altameemi


This research looks at how social media represents the Saudi Government decrees regarding the developmental projects of the Saudi 2030 vision. The paper analyses a television interview with Crown Prince Mohammed Bin Salman, who talks about the progress of the Saudi vision of 2030 and how the government acted in response to the COVID-19 pandemic. The interview took place on 28/4/2021. The paper analyses the tweets on Twitter that cover the interview for the purpose of investigating the development of concepts and meanings regarding the Saudi people's orientations towards the Saudi projects. The data include all related tweets from the day of the interview and the following seven days. The findings of the collocation analysis suggest that the notion of nationalism is explicitly expressed by Twitter users. The main finding of this paper suggests the importance of further analyses of the concordance lines. However, the collocation network suggests that there is a clear emphasis on nationalism.

Keywords: social media, twitter, political interview, prince Mohammed Bin Salman, Saudi vision 2030

Procedia PDF Downloads 18
33 Sentiment Mapping through Social Media and Its Implications

Authors: G. C. Joshi, M. Paul, B. K. Kalita, V. Ranga, J. S. Rawat, P. S. Rawat


Being a habitat of the global village, every place has established connections through the strength and power of social media, piercing through political boundaries. Social media is a digital platform where people across the world can interact, as it has the advantages of being universal, anonymous and easily accessible, and of allowing indirect interaction and the gathering and sharing of information. The power of social media lies in the intensity of sharing extreme opinions or feelings, in contrast to personal interactions, which can be easily mapped in the form of Sentiment Mapping. Easy access to social networking sites such as Facebook, Twitter and blogs has created unprecedented opportunities for citizens to voice their opinions loaded with dynamics of emotions. These further influence human thoughts, where social media plays a very active role. A recent incident of public importance was selected as a case study to map the sentiments of people through Twitter. Understanding those dynamics through the eyes of ordinary people can be challenging. With the help of the R programming language and GIS techniques, sentiment maps have been produced. The emotions flowing worldwide in the form of tweets were extracted and analyzed. The number of tweets diminished by 91% from 25/08/2017 to 31/08/2017. A boom of sentiments emerged near the origin of the case, i.e., Delhi, Haryana and Punjab, and the capital showed maximum influence, resulting in a spillover effect near Delhi. After calculating the sentiment scores of the tweets, the prevailing sentiment was neutral (45.37%), followed by negative (28.6%) and positive (21.6%). The result can be used to know the spatial distribution of digital penetration in India, where the highest concentration lies in Mumbai and the lowest in North East India and Jammu and Kashmir.

Keywords: sentiment mapping, digital literacy, GIS, R statistical language, spatio-temporal

Procedia PDF Downloads 81
32 Factors Promoting French-English Tweets in France

Authors: Taoues Hadour


Twitter has become a popular means of communication used in a variety of fields, such as politics, journalism, and academia. This widely used online platform has an impact on the way people express themselves and is changing language usage worldwide at an unprecedented pace. The language used online reflects the linguistic battle that has been going on for several decades in French society. This study enables a deeper understanding of users' linguistic behavior online. The implications are important and allow for a rise in awareness of intercultural and cross-language exchanges. This project investigates the mixing of French-English language usage among French users of Twitter using a topic analysis approach. This analysis draws on Gumperz's theory of conversational switching. To collect tweets at scale, French tweet data were retrieved in R with the rtweet package, which accesses Twitter’s REST and streaming APIs (application programming interfaces), using RStudio, an integrated development environment for R. The dataset was filtered manually and certain repetitions of themes were observed. A total of nine topic categories were identified and analyzed in this study: entertainment, internet/social media, events/community, politics/news, sports, sex/pornography, innovation/technology, fashion/make up, and business. The study reveals that entertainment is the most frequent topic discussed on Twitter. Entertainment includes movies, music, games, and books. Anglicisms such as trailer, spoil, and live are identified in the data. Change in language usage is inevitable and is a natural result of linguistic interactions. The use of different languages online is just an example of what the real world would look like without linguistic regulations. Social media reveals a multicultural and multilinguistic richness which can deepen and expand our understanding of contemporary human attitudes.

Keywords: code-switching, French, sociolinguistics, Twitter

Procedia PDF Downloads 48
31 An Approach for Pattern Recognition and Prediction of Information Diffusion Model on Twitter

Authors: Amartya Hatua, Trung Nguyen, Andrew Sung


In this paper, we study the information diffusion process on Twitter as a multivariate time series problem. Our model concerns three measures (volume, network influence, and sentiment of tweets) based on 10 features, and we collected 27 million tweets to build our information diffusion time series dataset for analysis. Then, different time series clustering techniques with Dynamic Time Warping (DTW) distance were used to identify different patterns of information diffusion. Finally, we built information diffusion prediction models for new hashtags, which comprise two phases: the first phase recognizes the pattern using k-NN with DTW distance; the second phase builds the forecasting model using the traditional Autoregressive Integrated Moving Average (ARIMA) model and the non-linear Long Short-Term Memory (LSTM) recurrent neural network. Preliminary results of performance evaluation between different forecasting models show that LSTM with clustering information notably outperforms other models. Therefore, our approach can be applied in real-world applications to analyze and predict the information diffusion characteristics of selected topics or memes (hashtags) on Twitter.
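As an illustration of the pattern-recognition step named in this abstract (k-NN with DTW distance), the following is a minimal sketch, not the authors' implementation. It assumes univariate numeric series (the paper's data are multivariate), uses absolute difference as the local cost, and the function names `dtw_distance` and `knn_pattern` are hypothetical.

```python
from collections import Counter
from math import inf


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched again (stretch b)
                cost[i][j - 1],      # b[j-1] is matched again (stretch a)
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def knn_pattern(series, labelled, k=1):
    """Majority label among the k labelled series closest to `series` under DTW.

    `labelled` is a list of (sequence, label) pairs; the labels stand in for
    diffusion patterns found by an earlier clustering step (an assumption).
    """
    ranked = sorted(labelled, key=lambda item: dtw_distance(series, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

DTW is used here instead of a pointwise distance because two hashtags can follow the same diffusion shape at different speeds; the warping path absorbs that time shift before the nearest neighbours vote.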

Keywords: ARIMA, DTW, information diffusion, LSTM, RNN, time series clustering, time series forecasting, Twitter

Procedia PDF Downloads 307
30 Investigating Non-suicidal Self-Injury Discussions on Twitter

Authors: Muhammad Abubakar Alhassan, Diane Pennington


Social networking sites have become a space for people to discuss public health issues such as non-suicidal self-injury (NSSI). There are thousands of tweets containing self-harm and self-injury hashtags on Twitter. It is difficult to distinguish between the different users who participate in self-injury discussions on Twitter and to track how their opinions change over time. It is also challenging to understand the topics surrounding NSSI discussions on Twitter. We retrieved tweets using the #selfharm and #selfinjury hashtags and investigated those from the United Kingdom. We applied inductive coding and grouped tweeters into different categories. This study used the Latent Dirichlet Allocation (LDA) algorithm to infer the optimum number of topics that describes our corpus. Our findings revealed that many of those participating in NSSI discussions are non-professional users as opposed to medical experts and academics. Support organisations, medical teams, and academics were campaigning positively on raising self-injury awareness and recovery. Using the LDAvis visualisation technique, we selected the top 20 most relevant terms from each topic and interpreted the topics as: children and youth well-being, self-harm misjudgement, mental health awareness, school and mental health support, and suicide and mental-health issues. More than 50% of these topics were discussed in England, compared with Scotland, Wales, Ireland and Northern Ireland. Our findings highlight the advantages of using the Twitter social network in tackling the problem of self-injury through awareness. There is a need to study the potential risks associated with the use of social networks among self-injurers.

Keywords: self-harm, non-suicidal self-injury, Twitter, social networks

Procedia PDF Downloads 50
29 Real Time Classification of Political Tendency of Twitter Spanish Users based on Sentiment Analysis

Authors: Marc Solé, Francesc Giné, Magda Valls, Nina Bijedic


What people say on social media has become a rich source of information for understanding social behavior. Specifically, the growing use of Twitter for political communication has created significant opportunities to learn the opinions of large numbers of politically active individuals in real time and to predict the global political tendencies of a specific country. This has led to an increasing body of research on the topic. The majority of these studies have focused on polarized political contexts characterized by only two alternatives. In contrast, this paper tackles the challenge of forecasting Spanish political trends, characterized by multiple political parties, by analyzing the political tendency of Twitter users. To this end, a new strategy, named Tweets Analysis Strategy (TAS), is proposed. It analyzes users' tweets by discovering their sentiment (positive, negative or neutral) and classifies them according to the political party they support. From these individual political tendencies, the global political prediction for each political party is calculated. Two different strategies for performing the sentiment analysis are proposed: one is based on Positive and Negative words Matching (PNM) and the other on a Neural Networks Strategy (NNS). The complete TAS strategy has been run in a Big Data environment. The experimental results presented in this paper reveal that the NNS strategy performs much better than the PNM strategy at analyzing tweet sentiment. In addition, this research analyzes the viability of the TAS strategy for obtaining the global trend in a political context made up of multiple parties, with an error lower than 23%.
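The abstract does not describe how PNM is implemented, so the following is only a minimal sketch of generic positive/negative word matching. The word lists, the tokenizer, and the name `pnm_sentiment` are hypothetical; a real system for Spanish tweets would need a proper Spanish sentiment lexicon.

```python
import re

# Hypothetical toy lexicons for illustration only; the paper's word lists
# are not given in the abstract.
POSITIVE = {"good", "great", "support", "win"}
NEGATIVE = {"bad", "corrupt", "fail", "lie"}


def pnm_sentiment(tweet):
    """Label a tweet by counting matches against the two word lists."""
    # Lowercase and keep alphabetic tokens (including common Spanish letters).
    tokens = re.findall(r"[a-záéíóúñü]+", tweet.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A matcher of this kind ignores negation, irony, and word order, which is one plausible reason a learned model could outperform it, as the abstract reports for NNS.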

Keywords: political tendency, prediction, sentiment analysis, Twitter

Procedia PDF Downloads 155
28 Interacting with Multi-Scale Structures of Online Political Debates by Visualizing Phylomemies

Authors: Quentin Lobbe, David Chavalarias, Alexandre Delanoe


The ICT revolution has given birth to an unprecedented world of digital traces and has impacted a wide range of knowledge-driven domains such as science, education or policy making. Nowadays, we are fueled daily by unlimited flows of articles, blogs, messages, tweets, etc. The internet itself can thus be considered as an unsteady hyper-textual environment where websites emerge and expand every day. But there are structures inside knowledge. A given text can always be studied in relation to others or in light of a specific socio-cultural context. By way of their textual traces, human beings are calling each other out: hypertext citations, retweets, vocabulary similarity, etc. We are in fact the architects of a giant web of elements of knowledge whose structures and shapes convey their own information. The global shapes of these digital traces represent a source of collective knowledge and the question of their visualization remains an open challenge. How can we explore, browse and interact with such shapes? In order to navigate across these growing constellations of words and texts, interdisciplinary innovations are emerging at the crossroads of the social and computational sciences. In particular, complex systems approaches now make it possible to reconstruct the hidden structures of textual knowledge by means of multi-scale objects of research such as semantic maps and phylomemies. Phylomemy reconstruction is a generic method related to the co-word analysis framework. Phylomemies aim to reveal the temporal dynamics of large corpora of textual contents by performing inter-temporal matching on extracted knowledge domains in order to identify their conceptual lineages. This study aims to address the question of visualizing the global shapes of online political discussions related to the French presidential and legislative elections of 2017. We aim to build phylomemies on top of a dedicated collection of thousands of French political tweets enriched with archived contemporary news web articles. Our goal is to reconstruct the temporal evolution of online debates fueled by each political community during the elections. To that end, we introduce an iterative data exploration methodology implemented and tested within the free software Gargantext. There we combine synchronic and diachronic axes of visualization to reveal the dynamics of our corpora of tweets and web pages as well as their inner syntagmatic and paradigmatic relationships. In doing so, we aim to provide researchers with innovative methodological means to explore online semantic landscapes in a collaborative and reflective way.

Keywords: online political debate, French election, hyper-text, phylomemy

Procedia PDF Downloads 94
27 Exploring Twitter Data on Human Rights Activism on Olympics Stage through Social Network Analysis and Mining

Authors: Teklu Urgessa, Joong Seek Lee


Social media is becoming the primary choice of activists to make their voices heard. There are two main reasons for this. The first is the emergence of Web 2.0, which gave users the opportunity to become content creators rather than passive recipients. The second is the control of mainstream mass media outlets by governments and individuals with their own political and economic interests. This paper explores Twitter data of network actors talking about the marathon silver medalist at Rio 2016, who showed solidarity with the Oromo protesters in Ethiopia at the finish line of the marathon race when he won silver. The aim is to discover important insights using social network analysis and mining. The hashtag #FeyisaLelisa was used for the Twitter network search. The actors’ network was visualized and analyzed. It showed that the central influencers during the first 10 days in August were international media outlets, while in September they were individual activists. The degree distribution of the network is scale-free, with the frequency of degrees decaying according to a power law. Text mining was also used to arrive at meaningful themes from the tweet corpus about the event selected for analysis. The semantic network indicated 15 important clusters of concepts that provided different insights regarding the why, who, where, and how of the situation related to the event. The sentiments of the words in the tweets were also analyzed and indicated that 95% of the opinions in the tweets were either positive or neutral. Overall, the findings showed that the marathoner's protest on the Olympic stage brought the issue of the Oromo protest to the global stage. A new research framework for event-based social network analysis and mining is proposed, based on the practical procedures followed in this research for event-based social media sense-making.
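To make the degree-distribution claim concrete: in a scale-free network, the number of nodes with degree k falls off roughly as a power of k, so a few hubs have many connections and most actors have few. The sketch below only tabulates a degree distribution from an undirected edge list; the edge format and the name `degree_distribution` are assumptions, and fitting or testing a power law would require further statistical work not shown here.

```python
from collections import Counter


def degree_distribution(edges):
    """Map each degree k to the number of nodes that have degree k.

    `edges` is an iterable of (user_a, user_b) pairs, e.g. one pair per
    mention or retweet, treated as undirected (an assumption).
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


# Toy network: one hub connected to four accounts, plus one extra link.
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("a", "b")]
distribution = degree_distribution(edges)
```

Plotting the resulting counts against degree on log-log axes is a common first visual check: an approximately straight line is consistent with, though not proof of, power-law decay.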

Keywords: human rights, Olympics, social media, network analysis, social network mining

Procedia PDF Downloads 178
26 Contextual Toxicity Detection with Data Augmentation

Authors: Julia Ive, Lucia Specia


Understanding and detecting toxicity is an important problem to support safer human interactions online. Our work focuses on the important problem of contextual toxicity detection, where automated classifiers are tasked with determining whether a short textual segment (usually a sentence) is toxic within its conversational context. We use “toxicity” as an umbrella term to denote a number of variants commonly named in the literature, including hate, abuse, and offence, among others. Detecting toxicity in context is a non-trivial problem and has been addressed by very few previous studies. These previous studies have analysed the influence of conversational context in human perception of toxicity in controlled experiments and concluded that humans rarely change their judgements in the presence of context. They have also evaluated contextual detection models based on state-of-the-art Deep Learning and Natural Language Processing (NLP) techniques. Counterintuitively, they reached the general conclusion that computational models tend to suffer performance degradation in the presence of context. We challenge these empirical observations by devising better contextual predictive models that also rely on NLP data augmentation techniques to create larger and better data. In our study, we start by further analysing the human perception of toxicity in conversational data (i.e., tweets), in the absence versus presence of context, in this case, previous tweets in the same conversational thread. We observed that the conclusions of previous work on human perception are mainly due to data issues: The contextual data available does not provide sufficient evidence that context is indeed important (even for humans). The data problem is common in current toxicity datasets: cases labelled as toxic are either obviously toxic (i.e., overt toxicity with swear words, racist words, etc.), and thus context is not needed for a decision, or are ambiguous, vague or unclear even in the presence of context; in addition, the data contains labeling inconsistencies. To address this problem, we propose to automatically generate contextual samples where toxicity is not obvious (i.e., covert cases) without context or where different contexts can lead to different toxicity judgements for the same tweet. We generate toxic and non-toxic utterances conditioned on the context or on target tweets using a range of techniques for controlled text generation (e.g., Generative Adversarial Networks and steering techniques). On the contextual detection models, we posit that their poor performance is due to limitations of both the data they are trained on (the same problems stated above) and the architectures they use, which are not able to leverage context in effective ways. To improve on that, we propose text classification architectures that take the hierarchy of conversational utterances into account. In experiments benchmarking our models against previous models on existing and automatically generated data, we show that both data and architectural choices are very important. Our model achieves substantial performance improvements as compared to the baselines that are non-contextual or contextual but agnostic of the conversation structure.

Keywords: contextual toxicity detection, data augmentation, hierarchical text classification models, natural language processing

Procedia PDF Downloads 81
25 Effects of Twitter Interactions on Self-Esteem and Narcissistic Behaviour

Authors: Leena-Maria Alyedreessy


Self-esteem is thought to be determined by one’s own feeling of being included, liked and accepted by others. This research explores whether this concept may also be applied in the virtual world and assesses whether there is any relationship between Twitter users' self-esteem and the number of interactions they receive. Twenty female Arab participants completed a survey about their Twitter interactions and their feelings of having an imagined audience, together with a Rosenberg Self-Esteem Assessment. Statistical analysis of the responses showed a significant correlation between high self-esteem and the feeling of being Twitter elite, the feeling of having a lot of people listening to one’s tweets, and having a lot of interactions. However, no correlations were detected between low self-esteem and low interactions.

Keywords: Twitter, social media, self-esteem, narcissism, interactions

Procedia PDF Downloads 333
24 Political Communication in Twitter Interactions between Government, News Media and Citizens in Mexico

Authors: Jorge Cortés, Alejandra Martínez, Carlos Pérez, Anaid Simón


The presence of government, news media, and general citizenry in social media allows considering interactions between them as a form of political communication (i.e., the public exchange of contradictory discourses about politics). Twitter’s asymmetrical following model (users can follow, mention or reply to other users that do not follow them) could foster alternative democratic practices and have an impact on Mexican political culture, which has been marked by a lack of direct communication channels between these actors. The research aim is to assess Twitter’s role in political communication practices through the analysis of interaction dynamics between government, news media, and citizens by extracting and visualizing data from Twitter’s API to observe general behavior patterns. The hypothesis is that, regardless of the fact that Twitter’s features enable direct and horizontal interactions between actors, users repeat traditional dynamics of interaction without taking full advantage of the possibilities of this medium. Through an interdisciplinary team including Communication Strategies, Information Design, and Interaction Systems, the activity on Twitter generated by the controversy over the presence of Uber in Mexico City was analysed, an issue of public interest involving aspects such as public opinion, economic interests and a legal dimension. This research includes techniques from social network analysis (SNA), a methodological approach focused on the comprehension of the relationships between actors through the visual representation and measurement of network characteristics. The analysis of the Uber event comprised data extraction, data categorization, corpus construction, corpus visualization and analysis. In the retrieval stage, TAGS, a Google Sheets template, was used to extract tweets that included the hashtags #UberSeQueda and #UberSeVa, posts containing the string Uber and tweets directed to @uber_mx. Using scripts written in Python, the data was filtered, discarding tweets with no interaction (replies, retweets or mentions) and locations outside of Mexico. Considerations regarding bots and the omission of anecdotal posts were also taken into account. The utility of graphs to observe interactions of political communication in general was confirmed by the analysis of visualizations generated with programs such as Gephi and NodeXL. However, some aspects require improvements to obtain more useful visual representations for this type of research. For example, link crossings complicate following the direction of an interaction, forcing users to manipulate the graph to see it clearly. It was concluded that some practices prevalent in political communication in Mexico are replicated in Twitter. Media actors tend to group together instead of interacting with others. The political system tends to tweet as an advertising strategy rather than to generate dialogue. However, some actors were identified as bridges establishing communication between the three spheres, generating a more democratic exercise and taking advantage of Twitter’s possibilities. Although interactions in Twitter could become an alternative to political communication, this potential depends on the intentions of the participants and to what extent they are aiming for collaborative and direct communications. Further research is needed to get a deeper understanding of the political behavior of Twitter users and the possibilities of SNA for its analysis.
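The filtering step mentioned in this abstract (keeping only tweets that carry an interaction and originate in Mexico) could look roughly like the sketch below. The authors' scripts are not shown in the abstract, so the record layout and the field names `in_reply_to`, `retweet_of`, `mentions`, and `country` are hypothetical and do not reflect the actual TAGS export columns.

```python
def has_interaction(tweet):
    """True if the tweet is a reply, a retweet, or mentions another user."""
    return bool(
        tweet.get("in_reply_to")
        or tweet.get("retweet_of")
        or tweet.get("mentions")
    )


def filter_tweets(tweets, country="MX"):
    """Keep tweets that have an interaction and the given country code."""
    return [
        t for t in tweets
        if has_interaction(t) and t.get("country") == country
    ]


# Hypothetical records for illustration only.
sample = [
    {"id": 1, "mentions": ["uber_mx"], "country": "MX"},
    {"id": 2, "mentions": [], "country": "MX"},      # no interaction
    {"id": 3, "retweet_of": 1, "country": "US"},     # outside Mexico
    {"id": 4, "in_reply_to": 1, "country": "MX"},
]
kept = filter_tweets(sample)
```

Each retained tweet corresponds to at least one directed edge between two accounts, which is the unit a graph tool such as Gephi or NodeXL needs to draw the interaction network.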

Keywords: interaction, political communication, social network analysis, Twitter

Procedia PDF Downloads 139