Search results for: Tweets
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 73

Search results for: Tweets

43 Investigating Non-suicidal Self-Injury Discussions on Twitter

Authors: Muhammad Abubakar Alhassan, Diane Pennington

Abstract:

Social networking sites have become a space for people to discuss public health issues such as non-suicidal self-injury (NSSI). There are thousands of tweets containing self-harm and self-injury hashtags on Twitter. It is difficult to distinguish between different users who participate in self-injury discussions on Twitter and how their opinions change over time. Also, it is challenging to understand the topics surrounding NSSI discussions on Twitter. We retrieved tweets using #selfham and #selfinjury hashtags and investigated those from the United kingdom. We applied inductive coding and grouped tweeters into different categories. This study used the Latent Dirichlet Allocation (LDA) algorithm to infer the optimum number of topics that describes our corpus. Our findings revealed that many of those participating in NSSI discussions are non-professional users as opposed to medical experts and academics. Support organisations, medical teams, and academics were campaigning positively on rais-ing self-injury awareness and recovery. Using LDAvis visualisation technique, we selected the top 20 most relevant terms from each topic and interpreted the topics as; children and youth well-being, self-harm misjudgement, mental health awareness, school and mental health support and, suicide and mental-health issues. More than 50% of these topics were discussed in England compared to Scotland, Wales, Ireland and Northern Ireland. Our findings highlight the advantages of using the Twitter social network in tackling the problem of self-injury through awareness. There is a need to study the potential risks associated with the use of social networks among self-injurers.

Keywords: self-harm, non-suicidal self-injury, Twitter, social networks

Procedia PDF Downloads 131
42 Real Time Classification of Political Tendency of Twitter Spanish Users based on Sentiment Analysis

Authors: Marc Solé, Francesc Giné, Magda Valls, Nina Bijedic

Abstract:

What people say on social media has turned into a rich source of information to understand social behavior. Specifically, the growing use of Twitter social media for political communication has arisen high opportunities to know the opinion of large numbers of politically active individuals in real time and predict the global political tendencies of a specific country. It has led to an increasing body of research on this topic. The majority of these studies have been focused on polarized political contexts characterized by only two alternatives. Unlike them, this paper tackles the challenge of forecasting Spanish political trends, characterized by multiple political parties, by means of analyzing the Twitters Users political tendency. According to this, a new strategy, named Tweets Analysis Strategy (TAS), is proposed. This is based on analyzing the users tweets by means of discovering its sentiment (positive, negative or neutral) and classifying them according to the political party they support. From this individual political tendency, the global political prediction for each political party is calculated. In order to do this, two different strategies for analyzing the sentiment analysis are proposed: one is based on Positive and Negative words Matching (PNM) and the second one is based on a Neural Networks Strategy (NNS). The complete TAS strategy has been performed in a Big-Data environment. The experimental results presented in this paper reveal that NNS strategy performs much better than PNM strategy to analyze the tweet sentiment. In addition, this research analyzes the viability of the TAS strategy to obtain the global trend in a political context make up by multiple parties with an error lower than 23%.

Keywords: political tendency, prediction, sentiment analysis, Twitter

Procedia PDF Downloads 238
41 Interacting with Multi-Scale Structures of Online Political Debates by Visualizing Phylomemies

Authors: Quentin Lobbe, David Chavalarias, Alexandre Delanoe

Abstract:

The ICT revolution has given birth to an unprecedented world of digital traces and has impacted a wide number of knowledge-driven domains such as science, education or policy making. Nowadays, we are daily fueled by unlimited flows of articles, blogs, messages, tweets, etc. The internet itself can thus be considered as an unsteady hyper-textual environment where websites emerge and expand every day. But there are structures inside knowledge. A given text can always be studied in relation to others or in light of a specific socio-cultural context. By way of their textual traces, human beings are calling each other out: hypertext citations, retweets, vocabulary similarity, etc. We are in fact the architects of a giant web of elements of knowledge whose structures and shapes convey their own information. The global shapes of these digital traces represent a source of collective knowledge and the question of their visualization remains an opened challenge. How can we explore, browse and interact with such shapes? In order to navigate across these growing constellations of words and texts, interdisciplinary innovations are emerging at the crossroad between fields of social and computational sciences. In particular, complex systems approaches make it now possible to reconstruct the hidden structures of textual knowledge by means of multi-scale objects of research such as semantic maps and phylomemies. The phylomemy reconstruction is a generic method related to the co-word analysis framework. Phylomemies aim to reveal the temporal dynamics of large corpora of textual contents by performing inter-temporal matching on extracted knowledge domains in order to identify their conceptual lineages. This study aims to address the question of visualizing the global shapes of online political discussions related to the French presidential and legislative elections of 2017. We aim to build phylomemies on top of a dedicated collection of thousands of French political tweets enriched with archived contemporary news web articles. Our goal is to reconstruct the temporal evolution of online debates fueled by each political community during the elections. To that end, we want to introduce an iterative data exploration methodology implemented and tested within the free software Gargantext. There we combine synchronic and diachronic axis of visualization to reveal the dynamics of our corpora of tweets and web pages as well as their inner syntagmatic and paradigmatic relationships. In doing so, we aim to provide researchers with innovative methodological means to explore online semantic landscapes in a collaborative and reflective way.

Keywords: online political debate, French election, hyper-text, phylomemy

Procedia PDF Downloads 186
40 Exploring Twitter Data on Human Rights Activism on Olympics Stage through Social Network Analysis and Mining

Authors: Teklu Urgessa, Joong Seek Lee

Abstract:

Social media is becoming the primary choice of activists to make their voices heard. This fact is coupled by two main reasons. The first reason is the emergence web 2.0, which gave the users opportunity to become content creators than passive recipients. Secondly the control of the mainstream mass media outlets by the governments and individuals with their political and economic interests. This paper aimed at exploring twitter data of network actors talking about the marathon silver medalists on Rio2016, who showed solidarity with the Oromo protesters in Ethiopia on the marathon race finish line when he won silver. The aim is to discover important insight using social network analysis and mining. The hashtag #FeyisaLelisa was used for Twitter network search. The actors’ network was visualized and analyzed. It showed the central influencers during first 10 days in August, were international media outlets while it was changed to individual activist in September. The degree distribution of the network is scale free where the frequency of degrees decay by power low. Text mining was also used to arrive at meaningful themes from tweet corpus about the event selected for analysis. The semantic network indicated important clusters of concepts (15) that provided different insight regarding the why, who, where, how of the situation related to the event. The sentiments of the words in the tweets were also analyzed and indicated that 95% of the opinions in the tweets were either positive or neutral. Overall, the finding showed that Olympic stage protest of the marathoner brought the issue of Oromo protest to the global stage. The new research framework is proposed based for event-based social network analysis and mining based on the practical procedures followed in this research for event-based social media sense making.

Keywords: human rights, Olympics, social media, network analysis, social network ming

Procedia PDF Downloads 257
39 Contextual Toxicity Detection with Data Augmentation

Authors: Julia Ive, Lucia Specia

Abstract:

Understanding and detecting toxicity is an important problem to support safer human interactions online. Our work focuses on the important problem of contextual toxicity detection, where automated classifiers are tasked with determining whether a short textual segment (usually a sentence) is toxic within its conversational context. We use “toxicity” as an umbrella term to denote a number of variants commonly named in the literature, including hate, abuse, offence, among others. Detecting toxicity in context is a non-trivial problem and has been addressed by very few previous studies. These previous studies have analysed the influence of conversational context in human perception of toxicity in controlled experiments and concluded that humans rarely change their judgements in the presence of context. They have also evaluated contextual detection models based on state-of-the-art Deep Learning and Natural Language Processing (NLP) techniques. Counterintuitively, they reached the general conclusion that computational models tend to suffer performance degradation in the presence of context. We challenge these empirical observations by devising better contextual predictive models that also rely on NLP data augmentation techniques to create larger and better data. In our study, we start by further analysing the human perception of toxicity in conversational data (i.e., tweets), in the absence versus presence of context, in this case, previous tweets in the same conversational thread. We observed that the conclusions of previous work on human perception are mainly due to data issues: The contextual data available does not provide sufficient evidence that context is indeed important (even for humans). The data problem is common in current toxicity datasets: cases labelled as toxic are either obviously toxic (i.e., overt toxicity with swear, racist, etc. words), and thus context does is not needed for a decision, or are ambiguous, vague or unclear even in the presence of context; in addition, the data contains labeling inconsistencies. To address this problem, we propose to automatically generate contextual samples where toxicity is not obvious (i.e., covert cases) without context or where different contexts can lead to different toxicity judgements for the same tweet. We generate toxic and non-toxic utterances conditioned on the context or on target tweets using a range of techniques for controlled text generation(e.g., Generative Adversarial Networks and steering techniques). On the contextual detection models, we posit that their poor performance is due to limitations on both of the data they are trained on (same problems stated above) and the architectures they use, which are not able to leverage context in effective ways. To improve on that, we propose text classification architectures that take the hierarchy of conversational utterances into account. In experiments benchmarking ours against previous models on existing and automatically generated data, we show that both data and architectural choices are very important. Our model achieves substantial performance improvements as compared to the baselines that are non-contextual or contextual but agnostic of the conversation structure.

Keywords: contextual toxicity detection, data augmentation, hierarchical text classification models, natural language processing

Procedia PDF Downloads 170
38 Resale Housing Development Board Price Prediction Considering Covid-19 through Sentiment Analysis

Authors: Srinaath Anbu Durai, Wang Zhaoxia

Abstract:

Twitter sentiment has been used as a predictor to predict price values or trends in both the stock market and housing market. The pioneering works in this stream of research drew upon works in behavioural economics to show that sentiment or emotions impact economic decisions. Latest works in this stream focus on the algorithm used as opposed to the data used. A literature review of works in this stream through the lens of data used shows that there is a paucity of work that considers the impact of sentiments caused due to an external factor on either the stock or the housing market. This is despite an abundance of works in behavioural economics that show that sentiment or emotions caused due to an external factor impact economic decisions. To address this gap, this research studies the impact of Twitter sentiment pertaining to the Covid-19 pandemic on resale Housing Development Board (HDB) apartment prices in Singapore. It leverages SNSCRAPE to collect tweets pertaining to Covid-19 for sentiment analysis, lexicon based tools VADER and TextBlob are used for sentiment analysis, Granger Causality is used to examine the relationship between Covid-19 cases and the sentiment score, and neural networks are leveraged as prediction models. Twitter sentiment pertaining to Covid-19 as a predictor of HDB price in Singapore is studied in comparison with the traditional predictors of housing prices i.e., the structural and neighbourhood characteristics. The results indicate that using Twitter sentiment pertaining to Covid19 leads to better prediction than using only the traditional predictors and performs better as a predictor compared to two of the traditional predictors. Hence, Twitter sentiment pertaining to an external factor should be considered as important as traditional predictors. This paper demonstrates the real world economic applications of sentiment analysis of Twitter data.

Keywords: sentiment analysis, Covid-19, housing price prediction, tweets, social media, Singapore HDB, behavioral economics, neural networks

Procedia PDF Downloads 115
37 Effects of Twitter Interactions on Self-Esteem and Narcissistic Behaviour

Authors: Leena-Maria Alyedreessy

Abstract:

Self-esteem is thought to be determined by ones’ own feeling of being included, liked and accepted by others. This research explores whether this concept may also be applied in the virtual world and assesses whether there is any relationship between Twitter users' self-esteem and the amount of interactions they receive. 20 female Arab participants were given a survey asking them about their Twitter interactions and their feelings of having an imagined audience to fill out and a Rosenberg Self-Esteem Assessment to complete. After completion and statistical analysis, results showed a significant correlation between the feeling of being Twitter elite, the feeling of having a lot of people listening to your tweets and having a lot of interactions with high self-esteem. However, no correlations were detected for low-self-esteem and low interactions.

Keywords: twitter, social media, self-esteem, narcissism, interactions

Procedia PDF Downloads 412
36 Political Communication in Twitter Interactions between Government, News Media and Citizens in Mexico

Authors: Jorge Cortés, Alejandra Martínez, Carlos Pérez, Anaid Simón

Abstract:

The presence of government, news media, and general citizenry in social media allows considering interactions between them as a form of political communication (i.e. the public exchange of contradictory discourses about politics). Twitter’s asymmetrical following model (users can follow, mention or reply to other users that do not follow them) could foster alternative democratic practices and have an impact on Mexican political culture, which has been marked by a lack of direct communication channels between these actors. The research aim is to assess Twitter’s role in political communication practices through the analysis of interaction dynamics between government, news media, and citizens by extracting and visualizing data from Twitter’s API to observe general behavior patterns. The hypothesis is that regardless the fact that Twitter’s features enable direct and horizontal interactions between actors, users repeat traditional dynamics of interaction, without taking full advantage of the possibilities of this medium. Through an interdisciplinary team including Communication Strategies, Information Design, and Interaction Systems, the activity on Twitter generated by the controversy over the presence of Uber in Mexico City was analysed; an issue of public interest, involving aspects such as public opinion, economic interests and a legal dimension. This research includes techniques from social network analysis (SNA), a methodological approach focused on the comprehension of the relationships between actors through the visual representation and measurement of network characteristics. The analysis of the Uber event comprised data extraction, data categorization, corpus construction, corpus visualization and analysis. On the recovery stage TAGS, a Google Sheet template, was used to extract tweets that included the hashtags #UberSeQueda and #UberSeVa, posts containing the string Uber and tweets directed to @uber_mx. Using scripts written in Python, the data was filtered, discarding tweets with no interaction (replies, retweets or mentions) and locations outside of México. Considerations regarding bots and the omission of anecdotal posts were also taken into account. The utility of graphs to observe interactions of political communication in general was confirmed by the analysis of visualizations generated with programs such as Gephi and NodeXL. However, some aspects require improvements to obtain more useful visual representations for this type of research. For example, link¬crossings complicates following the direction of an interaction forcing users to manipulate the graph to see it clearly. It was concluded that some practices prevalent in political communication in Mexico are replicated in Twitter. Media actors tend to group together instead of interact with others. The political system tends to tweet as an advertising strategy rather than to generate dialogue. However, some actors were identified as bridges establishing communication between the three spheres, generating a more democratic exercise and taking advantage of Twitter’s possibilities. Although interactions in Twitter could become an alternative to political communication, this potential depends on the intentions of the participants and to what extent they are aiming for collaborative and direct communications. Further research is needed to get a deeper understanding on the political behavior of Twitter users and the possibilities of SNA for its analysis.

Keywords: interaction, political communication, social network analysis, Twitter

Procedia PDF Downloads 221
35 Changing the Biopower Hierarchy between Women’s Bodily Knowledge and the Medical Knowledge about the Body: The Case of Female Ejaculation and #Notpee

Authors: Lior B. Navon

Abstract:

The objective of this study is to investigate how technology, such as social media, can influence the biopower hierarchy between the medical knowledge about the body and women’s bodily knowledge through the case study of the hashtag 'notpee'. In January 2015, the hashtag #notpee, relating to a feminine physiological phenomenon called female ejaculation (FE) or squirting (SQ) started circulating on twitter. This hashtag, born as a reaction to a medical study claiming that SQ is essentially involuntary emission of urine during sexual activity, sparked an unusual public discourse about FE, a phenomenon that is usually not discussed or referred to in socio-legitimate public spheres. This unusual backlash got the attention of women’s magazines and blogs, as well as more mainstream large and respected outlets such as The Guardian and CNN. Both the tweets on twitter, as well as the media coverage of them, were mainly aimed at rejecting the research’s findings. While not offering an alternative and choosing to define the phenomenon by negation, women argued that the fluid extracted was not pee based on their personal experiences. Based on a critical discourse analysis of 742 tweets with the hashtag 'notpee' between January 2015 and January 2016, and of 15 articles covering the backlash, this study suggests that the #notpee backlash challenged the power balance between the medical knowledge about the feminine body and the feminine bodily knowledge through two different, yet related, forms of resistance to biopower. The first resistance is to the authority over knowledge production — who has the power to produce 'true' statements when it comes to the body? Is it the women who experience the phenomenon, or is it the medical institution? The second resistance to biopower has to do with what we regard as facts or veracity. A critical discourse analysis reveals that while both the scientific field, as well as the women arguing against its findings, use empirical information, they, nevertheless, rely on two dichotomic databases- while the scientific research relies on samples from the 'dead like body', these woman are relying on their lived subjective senses as a source for fact making. Nevertheless, while #notpee is asking to change the power relations between the feminine subjective bodily knowledge and the seemingly objective masculine medical knowledge about the body, it by no means dismisses it. These women are essentially asking the medical institution to take into consideration the subjective body as well as the objective one while acknowledging and accepting the power of the latter over knowledge production.

Keywords: biopower, female ejaculation, new media, bodily knowledge

Procedia PDF Downloads 157
34 Poultry in Motion: Text Mining Social Media Data for Avian Influenza Surveillance in the UK

Authors: Samuel Munaf, Kevin Swingler, Franz Brülisauer, Anthony O’Hare, George Gunn, Aaron Reeves

Abstract:

Background: Avian influenza, more commonly known as Bird flu, is a viral zoonotic respiratory disease stemming from various species of poultry, including pets and migratory birds. Researchers have purported that the accessibility of health information online, in addition to the low-cost data collection methods the internet provides, has revolutionized the methods in which epidemiological and disease surveillance data is utilized. This paper examines the feasibility of using internet data sources, such as Twitter and livestock forums, for the early detection of the avian flu outbreak, through the use of text mining algorithms and social network analysis. Methods: Social media mining was conducted on Twitter between the period of 01/01/2021 to 31/12/2021 via the Twitter API in Python. The results were filtered firstly by hashtags (#avianflu, #birdflu), word occurrences (avian flu, bird flu, H5N1), and then refined further by location to include only those results from within the UK. Analysis was conducted on this text in a time-series manner to determine keyword frequencies and topic modeling to uncover insights in the text prior to a confirmed outbreak. Further analysis was performed by examining clinical signs (e.g., swollen head, blue comb, dullness) within the time series prior to the confirmed avian flu outbreak by the Animal and Plant Health Agency (APHA). Results: The increased search results in Google and avian flu-related tweets showed a correlation in time with the confirmed cases. Topic modeling uncovered clusters of word occurrences relating to livestock biosecurity, disposal of dead birds, and prevention measures. Conclusions: Text mining social media data can prove to be useful in relation to analysing discussed topics for epidemiological surveillance purposes, especially given the lack of applied research in the veterinary domain. The small sample size of tweets for certain weekly time periods makes it difficult to provide statistically plausible results, in addition to a great amount of textual noise in the data.

Keywords: veterinary epidemiology, disease surveillance, infodemiology, infoveillance, avian influenza, social media

Procedia PDF Downloads 105
33 1/Sigma Term Weighting Scheme for Sentiment Analysis

Authors: Hanan Alshaher, Jinsheng Xu

Abstract:

Large amounts of data on the web can provide valuable information. For example, product reviews help business owners measure customer satisfaction. Sentiment analysis classifies texts into two polarities: positive and negative. This paper examines movie reviews and tweets using a new term weighting scheme, called one-over-sigma (1/sigma), on benchmark datasets for sentiment classification. The proposed method aims to improve the performance of sentiment classification. The results show that 1/sigma is more accurate than the popular term weighting schemes. In order to verify if the entropy reflects the discriminating power of terms, we report a comparison of entropy values for different term weighting schemes.

Keywords: 1/sigma, natural language processing, sentiment analysis, term weighting scheme, text classification

Procedia PDF Downloads 201
32 The Use of Emoticons in Polite Phrases of Greeting and Thanks

Authors: Zuzana Komrsková

Abstract:

This paper shows the connection between emoticons and politeness in written computer-mediated communication. It studies if there are some differences in the use of emoticon between Czech and English written tweets. My assumptions about the use of emoticons were based on the use of greetings and thanks in real, face to face situations. The first assumption, that welcome greeting phrase would be accompanied by positive emoticon was correct. But for the farewell greeting both positive and negative emoticons are possible. My results show lower frequency of negative emoticons in this context. I also found quite often both positive and negative emoticon in the same tweet. The expression of gratitude is associated with positive emotions. The results show that emoticons accompany polite phrases of greeting and thanks very often both in Czech and English. The use of emoticons with studied polite phrases shows that emoticons have become an integral part of these phrases.

Keywords: Czech, emoticon, english, politeness, twitter

Procedia PDF Downloads 405
31 Analyzing Global User Sentiments on Laptop Features: A Comparative Study of Preferences Across Economic Contexts

Authors: Mohammadreza Bakhtiari, Mehrdad Maghsoudi, Hamidreza Bakhtiari

Abstract:

The widespread adoption of laptops has become essential to modern lifestyles, supporting work, education, and entertainment. Social media platforms have emerged as key spaces where users share real-time feedback on laptop performance, providing a valuable source of data for understanding consumer preferences. This study leverages aspect-based sentiment analysis (ABSA) on 1.5 million tweets to examine how users from developed and developing countries perceive and prioritize 16 key laptop features. The analysis reveals that consumers in developing countries express higher satisfaction overall, emphasizing affordability, durability, and reliability. Conversely, users in developed countries demonstrate more critical attitudes, especially toward performance-related aspects such as cooling systems, battery life, and chargers. The study employs a mixed-methods approach, combining ABSA using the PyABSA framework with expert insights gathered through a Delphi panel of ten industry professionals. Data preprocessing included cleaning, filtering, and aspect extraction from tweets. Universal issues such as battery efficiency and fan performance were identified, reflecting shared challenges across markets. However, priorities diverge between regions, while users in developed countries demand high-performance models with advanced features, those in developing countries seek products that offer strong value for money and long-term durability. The findings suggest that laptop manufacturers should adopt a market-specific strategy by developing differentiated product lines. For developed markets, the focus should be on cutting-edge technologies, enhanced cooling solutions, and comprehensive warranty services. In developing markets, emphasis should be placed on affordability, versatile port options, and robust designs. Additionally, the study highlights the importance of universal charging solutions and continuous sentiment monitoring to adapt to evolving consumer needs. This research offers practical insights for manufacturers seeking to optimize product development and marketing strategies for global markets, ensuring enhanced user satisfaction and long-term competitiveness. Future studies could explore multi-source data integration and conduct longitudinal analyses to capture changing trends over time.

Keywords: consumer behavior, durability, laptop industry, sentiment analysis, social media analytics

Procedia PDF Downloads 15
30 Sentiment Analysis of Consumers’ Perceptions on Social Media about the Main Mobile Providers in Jamaica

Authors: Sherrene Bogle, Verlia Bogle, Tyrone Anderson

Abstract:

In recent years, organizations have become increasingly interested in the possibility of analyzing social media as a means of gaining meaningful feedback about their products and services. The aspect based sentiment analysis approach is used to predict the sentiment for Twitter datasets for Digicel and Lime, the main mobile companies in Jamaica, using supervised learning classification techniques. The results indicate an average of 82.2 percent accuracy in classifying tweets when comparing three separate classification algorithms against the purported baseline of 70 percent and an average root mean squared error of 0.31. These results indicate that the analysis of sentiment on social media in order to gain customer feedback can be a viable solution for mobile companies looking to improve business performance.

Keywords: machine learning, sentiment analysis, social media, supervised learning

Procedia PDF Downloads 444
29 Learning Grammars for Detection of Disaster-Related Micro Events

Authors: Josef Steinberger, Vanni Zavarella, Hristo Tanev

Abstract:

Natural disasters cause tens of thousands of victims and massive material damages. We refer to all those events caused by natural disasters, such as damage on people, infrastructure, vehicles, services and resource supply, as micro events. This paper addresses the problem of micro - event detection in online media sources. We present a natural language grammar learning algorithm and apply it to online news. The algorithm in question is based on distributional clustering and detection of word collocations. We also explore the extraction of micro-events from social media and describe a Twitter mining robot, who uses combinations of keywords to detect tweets which talk about effects of disasters.

Keywords: online news, natural language processing, machine learning, event extraction, crisis computing, disaster effects, Twitter

Procedia PDF Downloads 478
28 Tweets to Touchdowns: Predicting National Football League Achievement from Social Media Optimism

Authors: Rohan Erasala, Ian McCulloh

Abstract:

The NFL Draft is a chance for every NFL team to select their next superstar. As a result, teams heavily invest in scouting, and millions of fans partake in the online discourse surrounding the draft. This paper investigates the potential correlations between positive sentiment in individual draft selection threads from the subreddit r/NFL and if this data can be used to make successful player recommendations. It is hypothesized that there will be limited correlations and nonviable recommendations made from these threads. The hypothesis is tested using sentiment analysis of draft thread comments and analyzing correlation and precision at k of top scores. The results indicate weak correlations between the percentage of positive comments in a draft selection thread and a player’s approximate value, but potentially viable recommendations from looking at players whose draft selection threads have the highest percentage of positive comments.

Keywords: national football league, NFL, NFL Draft, sentiment analysis, Reddit, social media, NLP

Procedia PDF Downloads 85
27 Text Mining of Twitter Data Using a Latent Dirichlet Allocation Topic Model and Sentiment Analysis

Authors: Sidi Yang, Haiyi Zhang

Abstract:

Twitter is a microblogging platform, where millions of users daily share their attitudes, views, and opinions. Using a probabilistic Latent Dirichlet Allocation (LDA) topic model to discern the most popular topics in the Twitter data is an effective way to analyze a large set of tweets to find a set of topics in a computationally efficient manner. Sentiment analysis provides an effective method to show the emotions and sentiments found in each tweet and an efficient way to summarize the results in a manner that is clearly understood. The primary goal of this paper is to explore text mining, extract and analyze useful information from unstructured text using two approaches: LDA topic modelling and sentiment analysis by examining Twitter plain text data in English. These two methods allow people to dig data more effectively and efficiently. LDA topic model and sentiment analysis can also be applied to provide insight views in business and scientific fields.

Keywords: text mining, Twitter, topic model, sentiment analysis

Procedia PDF Downloads 179
26 Quantifying Stability of Online Communities and Its Impact on Disinformation

Authors: Victor Chomel, Maziyar Panahi, David Chavalarias

Abstract:

Misinformation has taken an increasingly worrying place in social media. Propagation patterns are closely linked to the structure of communities. This study proposes a method of community analysis based on a combination of centrality indicators for the network and its main communities. The objective is to establish a link between the stability of the communities over time, the social ascension of its members internally, and the propagation of information in the community. To this end, data from the debates about global warming and political communities on Twitter have been collected, and several tens of millions of tweets and retweets have helped us better understand the structure of these communities. The quantification of this stability allows for the study of the propagation of information of any kind, including disinformation. Our results indicate that the most stable communities over time are the ones that enable the establishment of nodes capturing a large part of the information and broadcasting its opinions. Conversely, communities with a high turnover and social ascendancy only stabilize themselves strongly in the face of adversity and external events but seem to offer a greater diversity of opinions most of the time.

Keywords: community analysis, disinformation, misinformation, Twitter

Procedia PDF Downloads 140
25 Topic Modelling Using Latent Dirichlet Allocation and Latent Semantic Indexing on SA Telco Twitter Data

Authors: Phumelele Kubheka, Pius Owolawi, Gbolahan Aiyetoro

Abstract:

Twitter is one of the most popular social media platforms where users can share their opinions on different subjects. As of 2010, The Twitter platform generates more than 12 Terabytes of data daily, ~ 4.3 petabytes in a single year. For this reason, Twitter is a great source for big mining data. Many industries such as Telecommunication companies can leverage the availability of Twitter data to better understand their markets and make an appropriate business decision. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked with another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics on a Twitter dataset containing user tweets on South African Telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results with higher topic coherence by 8% for the best-performing model represented in Table 1. A higher topic coherence score indicates better performance of the model.

Keywords: big data, latent Dirichlet allocation, latent semantic indexing, telco, topic modeling, twitter

Procedia PDF Downloads 150
24 Industrial Relations as Communication: The Strange Case of the FCA-UAW Agreement

Authors: Francesco Nespoli

Abstract:

After having posed a theoretical framework combining framing theory and new rhetoric, the paper analyze the shift in communication both adopted by UAW and FCA during the negotiations in fall 2015. The paper argues that mistakes and adjustments played a determinant role respectively in the rejection of the first tentative agreement and in the ratification of the contract. The purpose of the paper is to set a new theoretical framework for the analysis of communication in industrial relations, by describing a narrative construction of reality from the perspective of the new rhetoric. The paper thus analyze all public text, speeches, tweets and Facebook posts by the union reading them as part of the narrative set by the organization condensed by the slogan 'it’s our time'. That narrative tried to gain consensus from the members matching the expectations due to the industry recovery after more than five years of workers' sacrifices. In doing so, the analysis points out a shift in the communication strategy of the union after the first rejection of a tentative agreement in 15 years. The findings suggest that, from the communication point of view, consultation in industrial relations can be conceived as a particular kind of political communication where identification with the audience through deliberate narrative may not be effective if it is not preceded by a listening campaign.

Keywords: communication, consultation, automotive, FCA

Procedia PDF Downloads 188
23 Exploring Tweet Geolocation: Leveraging Large Language Models for Post-Hoc Explanations

Authors: Sarra Hasni, Sami Faiz

Abstract:

In recent years, location prediction on social networks has gained significant attention, with short and unstructured texts like tweets posing additional challenges. Advanced geolocation models have been proposed, increasing the need to explain their predictions. In this paper, we provide explanations for a geolocation black-box model using LIME and SHAP, two state-of-the-art XAI (eXplainable Artificial Intelligence) methods. We extend our evaluations to Large Language Models (LLMs) as post hoc explainers for tweet geolocation. Our preliminary results show that LLMs outperform LIME and SHAP by generating more accurate explanations. Additionally, we demonstrate that prompts with examples and meta-prompts containing phonetic spelling rules improve the interpretability of these models, even with informal input data. This approach highlights the potential of advanced prompt engineering techniques to enhance the effectiveness of black-box models in geolocation tasks on social networks.

Keywords: large language model, post hoc explainer, prompt engineering, local explanation, tweet geolocation

Procedia PDF Downloads 25
22 A Framework for Analyzing Public Interaction of Saudi Universities on Twitter

Authors: Sahar Al-Qahtani, Rabeeh Ayaz Abbasi, Naif Radi Aljohani

Abstract:

Many universities use social media platforms as new communication channels to disseminate information and promptly communicate with their audience. As Twitter is one of the widely used social media platforms, this research aims to explore the adaption and utilization of Twitter by universities. We propose a framework called 'Social Network Analysis for Universities on Twitter' (SNAUT) to analyze the usage of Twitter by universities and to measure their interaction with public. The study includes a sample of around 110,000 tweets from 36 Saudi universities, including both public and private universities. Using SNAUT, we can (1) investigate the purpose of using Twitter by universities, (2) determine the broad topics discussed by them, and (3) identify the groups closely associated with the universities. The results show that most of the Saudi universities (whether public or private) actively use Twitter. Results also reveal that public universities respond to public queries more frequently, but private universities stand out more in terms of information dissemination using retweets and diverse hashtags. Finally, we develop a ranking mechanism in SNAUT for ranking universities based on their social interaction with the public on Twitter.

Keywords: social media, twitter, social network analysis, universities, higher education, Saudi Arabia

Procedia PDF Downloads 136
21 A Study on Sentiment Analysis Using Various ML/NLP Models on Historical Data of Indian Leaders

Authors: Sarthak Deshpande, Akshay Patil, Pradip Pandhare, Nikhil Wankhede, Rushali Deshmukh

Abstract:

Among the highly significant duties for any language most effective is the sentiment analysis, which is also a key area of NLP, that recently made impressive strides. There are several models and datasets available for those tasks in popular and commonly used languages like English, Russian, and Spanish. While sentiment analysis research is performed extensively, however it is lagging behind for the regional languages having few resources such as Hindi, Marathi. Marathi is one of the languages that included in the Indian Constitution’s 8th schedule and is the third most widely spoken language in the country and primarily spoken in the Deccan region, which encompasses Maharashtra and Goa. There isn’t sufficient study on sentiment analysis methods based on Marathi text due to lack of available resources, information. Therefore, this project proposes the use of different ML/NLP models for the analysis of Marathi data from the comments below YouTube content, tweets or Instagram posts. We aim to achieve a short and precise analysis and summary of the related data using our dataset (Dates, names, root words) and lexicons to locate exact information.

Keywords: multilingual sentiment analysis, Marathi, natural language processing, text summarization, lexicon-based approaches

Procedia PDF Downloads 73
20 Towards an Adversary-Aware ML-Based Detector of Spam on Twitter Hashtags

Authors: Niddal Imam, Vassilios G. Vassilakis

Abstract:

After analysing messages posted by health-related spam campaigns in Twitter Arabic hashtags, we found that these campaigns use unique hijacked accounts (we call them adversarial hijacked accounts) as adversarial examples to fool deployed ML-based spam detectors. Existing ML-based models build a behaviour profile for each user to detect hijacked accounts. This approach is not applicable for detecting spam in Twitter hashtags since they are computationally expensive. Hence, we propose an adversary-aware ML-based detector, which includes a newly designed feature (avg posts) to improve the detection of spam tweets posted by the adversarial hijacked accounts at a tweet-level in trending hashtags. The proposed detector was designed considering three key points: robustness, adaptability, and interpretability. The new feature leverages the account’s temporal patterns (i.e., account age and number of posts). It is faster to compute compared to features discussed in the literature and improves the accuracy of detecting the identified hijacked accounts by 73%.

Keywords: Twitter spam detection, adversarial examples, evasion attack, adversarial concept drift, account hijacking, trending hashtag

Procedia PDF Downloads 78
19 Geovisualization of Human Mobility Patterns in Los Angeles Using Twitter Data

Authors: Linna Li

Abstract:

The capability to move around places is doubtless very important for individuals to maintain good health and social functions. People’s activities in space and time have long been a research topic in behavioral and socio-economic studies, particularly focusing on the highly dynamic urban environment. By analyzing groups of people who share similar activity patterns, many socio-economic and socio-demographic problems and their relationships with individual behavior preferences can be revealed. Los Angeles, known for its large population, ethnic diversity, cultural mixing, and entertainment industry, faces great transportation challenges such as traffic congestion, parking difficulties, and long commuting. Understanding people’s travel behavior and movement patterns in this metropolis sheds light on potential solutions to complex problems regarding urban mobility. This project visualizes people’s trajectories in Greater Los Angeles (L.A.) Area over a period of two months using Twitter data. A Python script was used to collect georeferenced tweets within the Greater L.A. Area including Ventura, San Bernardino, Riverside, Los Angeles, and Orange counties. Information associated with tweets includes text, time, location, and user ID. Information associated with users includes name, the number of followers, etc. Both aggregated and individual activity patterns are demonstrated using various geovisualization techniques. Locations of individual Twitter users were aggregated to create a surface of activity hot spots at different time instants using kernel density estimation, which shows the dynamic flow of people’s movement throughout the metropolis in a twenty-four-hour cycle. In the 3D geovisualization interface, the z-axis indicates time that covers 24 hours, and the x-y plane shows the geographic space of the city. Any two points on the z axis can be selected for displaying activity density surface within a particular time period. In addition, daily trajectories of Twitter users were created using space-time paths that show the continuous movement of individuals throughout the day. When a personal trajectory is overlaid on top of ancillary layers including land use and road networks in 3D visualization, the vivid representation of a realistic view of the urban environment boosts situational awareness of the map reader. A comparison of the same individual’s paths on different days shows some regular patterns on weekdays for some Twitter users, but for some other users, their daily trajectories are more irregular and sporadic. This research makes contributions in two major areas: geovisualization of spatial footprints to understand travel behavior using the big data approach and dynamic representation of activity space in the Greater Los Angeles Area. Unlike traditional travel surveys, social media (e.g., Twitter) provides an inexpensive way of data collection on spatio-temporal footprints. The visualization techniques used in this project are also valuable for analyzing other spatio-temporal data in the exploratory stage, thus leading to informed decisions about generating and testing hypotheses for further investigation. The next step of this research is to separate users into different groups based on gender/ethnic origin and compare their daily trajectory patterns.

Keywords: geovisualization, human mobility pattern, Los Angeles, social media

Procedia PDF Downloads 118
18 Using Bidirectional Encoder Representations from Transformers to Extract Topic-Independent Sentiment Features for Social Media Bot Detection

Authors: Maryam Heidari, James H. Jones Jr.

Abstract:

Millions of online posts about different topics and products are shared on popular social media platforms. One use of this content is to provide crowd-sourced information about a specific topic, event or product. However, this use raises an important question: what percentage of information available through these services is trustworthy? In particular, might some of this information be generated by a machine, i.e., a bot, instead of a human? Bots can be, and often are, purposely designed to generate enough volume to skew an apparent trend or position on a topic, yet the consumer of such content cannot easily distinguish a bot post from a human post. In this paper, we introduce a model for social media bot detection which uses Bidirectional Encoder Representations from Transformers (Google Bert) for sentiment classification of tweets to identify topic-independent features. Our use of a Natural Language Processing approach to derive topic-independent features for our new bot detection model distinguishes this work from previous bot detection models. We achieve 94\% accuracy classifying the contents of data as generated by a bot or a human, where the most accurate prior work achieved accuracy of 92\%.

Keywords: bot detection, natural language processing, neural network, social media

Procedia PDF Downloads 116
17 Content Analysis of Images Shared on Twitter during 2017 Iranian Protests

Authors: Maryam Esfandiari, Bohdan Fridrich

Abstract:

On December 28, 2017, a wave of protests erupted in several Iranian cities. Protesters demonstrated against the president, Hasan Rohani, and theocratical nature of the regime. Iran has a recent history with protest movements, such as Green Movement responsible for demonstrations after 2009 Iranian presidential election. However, the 2017/2018 protests differ from the previous ones in terms of organization and agenda. The events show little to no central organization and seem as being sparked by grass root movements and by citizens’ fatigue of government corruption, authoritarianism, and economic problems of the country. Social media has played important role in communicating the protests to the outside world and also in general coordination. By using content analyses, this paper analyzes the visual content of Twitter posts published during the protests. It aims to find the correlation between their decentralized nature and nature of the tweets – either emotionally arousing or efficiency-elicit. Pictures are searched by hashtags and coded by their content, such as ‘crowds,’ ‘protest activities,’ ‘symbols of unity,’ ‘violence,’ ‘iconic figures,’ etc. The study determines what type of content prevails and what type is the most impactful in terms of reach. This study contributes to understanding the role of social media both as a tool and a space in protest organization and portrayal in countries with limited Internet access.

Keywords: twitter, Iran, collective action, protest

Procedia PDF Downloads 151
16 Charting Sentiments with Naive Bayes and Logistic Regression

Authors: Jummalla Aashrith, N. L. Shiva Sai, K. Bhavya Sri

Abstract:

The swift progress of web technology has not only amassed a vast reservoir of internet data but also triggered a substantial surge in data generation. The internet has metamorphosed into one of the dynamic hubs for online education, idea dissemination, as well as opinion-sharing. Notably, the widely utilized social networking platform Twitter is experiencing considerable expansion, providing users with the ability to share viewpoints, participate in discussions spanning diverse communities, and broadcast messages on a global scale. The upswing in online engagement has sparked a significant curiosity in subjective analysis, particularly when it comes to Twitter data. This research is committed to delving into sentiment analysis, focusing specifically on the realm of Twitter. It aims to offer valuable insights into deciphering information within tweets, where opinions manifest in a highly unstructured and diverse manner, spanning a spectrum from positivity to negativity, occasionally punctuated by neutrality expressions. Within this document, we offer a comprehensive exploration and comparative assessment of modern approaches to opinion mining. Employing a range of machine learning algorithms such as Naive Bayes and Logistic Regression, our investigation plunges into the domain of Twitter data streams. We delve into overarching challenges and applications inherent in the realm of subjectivity analysis over Twitter.

Keywords: machine learning, sentiment analysis, visualisation, python

Procedia PDF Downloads 56
15 Infodemic Detection on Social Media with a Multi-Dimensional Deep Learning Framework

Authors: Raymond Xu, Cindy Jingru Wang

Abstract:

Social media has become a globally connected and influencing platform. Social media data, such as tweets, can help predict the spread of pandemics and provide individuals and healthcare providers early warnings. Public psychological reactions and opinions can be efficiently monitored by AI models on the progression of dominant topics on Twitter. However, statistics show that as the coronavirus spreads, so does an infodemic of misinformation due to pandemic-related factors such as unemployment and lockdowns. Social media algorithms are often biased toward outrage by promoting content that people have an emotional reaction to and are likely to engage with. This can influence users’ attitudes and cause confusion. Therefore, social media is a double-edged sword. Combating fake news and biased content has become one of the essential tasks. This research analyzes the variety of methods used for fake news detection covering random forest, logistic regression, support vector machines, decision tree, naive Bayes, BoW, TF-IDF, LDA, CNN, RNN, LSTM, DeepFake, and hierarchical attention network. The performance of each method is analyzed. Based on these models’ achievements and limitations, a multi-dimensional AI framework is proposed to achieve higher accuracy in infodemic detection, especially pandemic-related news. The model is trained on contextual content, images, and news metadata.

Keywords: artificial intelligence, fake news detection, infodemic detection, image recognition, sentiment analysis

Procedia PDF Downloads 254
14 Opinion Mining to Extract Community Emotions on Covid-19 Immunization Possible Side Effects

Authors: Yahya Almurtadha, Mukhtar Ghaleb, Ahmed M. Shamsan Saleh

Abstract:

The world witnessed a fierce attack from the Covid-19 virus, which affected public life socially, economically, healthily and psychologically. The world's governments tried to confront the pandemic by imposing a number of precautionary measures such as general closure, curfews and social distancing. Scientists have also made strenuous efforts to develop an effective vaccine to train the immune system to develop antibodies to combat the virus, thus reducing its symptoms and limiting its spread. Artificial intelligence, along with researchers and medical authorities, has accelerated the vaccine development process through big data processing and simulation. On the other hand, one of the most important negatives of the impact of Covid 19 was the state of anxiety and fear due to the blowout of rumors through social media, which prompted governments to try to reassure the public with the available means. This study aims to proposed using Sentiment Analysis (AKA Opinion Mining) and deep learning as efficient artificial intelligence techniques to work on retrieving the tweets of the public from Twitter and then analyze it automatically to extract their opinions, expression and feelings, negatively or positively, about the symptoms they may feel after vaccination. Sentiment analysis is characterized by its ability to access what the public post in social media within a record time and at a lower cost than traditional means such as questionnaires and interviews, not to mention the accuracy of the information as it comes from what the public expresses voluntarily.

Keywords: deep learning, opinion mining, natural language processing, sentiment analysis

Procedia PDF Downloads 171