Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 12

Search results for: Reddit

12 The Analysis of One Million Reddit Confessions Corpus: The Use of Emotive Verbs and First Person Singular Pronoun as Linguistic Psychotherapy Features

Abstract:

This paper presents an analysis of a Reddit confessions corpus. The interpretation focuses on the use of emotional language, in particular emotive verbs, in the context of personal pronouns. The analysis of the linguistic properties answers the questions of what Reddit users confess about and who is the subject of the confessions. The study reveals that the specific language patterns used in Reddit confessions reflect the language of depression and the language used by patients during different stages of their psychotherapy sessions. The paper concludes that Reddit users are more willing to confess about their own experiences, often very private and intimate ones, making extensive use of the first person singular pronoun I. This indicates that Reddit users use the language of depression and the language used by psychotherapy patients. The language they use is highly emotionally charged and includes many emotive verbs such as want, feel, need, hate, and love. This finding in Reddit confessions correlates with the extensive use of stative affective verbs in the first stages of psychotherapy sessions. Lastly, the paper refers to the positive and negative lexicon and helps determine how online posts can serve as a depression detector and a “talking cure” for the users.
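The kind of counting this abstract describes can be illustrated with a minimal sketch. It is not the authors' pipeline: the verb list contains only the five verbs named in the abstract, and the tokenizer, the function name `confession_profile`, and the "pronoun immediately followed by an emotive verb" heuristic are assumptions made for illustration.

```python
import re
from collections import Counter

# The five verbs are the ones named in the abstract; a real study would
# use a much larger lexicon. The pronoun set is an assumption.
EMOTIVE_VERBS = {"want", "feel", "need", "hate", "love"}
FIRST_PERSON_SINGULAR = {"i"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def confession_profile(posts):
    """Count first person singular pronouns, emotive verbs, and the
    cases where the pronoun is immediately followed by an emotive verb."""
    verb_counts = Counter()
    pronoun_count = 0
    pronoun_verb_pairs = 0
    total_tokens = 0
    for post in posts:
        tokens = tokenize(post)
        total_tokens += len(tokens)
        for i, token in enumerate(tokens):
            if token in EMOTIVE_VERBS:
                verb_counts[token] += 1
            if token in FIRST_PERSON_SINGULAR:
                pronoun_count += 1
                if i + 1 < len(tokens) and tokens[i + 1] in EMOTIVE_VERBS:
                    pronoun_verb_pairs += 1
    return {
        "total_tokens": total_tokens,
        "pronoun_count": pronoun_count,
        "pronoun_verb_pairs": pronoun_verb_pairs,
        "verb_counts": dict(verb_counts),
    }
```

Calling `confession_profile` on a list of post strings returns raw counts; dividing them by `total_tokens` would give relative frequencies that can be compared across corpora of different sizes.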

Keywords: confessions, emotional language, emotive verbs, pronouns, first person pronoun, language of depression, depression detection, psychotherapy language

Procedia PDF Downloads 119

11 “We’ve Got to Get This Out of Our Game”: A Reddit Study on the Immediate Reaction to Homophobia in the United Soccer League

Authors: Steffanie Kiourkas

Abstract:

This study sought to examine fans’ responses to San Diego Loyal player Collin Martin being called a homophobic slur by Phoenix Rising player Junior Flemmings. Fans’ responses were gathered from Reddit comments on a thread posted immediately after the incident. Fans specifically centered their conversations on the homophobic slur, the coaches’ reactions, and the use of homophobia in sports in general. This study identified five themes present in the conversations between soccer fans. The themes generally highlighted critiques of the Phoenix Rising team and head coach, critiques of the use of homophobia in soccer, and support for the San Diego Loyal and hopefulness about the state of society moving forward.

Keywords: homophobia, LGBTQ+, United Soccer League, Reddit

Procedia PDF Downloads 87

10 Analyzing Speech Acts in Reddit Posts of Formerly Incarcerated Youths

Authors: Yusra Ibrahim

Abstract:

This study explores the online discourse of justice-involved youth on Reddit, focusing on how anonymity and asynchronicity influence their ability to share and reflect on their incarceration experiences within the "Ask Me Anything" (AMA) community. The study utilizes a quantitative analysis of speech acts to examine the varied communication patterns exhibited by youths and commenters across two AMA threads. The results indicate that, although Reddit is not specifically designed for formerly incarcerated youths, its features provide a supportive environment for them to share their incarceration experiences with non-incarcerated individuals. The level of empathy and support from the audience varies based on the audience’s perspectives on incarceration and related traumatic experiences. Additionally, the study identifies a reciprocal relationship where youths benefit from community support while offering insights into the juvenile justice system and helping the audience understand the experience of incarceration. The study also reveals cultural shocks in physical and digital environments that youth experience after release and when using social media platforms and the internet. The study has implications for juvenile justice personnel, policymakers, and researchers in the juvenile justice system.

Keywords: juvenile justice, online discourse, Reddit AMA, anonymity, speech acts taxonomy, reintegration, online community support

Procedia PDF Downloads 41

9 Native Language Identification with Cross-Corpus Evaluation Using Social Media Data: ’Reddit’

Authors: Yasmeen Bassas, Sandra Kuebler, Allen Riddell

Abstract:

Native language identification is one of the growing subfields in natural language processing (NLP). The task of native language identification (NLI) is mainly concerned with predicting the native language of an author from their writing in a second language. In this paper, we investigate the performance of two types of features, content-based features vs. content-independent features, when they are evaluated on a different corpus (using social media data from Reddit). In this NLI task, the predefined models are trained on one corpus (TOEFL), and the trained models are then evaluated on different data using an external corpus (Reddit). Three classifiers are used in this task: the baseline, linear SVM, and logistic regression. Results show that content-based features are more accurate and robust than content-independent ones when tested both within the corpus and across corpora.

Keywords: NLI, NLP, content-based features, content independent features, social media corpus, ML

Procedia PDF Downloads 136

8 Early Stage Suicide Ideation Detection Using Supervised Machine Learning and Neural Network Classifier

Authors: Devendra Kr Tayal, Vrinda Gupta, Aastha Bansal, Khushi Singh, Sristi Sharma, Hunny Gaur

Abstract:

In today's world, suicide is a serious problem. To save lives, early detection and prevention of suicide attempts should be addressed. A good number of at-risk people use social media platforms to talk about their issues or to find information on related matters. Twitter and Reddit are two of the most common platforms used for expressing oneself. Extensive research has already been done in this field. Using supervised classification techniques such as Naive Bayes, Bernoulli Naive Bayes, and Multilayer Perceptron on a Reddit dataset, we demonstrate the early recognition of suicidal ideation. We also performed a comparative analysis of these approaches, using accuracy, recall, F1 score, and precision for evaluation.
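The four evaluation metrics named in this abstract can be computed from predicted and true labels as in the following sketch. The function name `binary_metrics` and the convention that label 1 marks the at-risk class are assumptions; the abstract does not describe the authors' evaluation code.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for a binary classifier.

    `positive` is the label treated as the class of interest (assumed
    here to be 1). Ratios with a zero denominator are reported as 0.0.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

In a screening setting such as this one, recall on the at-risk class is usually the metric to watch most closely, since a missed positive is costlier than a false alarm; accuracy alone can look high on an imbalanced dataset.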

Keywords: machine learning, suicide ideation detection, supervised classification, natural language processing

Procedia PDF Downloads 90

7 Linguistic Analysis of Borderline Personality Disorder: Using Language to Predict Maladaptive Thoughts and Behaviours

Authors: Charlotte Entwistle, Ryan Boyd

Abstract:

Recent developments in information retrieval techniques and natural language processing have allowed for greater exploration of psychological and social processes. Linguistic analysis methods for understanding behaviour have provided useful insights within the field of mental health. One area within mental health that has received little attention, though, is borderline personality disorder (BPD). BPD is a common mental health disorder characterised by instability of interpersonal relationships, self-image and affect. It also manifests through maladaptive behaviours, such as impulsivity and self-harm. Examination of language patterns associated with BPD could allow for a greater understanding of the disorder and its links to maladaptive thoughts and behaviours. Language analysis methods could also be used in a predictive way, such as by identifying indicators of BPD or predicting maladaptive thoughts, emotions and behaviours. Additionally, associations that are uncovered between language and maladaptive thoughts and behaviours could then be applied at a more general level. This study explores linguistic characteristics of BPD, and their links to maladaptive thoughts and behaviours, through the analysis of social media data. Data were collected from a large corpus of posts from the publicly available social media platform Reddit, namely, from the ‘r/BPD’ subreddit, where people identify as having BPD. Data were collected using the Python Reddit API Wrapper and included all users who had posted within the BPD subreddit. All posts were manually inspected to ensure that they were not posted by someone who clearly did not have BPD, such as people posting about a loved one with BPD. These users were then tracked across all other subreddits in which they had posted, and data from these subreddits were also collected. Additionally, data were collected from a random control group of Reddit users.
Disorder-relevant behaviours, such as self-harming or aggression-related behaviours, outlined within Reddit posts were coded by expert raters. All posts and comments were aggregated by user and split by subreddit. Language data were then analysed using the Linguistic Inquiry and Word Count (LIWC) 2015 software. LIWC is a text analysis program that identifies and categorises words based on linguistic and paralinguistic dimensions, psychological constructs and personal concern categories. Statistical analyses of linguistic features could then be conducted. Findings revealed distinct linguistic features associated with BPD, based on Reddit posts, which differentiated these users from a control group. Language patterns were also found to be associated with the occurrence of maladaptive thoughts and behaviours. Thus, this study demonstrates that there are indeed linguistic markers of BPD present on social media. It also implies that language could be predictive of maladaptive thoughts and behaviours associated with BPD. These findings are of importance as they suggest potential for clinical interventions to be provided based on the language of people with BPD to try to reduce the likelihood of maladaptive thoughts and behaviours occurring, for example through social media tracking or by engaging people with BPD in expressive writing therapy. Overall, this study has provided a greater understanding of the disorder and how it manifests through language and behaviour.

Keywords: behaviour analysis, borderline personality disorder, natural language processing, social media data

Procedia PDF Downloads 349

6 Decoding WallStreetBets: The Impact of Daily Disagreements on Trading Volumes

Authors: F. Ghandehari, H. Lu, L. El-Jahel, D. Jayasuriya

Abstract:

Disagreement among investors is a fundamental aspect of financial markets, significantly influencing market dynamics. Measuring this disagreement has traditionally posed challenges, often relying on proxies like analyst forecast dispersion, which are limited by biases and infrequent updates. Recent movements in social media indicate that retail investors actively seek financial advice online and can influence the stock market. The evolution of the investing landscape, particularly the rise of social media as a hub for financial advice, provides an alternative avenue for real-time measurement of investor sentiment and disagreement. Platforms like Reddit offer rich, community-driven discussions that reflect genuine investor opinions. This research explores how social media empowers retail investors and the potential of leveraging textual analysis of social media content to capture daily fluctuations in investor disagreement. This study investigates the relationship between daily investor disagreement and trading volume, focusing on the role of social media platforms in shaping market dynamics, specifically using data from WallStreetBets (WSB) on Reddit. This paper uses data from 2020 to 2023 from WSB and analyses 4,896 firms with enough social media activity in WSB to define stock-day level disagreement measures. Consistent with traditional theories that disagreement induces trading volume, the results show significant evidence supporting this claim through different disagreement measures derived from WSB discussions.
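The abstract does not define its stock-day disagreement measures. As a rough illustration only, the sketch below uses one simple formulation, 1 - |bullish - bearish| / (bullish + bearish), which is 0 when all posts about a stock on a given day agree and 1 when they are evenly split. The record layout, the +1/-1 labels, and the function name `stock_day_disagreement` are hypothetical.

```python
from collections import defaultdict


def stock_day_disagreement(posts):
    """Compute a simple disagreement score per (ticker, date).

    `posts` is an iterable of (ticker, date, label) tuples, where label
    is +1 for a bullish post and -1 for a bearish post (hypothetical
    layout). The score is 1 - |bull - bear| / (bull + bear).
    """
    counts = defaultdict(lambda: [0, 0])  # (ticker, date) -> [bull, bear]
    for ticker, date, label in posts:
        if label > 0:
            counts[(ticker, date)][0] += 1
        elif label < 0:
            counts[(ticker, date)][1] += 1

    scores = {}
    for key, (bull, bear) in counts.items():
        total = bull + bear
        if total:  # skip stock-days without any labeled post
            scores[key] = 1 - abs(bull - bear) / total
    return scores
```

A score like this could then be regressed against daily trading volume per stock; a minimum-post threshold per stock-day, in the spirit of the abstract's "enough social media activity" filter, would keep scores based on one or two posts from dominating.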

Keywords: disagreement, retail investor, social finance, social media

Procedia PDF Downloads 38

5 Natural Language Processing for the Classification of Social Media Posts in Post-Disaster Management

Authors: Ezgi Şendil

Abstract:

Information extracted from social media has received great attention since it has become an effective alternative for collecting people’s opinions and emotions based on specific experiences in a faster and easier way. The paper aims to organize the data in a meaningful way in order to analyze users’ posts and draw conclusions about the experiences and opinions of users during and after natural disasters. The posts collected from Reddit are classified into nine different categories, including injured/dead people, infrastructure and utility damage, missing/found people, donation needs/offers, caution/advice, and emotional support, identified by using labelled Twitter data and four different machine learning (ML) classifiers.

Keywords: disaster, NLP, postdisaster management, sentiment analysis

Procedia PDF Downloads 75

4 Artificial Intelligence Based Meme Generation Technology for Engaging Audience in Social Media

Authors: Andrew Kurochkin, Kostiantyn Bokhan

Abstract:

In this study, a new meme dataset of ~650K meme instances was created, a meme generation technology based on a state-of-the-art deep learning technique, the GPT-2 model, was researched, and a comparative analysis of machine-generated and human-created memes was conducted. We show that Amazon Mechanical Turk workers can be used to approximately estimate users' behavior in a social network, more precisely to measure engagement. It was shown that generated memes cause the same engagement as human memes that historically produced low engagement in the social network. Thus, generated memes are less engaging than random memes created by humans.

Keywords: content generation, computational social science, memes generation, Reddit, social networks, social media interaction

Procedia PDF Downloads 137

3 Tweets to Touchdowns: Predicting National Football League Achievement from Social Media Optimism

Authors: Rohan Erasala, Ian McCulloh

Abstract:

The NFL Draft is a chance for every NFL team to select their next superstar. As a result, teams heavily invest in scouting, and millions of fans partake in the online discourse surrounding the draft. This paper investigates the potential correlations between positive sentiment in individual draft selection threads from the subreddit r/NFL and whether these data can be used to make successful player recommendations. It is hypothesized that there will be limited correlations and nonviable recommendations made from these threads. The hypothesis is tested using sentiment analysis of draft thread comments and by analyzing correlation and precision at k of the top scores. The results indicate weak correlations between the percentage of positive comments in a draft selection thread and a player’s approximate value, but potentially viable recommendations from looking at players whose draft selection threads have the highest percentage of positive comments.
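Precision at k, the ranking metric mentioned in this abstract, can be sketched as follows. The thread layout, the "pos"/"neg" labels, and the idea of a set of "successful" players (for example, those whose approximate value exceeds some threshold) are assumptions for illustration, not the authors' setup.

```python
def positive_share(labels):
    """Fraction of comments in a thread labeled 'pos' (hypothetical labels)."""
    return sum(1 for label in labels if label == "pos") / len(labels) if labels else 0.0


def rank_by_positive_share(threads):
    """Return player ids sorted by their thread's positive share, highest first.

    `threads` maps a player id to the list of sentiment labels of the
    comments in that player's draft selection thread.
    """
    return sorted(threads, key=lambda player: positive_share(threads[player]), reverse=True)


def precision_at_k(ranked, relevant, k):
    """Share of the top-k ranked items that are in the `relevant` set."""
    if k <= 0:
        return 0.0
    top_k = ranked[:k]
    return sum(1 for item in top_k if item in relevant) / k
```

Here `relevant` would hold the players judged successful by some external measure; precision at k then answers how many of the k most positively received picks turned out to be successful.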

Keywords: national football league, NFL, NFL Draft, sentiment analysis, Reddit, social media, NLP

Procedia PDF Downloads 84

2 Social Media Consumption Habits within the Millennial Generation: A Comparison between U.S. and Bangladesh

Authors: Didarul Islam Manik

Abstract:

The study was conducted to determine social media usage by the Millennial/young-adult generation in the U.S. and Bangladesh. It investigated what types of social media Millennials/young adults use in their everyday lives; for what purposes they use social media; what the significant differences are between the two cultures in terms of social media use; and how the age of the respondents correlates with differences in social media use. Among the 409 respondents, 200 were selected from the University of South Dakota and 209 from the University of Dhaka, Bangladesh. The convenience sampling method was used to select the samples. A four-page questionnaire instrument was constructed with 19 closed-ended questions that collected 87 data points. The study considered the uses and gratifications and domestication of technology models as theoretical frameworks. The study found that the Millennials spend an average of 4.5 hours on the Internet daily. They spend an average of 134 minutes on social media every day. However, the U.S. Millennials spend more time (141 minutes) on social media than the Bangladeshis (127 minutes). The U.S. Millennials use various types of social media including Facebook, Twitter, YouTube, Instagram, Pinterest, Snapchat, Reddit, Imgur, etc. In contrast, Bangladeshis use Facebook, YouTube, and Google+. The Bangladeshis tended to spend more time on Facebook (107 minutes) than the Americans (57 minutes). The study found that the Millennials of the two countries use Facebook to fill their free time, acquire information, seek entertainment, and maintain existing relationships. However, Bangladeshis are more likely to use Facebook for the acquisition of information, entertainment, educational purposes, and connecting with the people closest to them. Millennials also use Twitter to fill their free time, acquire information, and for entertainment. The study found a statistically significant difference between female and male social media use.
It also found a significant correlation between age and using Facebook for educational purposes; age and discussing and posting religious issues; and age and meeting with new people. There is also a correlation between age and the use of Twitter for spending time and seeking entertainment.

Keywords: American study, social media, millennial generation, South Asian studies

Procedia PDF Downloads 233

1 PsyVBot: Chatbot for Accurate Depression Diagnosis using Long Short-Term Memory and NLP

Authors: Thaveesha Dheerasekera, Dileeka Sandamali Alwis

Abstract:

The escalating prevalence of mental health issues, such as depression and suicidal ideation, is a matter of significant global concern. It is plausible that a variety of factors, such as life events, social isolation, and preexisting physiological or psychological health conditions, could instigate or exacerbate these conditions. Traditional approaches to diagnosing depression entail a considerable amount of time and necessitate the involvement of adept practitioners. This underscores the necessity for automated systems capable of promptly detecting and diagnosing symptoms of depression. The PsyVBot system employs sophisticated natural language processing and machine learning methodologies, including the use of the NLTK toolkit for dataset preprocessing and the utilization of a Long Short-Term Memory (LSTM) model. The PsyVBot exhibits a remarkable ability to diagnose depression with a 94% accuracy rate through the analysis of user input. Consequently, this resource proves to be efficacious for individuals, particularly those enrolled in academic institutions, who may encounter challenges pertaining to their psychological well-being. The PsyVBot employs a Long Short-Term Memory (LSTM) model that comprises a total of three layers, namely an embedding layer, an LSTM layer, and a dense layer. The stratification of these layers facilitates a precise examination of linguistic patterns that are associated with the condition of depression. The PsyVBot has the capability to accurately assess an individual's level of depression through the identification of linguistic and contextual cues. The task is achieved via a rigorous training regimen, which is executed by utilizing a dataset comprising information sourced from the subreddit r/SuicideWatch. The diverse data present in the dataset ensures precise and delicate identification of symptoms linked with depression, thereby guaranteeing accuracy. 
PsyVBot not only possesses diagnostic capabilities but also enhances the user experience through the utilization of audio outputs. This feature enables users to engage in more captivating and interactive interactions. The PsyVBot platform offers individuals the opportunity to conveniently diagnose mental health challenges through a confidential and user-friendly interface. Regarding the advancement of PsyVBot, maintaining user confidentiality and upholding ethical principles are of paramount significance. It is imperative to note that diligent efforts are undertaken to adhere to ethical standards, thereby safeguarding the confidentiality of user information and ensuring its security. Moreover, the chatbot fosters a conducive atmosphere that is supportive and compassionate, thereby promoting psychological welfare. In brief, PsyVBot is an automated conversational agent that utilizes an LSTM model to assess the level of depression in accordance with the input provided by the user. The demonstrated accuracy rate of 94% serves as a promising indication of the potential efficacy of employing natural language processing and machine learning techniques in tackling challenges associated with mental health. The reliability of PsyVBot is further improved by the fact that it makes use of the Reddit dataset and incorporates Natural Language Toolkit (NLTK) for preprocessing. PsyVBot represents a pioneering and user-centric solution that furnishes an easily accessible and confidential medium for seeking assistance. The present platform is offered as a modality to tackle the pervasive issue of depression and the contemplation of suicide.

Keywords: chatbot, depression diagnosis, LSTM model, natural language processing

Procedia PDF Downloads 68