Search results for: sentiment analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 27321

27171 Self-Supervised Learning for Hate-Speech Identification

Authors: Shrabani Ghosh

Abstract:

Automatic offensive language detection in social media has become an active task in today's NLP. Manual offensive language detection is tedious and laborious, so automatic methods based on machine learning are the only practical alternative. Previous work has performed sentiment analysis over social media in supervised, semi-supervised, and unsupervised settings. Domain adaptation in a semi-supervised setting has also been explored in NLP, where the source domain and the target domain differ. In domain adaptation, the source domain usually has a large amount of labeled data, while only a limited amount of labeled data is available in the target domain. Pretrained transformers such as BERT and RoBERTa are further pre-trained in an unsupervised manner on masked language modeling (MLM) tasks and fine-tuned to perform text classification. In previous work, hate speech detection has been explored on Gab.ai, a free-speech platform described as a platform for extremists of varying degrees in online social media. In the domain adaptation process, Twitter data is used as the source domain and Gab data as the target domain. The performance of domain adaptation also depends on cross-domain similarity. Distance measures such as L2 distance, cosine distance, Maximum Mean Discrepancy (MMD), Fisher Linear Discriminant (FLD), and CORAL have been used to estimate domain similarity. In-domain distances are expected to be small and between-domain distances large. The previous work shows that a pretrained masked language model (MLM) fine-tuned with a mixture of posts from the source and target domains gives higher accuracy. However, the in-domain accuracy of the hate classifier on Twitter data is 71.78%, and its out-of-domain accuracy on Gab data drops to 56.53%. Recently, self-supervised learning has received a lot of attention because it is more applicable when labeled data are scarce. A few works have already applied self-supervised learning to NLP tasks such as sentiment classification. The self-supervised language representation model ALBERT focuses on modeling inter-sentence coherence and helps downstream tasks with multi-sentence inputs. A self-supervised attention learning approach shows better performance because it exploits extracted context words in the training process. In this work, a self-supervised attention mechanism is proposed to detect hate speech on Gab.ai. The framework first classifies the Gab dataset in an attention-based self-supervised manner. In the next step, a semi-supervised classifier is trained on the combination of labeled data from the first step and unlabeled data. The performance of the proposed framework will be compared with the results described earlier and also with optimized outcomes obtained from different optimization techniques.
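To make the domain-similarity measures named in the abstract more concrete, the following is a minimal sketch of three of them (L2 distance, cosine distance, and MMD with a linear kernel) computed between the mean feature vectors of two domains. The toy 2-dimensional "embeddings" and all function names are assumptions for illustration; the abstract does not say how posts were embedded or which MMD kernel was used.

```python
import math

def centroid(vectors):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def l2_distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def linear_mmd(source, target):
    """Squared MMD with a linear kernel (an assumed kernel choice):
    the squared L2 distance between the two domain means."""
    return l2_distance(centroid(source), centroid(target)) ** 2

# Hypothetical embeddings standing in for source (Twitter) and target (Gab) posts.
source_posts = [[1.0, 0.0], [3.0, 0.0]]
target_posts = [[0.0, 1.0], [0.0, 3.0]]

src_mean = centroid(source_posts)
tgt_mean = centroid(target_posts)
print("L2:", l2_distance(src_mean, tgt_mean))
print("cosine:", cosine_distance(src_mean, tgt_mean))
print("linear MMD^2:", linear_mmd(source_posts, target_posts))
```

Comparing a domain with itself gives zero under all three measures, which matches the expectation that in-domain distances are small.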

Keywords: attention learning, language model, offensive language detection, self-supervised learning

Procedia PDF Downloads 97
27170 Locket Application

Authors: Farah Al-Fityani, Aljohara Alsowail, Shatha Bindawood, Heba Balrbeah

Abstract:

Locket is a popular app that lets users share spontaneous photos with a close circle of friends. The app offers a unique way to stay connected with loved ones by allowing users to see glimpses of their day through photos displayed on a widget on their home screen. This summary outlines the process of developing an app like Locket, highlighting the importance of user privacy and security. It also details the findings of a study on user engagement with the Locket app, revealing positive sentiment towards its features and concept but also identifying areas for improvement. Overall, the summary portrays Locket as a successful app that is changing the way people connect on social media.

Keywords: locket, app, machine learning, connect

Procedia PDF Downloads 32
27169 Mining User-Generated Contents to Detect Service Failures with Topic Model

Authors: Kyung Bae Park, Sung Ho Ha

Abstract:

Online user-generated content (UGC) significantly changes the way customers behave (e.g., shop, travel), and handling the overwhelming amount of varied UGC is one of the paramount issues for management. However, current approaches (e.g., sentiment analysis) are often ineffective at leveraging textual information to detect the problems or issues that a given management faces. In this paper, we apply Latent Dirichlet Allocation (LDA) text mining to a popular online review site dedicated to user complaints. We find that LDA efficiently detects customer complaints, and that further inspection with a visualization technique is effective for categorizing the problems or issues. As such, management can identify the issues at stake and prioritize them in a timely manner given limited resources. The findings provide managerial insights into how analytics on social media can help maintain and improve reputation management. Our interdisciplinary approach also highlights several insights from applying machine learning techniques in the marketing research domain. On a broader technical note, this paper details how to implement LDA in R from beginning (data collection in R) to end (LDA analysis in R), since this procedure is still largely undocumented. In this regard, it will help lower the barrier for interdisciplinary researchers to conduct related research.

Keywords: latent dirichlet allocation, R program, text mining, topic model, user generated contents, visualization

Procedia PDF Downloads 182
27168 Feature-Based Summarizing and Ranking from Customer Reviews

Authors: Dim En Nyaung, Thin Lai Lai Thein

Abstract:

With the rapid growth of the Internet, web opinion sources emerge dynamically and are useful to both potential customers and product manufacturers for prediction and decision purposes. These sources are user-generated content written in natural language as unstructured free text. Opinion mining techniques have therefore become popular for automatically processing customer reviews to extract product features and the user opinions expressed about them. Since customer reviews may contain both opinionated and factual sentences, a supervised machine learning technique is applied for subjectivity classification to improve mining performance. In this paper, we focus on the task of opinion summarization. Product feature and opinion extraction is critical to opinion summarization because its effectiveness significantly affects the identification of semantic relationships. The polarity and numeric score of all features are determined with the SentiWordNet lexicon. The problem of opinion summarization concerns how to relate opinion words to a given feature. A probabilistic supervised learning model will improve the result, making it more flexible and effective.

Keywords: opinion mining, opinion summarization, sentiment analysis, text mining

Procedia PDF Downloads 321
27167 Product Features Extraction from Opinions According to Time

Authors: Kamal Amarouche, Houda Benbrahim, Ismail Kassou

Abstract:

Nowadays, e-commerce shopping websites have experienced noticeable growth and have gained consumers’ trust. After purchasing a product, many consumers share comments in which opinions about the product are usually embedded. Research on the automatic management of opinions, which gives suggestions to potential consumers and portrays an image of the product to manufacturers, has been growing recently. Just after a product is launched, the reviews generated around it usually contain little helpful information and only generic opinions about it (e.g. telephone: great phone...), since the product is still in its launch phase. Over time, the product ages and consumers come to perceive the advantages and disadvantages of each specific product feature. They then generate comments that contain their sentiments about these features. In this paper, we present an unsupervised method to extract the product features hidden in opinions that influence purchase. The method combines Time Weighting (TW), which depends on the time at which opinions were expressed, with Term Frequency-Inverse Document Frequency (TF-IDF). We conduct several experiments using two different datasets about cell phones and hotels. The results show the effectiveness of our automatic feature extraction, as well as its domain-independent character.
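The combination of time weighting with TF-IDF could look roughly like the sketch below. The abstract does not give the TW formula, so the linear ramp used here (reviews written later after launch count more) is purely an assumption, as are the function name, the tokenized toy reviews, and the way the per-review scores are summed.

```python
import math
from collections import Counter

def time_weighted_tfidf(reviews):
    """Score terms by TW * TF * IDF, summed over reviews.

    reviews: list of (days_since_launch, tokens) pairs.
    The time weight is an ASSUMED linear ramp, days / max_days;
    the paper's actual TW definition is not stated in the abstract.
    """
    n_docs = len(reviews)
    max_days = max(days for days, _ in reviews) or 1  # avoid division by zero

    # Document frequency: number of reviews containing each term.
    doc_freq = Counter()
    for _, tokens in reviews:
        doc_freq.update(set(tokens))

    scores = Counter()
    for days, tokens in reviews:
        time_weight = days / max_days
        term_counts = Counter(tokens)
        for term, count in term_counts.items():
            tf = count / len(tokens)
            idf = math.log(n_docs / doc_freq[term])
            scores[term] += time_weight * tf * idf
    return scores

# Hypothetical reviews: an early generic one and two later feature-specific ones.
reviews = [
    (1, ["great", "phone"]),
    (30, ["battery", "drains", "fast"]),
    (60, ["battery", "life", "poor", "screen", "great"]),
]
scores = time_weighted_tfidf(reviews)
for term, score in scores.most_common(3):
    print(term, round(score, 4))
```

Under this assumed weighting, a term such as "phone" that appears only in the launch-phase review is ranked below feature terms such as "battery" that appear in later reviews.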

Keywords: opinion mining, product feature extraction, sentiment analysis, SentiWordNet

Procedia PDF Downloads 395
27166 Investigating the UAE Residential Valuation System: A Framework for Analysis

Authors: Simon Huston, Ebraheim Lahbash, Ali Parsa

Abstract:

The development of the United Arab Emirates (UAE) into a regional trade, tourism, finance and logistics hub has transformed its real estate markets. However, speculative activity and price volatility remain concerns. UAE residential market values (MV) are exposed to fluctuations in capital flows and migration, which in turn are affected by geopolitical uncertainty, oil price volatility, and global investment market sentiment. Internally, a complex interplay between administrative boundaries, land tenure, building quality and evolving location characteristics fragments UAE residential property markets. In short, the UAE Residential Valuation System (UAE-RVS) confronts multiple challenges to collect, filter and analyze relevant information in complex and dynamic spatial and capital markets. A robust RVS can mitigate the risk of unhelpful volatility, speculative excess or investment mistakes. The research outlines the institutional, ontological, dynamic, and epistemological issues at play. We highlight the importance of system capabilities, valuation standard salience and stakeholder trust.

Keywords: valuation, property rights, information, institutions, trust, salience

Procedia PDF Downloads 367
27165 War Heritage: Different Perceptions of the Dominant Discourse among Visitors to the “Adem Jashari” Memorial Complex in Prekaz

Authors: Zana Llonçari Osmani, Nita Llonçari

Abstract:

In Kosovo, public rhetoric and popular sentiment position the War of 1998-99 (the war) as central to the formation of contemporary Kosovo's national identity. This period was marked by the forced massive displacement of Kosovo Albanians, the destruction of entire settlements, the loss of family members, and the profound emotional trauma experienced by civilians, particularly those who actively participated in the war as members of the Kosovo Liberation Army (KLA). Amidst these profound experiences, the Prekaz Massacre (The Massacre) is widely regarded as the defining event that preceded the final struggles of 1999 and the long-awaited attainment of independence. This study aims to explore how different visitors perceive the dominant discourse at The Memorial, a site dedicated to commemorating the Prekaz Massacre, and to identify the factors that influence their perceptions. The research employs a comprehensive mixed-method approach, combining online surveys, critical discourse analysis of visitor impressions, and content analysis of media representations. The findings of the study highlight the significant role played by original material remains in shaping visitor perceptions of The Memorial in comparison to the curated symbols and figurative representations interspersed throughout the landscape. While the design elements and physical layout of the memorial undeniably hold significance in conveying the memoryscape, there are notable shortcomings in enhancing the overall visitor experience. Visitors are still primarily influenced by the tangible remnants of the war, suggesting that there is room for improvement in how design elements can more effectively contribute to the memorial's narrative and the collective memory of the Prekaz Massacre.

Keywords: critical discourse analysis, memorialisation, national discourse, public rhetoric, war tourism

Procedia PDF Downloads 76
27164 Early Depression Detection for Young Adults with a Psychiatric and AI Interdisciplinary Multimodal Framework

Authors: Raymond Xu, Ashley Hua, Andrew Wang, Yuru Lin

Abstract:

During COVID-19, the depression rate has increased dramatically. Young adults are most vulnerable to the mental health effects of the pandemic. Lower-income families are more likely to be diagnosed with depression than the general population but have less access to clinics. This research aims to achieve early depression detection at low cost, large scale, and high accuracy through an interdisciplinary approach that incorporates clinical practices defined by the American Psychiatric Association (APA) as well as a multimodal AI framework. The proposed approach detects the nine depression symptoms with Natural Language Processing sentiment analysis and a symptom-based lexicon uniquely designed for young adults. The experiments were conducted on multimedia survey results from adolescents and young adults and on unbiased Twitter communications. The results were further aggregated with facial emotional cues analyzed by a Convolutional Neural Network on the multimedia survey videos. Five experiments, each conducted on 10k data entries, reached consistent results with an average accuracy of 88.31%, higher than existing natural language analysis models. This approach can reach 300+ million daily active Twitter users and is highly accessible to low-income populations. It can promote early depression detection, raise awareness among adolescents and young adults, and reveal complementary cues to assist clinical depression diagnosis.

Keywords: artificial intelligence, COVID-19, depression detection, psychiatric disorder

Procedia PDF Downloads 122
27163 Polarity Classification of Social Media Comments in Turkish

Authors: Migena Ceyhan, Zeynep Orhan, Dimitrios Karras

Abstract:

People in modern societies are continuously sharing their experiences, emotions, and thoughts in different areas of life. The information reaches almost everyone in real time and can have an important impact in shaping people’s way of living. This phenomenon is well recognized and exploited by market representatives trying to profit from this medium. Given the abundance of information, people and organizations are looking for efficient tools that filter the vast amounts of data into important information that is ready to analyze. This paper is a modest contribution to this field, describing the process of automatically classifying social media comments in the Turkish language as positive or negative. Once data is gathered and preprocessed, feature sets of selected single words or groups of words are built according to the characteristics of the language used in the texts. These features are later used to train and test a system with different machine learning algorithms (Naïve Bayes, Sequential Minimal Optimization, J48, and Bayesian Linear Regression). The resulting high accuracies can be important feedback for decision-makers seeking to improve their business strategies.

Keywords: feature selection, machine learning, natural language processing, sentiment analysis, social media reviews

Procedia PDF Downloads 142
27162 HBTOnto: An Ontology Model for Analyzing Human Behavior Trajectories

Authors: Heba M. Wagih, Hoda M. O. Mokhtar

Abstract:

Social networks have recently played a significant role in both scientific and social communities. The growing adoption of social network applications has made them a relevant source of information. Due to this popularity, several research trends have emerged to serve the huge volume of users, including Location-Based Social Networks (LBSN), Recommendation Systems, Sentiment Analysis Applications, and many others. LBSN applications are among the most in-demand applications. They focus not only on analyzing the spatiotemporal positions in a given raw trajectory but also on understanding the semantics behind the dynamics of the moving object. LBSNs are a possible means of predicting human mobility based on users’ social ties as well as their spatial preferences. LBSNs rely on the efficient representation of users’ trajectories. Hence, traditional raw trajectory information is no longer sufficient. In our research, we focus on studying human behavior trajectories, which are the major pillar of location recommendation systems. In this paper, we propose ontology design patterns with their underlying description logics to efficiently annotate human behavior trajectories.

Keywords: human behavior trajectory, location-based social network, ontology, social network

Procedia PDF Downloads 442
27161 Multi-Class Text Classification Using Ensembles of Classifiers

Authors: Syed Basit Ali Shah Bukhari, Yan Qiang, Saad Abdul Rauf, Syed Saqlaina Bukhari

Abstract:

Text classification is the task of assigning a given text to the appropriate category from a given set of categories. It is vital to use a proper set of pre-processing, feature selection, and classification techniques to achieve this. In this paper, we use different ensemble techniques along with varied feature selection parameters to observe the change in overall accuracy and in individual class-based measures, including the precision of each text category. After subjecting our data to pre-processing and feature selection techniques, individual classifiers were tested first, and classifiers were then combined to form ensembles to increase their accuracy. Later, we also studied the impact of decreasing the number of classification categories on overall accuracy. Text classification is widely used in sentiment analysis on social media sites such as Twitter to understand people’s opinions about a cause, and it is also used to analyze customers’ reviews of products or services. Opinion mining is a vital task in data mining, and text categorization is the backbone of opinion mining.
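The idea of combining individual classifiers into an ensemble can be illustrated with hard (majority) voting, one simple combination rule. The abstract does not say which combination rule the authors used (the keywords mention bagging and AdaBoost), so this is only an illustrative sketch. The three rule-based "classifiers" and the category names are hypothetical stand-ins for trained models.

```python
from collections import Counter

def majority_vote(classifiers, text):
    """Return the label predicted by the most classifiers (hard voting).

    Ties are broken in favour of the label that was voted for first.
    """
    votes = [classify(text) for classify in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical toy classifiers; real ones would be trained models.
def by_keyword(text):
    return "sports" if "match" in text.lower() else "politics"

def by_length(text):
    return "sports" if len(text.split()) < 6 else "politics"

def by_digit(text):
    return "sports" if any(ch.isdigit() for ch in text) else "politics"

ensemble = [by_keyword, by_length, by_digit]
print(majority_vote(ensemble, "The match ended 2 to 1"))
print(majority_vote(ensemble, "Parliament will debate the new budget proposal"))
```

In the first example one toy classifier disagrees with the other two, and the ensemble still returns the majority label, which is the intuition behind using ensembles to raise accuracy over individual classifiers.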

Keywords: Natural Language Processing, Ensemble Classifier, Bagging Classifier, AdaBoost

Procedia PDF Downloads 222
27160 Effects of Artificial Intelligence and Machine Learning on Social Media for Health Organizations

Authors: Ricky Leung

Abstract:

Artificial intelligence (AI) and machine learning (ML) have revolutionized the way health organizations approach social media. The sheer volume of data generated through social media can be overwhelming, but AI and ML can help organizations effectively manage this information to improve the health and well-being of individuals and communities. One way AI can be used to enhance social media in health organizations is through sentiment analysis. This involves analyzing the emotions expressed in social media posts to better understand public opinion and respond accordingly. This can help organizations gauge the impact of their campaigns, track the spread of misinformation, and improve communication with the public. While social media is a useful tool, researchers and practitioners have expressed fear that it will be used for the spread of misinformation, which can have serious consequences for public health. Health organizations must work to ensure that AI systems are transparent, trustworthy, and unbiased so they can help minimize the spread of misinformation. In conclusion, AI and ML have the potential to greatly enhance the use of social media in health organizations. These technologies can help organizations effectively manage large amounts of data and understand stakeholders' sentiments. However, it is important to carefully consider the potential consequences and ensure that these systems are carefully designed to minimize the spread of misinformation.

Keywords: AI, ML, social media, health organizations

Procedia PDF Downloads 80
27159 Text Mining of Veterinary Forums for Epidemiological Surveillance Supplementation

Authors: Samuel Munaf, Kevin Swingler, Franz Brülisauer, Anthony O’Hare, George Gunn, Aaron Reeves

Abstract:

Web scraping and text mining are popular computer science methods deployed by public health researchers to augment traditional epidemiological surveillance. However, within veterinary disease surveillance, such techniques are still in the early stages of development and have not yet been fully utilised. This study presents an exploration into the utility of incorporating internet-based data to better understand the smallholder farming communities within Scotland by using online text extraction and the subsequent mining of this data. Web scraping of the livestock fora was conducted in conjunction with text mining of the data in search of common themes, words, and topics found within the text. Results from bi-grams and topic modelling uncover four main topics of interest within the data pertaining to aspects of livestock husbandry: feeding, breeding, slaughter, and disposal. These topics were found amongst both the poultry and pig sub-forums. Topic modeling appears to be a useful method of unsupervised classification regarding this form of data, as it has produced clusters that relate to biosecurity and animal welfare. Internet data can be a very effective tool in aiding traditional veterinary surveillance methods, but the requirement for human validation of said data is crucial. This opens avenues of research via the incorporation of other dynamic social media data, namely Twitter and Facebook/Meta, in addition to time series analysis to highlight temporal patterns.

Keywords: veterinary epidemiology, disease surveillance, infodemiology, infoveillance, smallholding, social media, web scraping, sentiment analysis, geolocation, text mining, NLP

Procedia PDF Downloads 85
27158 From Shock to Self-Determination: Igbo Responses to the 1966 Pogrom and the Rise of Biafra Nationalism

Authors: Nnaemeka Enemchukwu

Abstract:

In modern-day Nigeria, the spirit of Biafra, the defunct secessionist state of former Eastern Nigeria, endures. While some attempt to downplay the historical factors that led to its creation, this paper aims to demonstrate that the 1966 pogroms in Nigeria, which claimed the lives of over 30,000 Igbo people, shattered their faith in the nation's ability to provide security and acceptance. This loss of faith led to a mass exodus from various regions of the country back to their homeland in Eastern Nigeria. Utilizing primary sources such as interviews and archival reports, and secondary sources like books, journals, and websites, this paper will argue that the trauma and terror of the 1966 massacres were the primary drivers of secessionist sentiment and self-determination among the Igbo people, ultimately leading to the declaration of Biafra. By drawing parallels with other historical incidents across the globe, this paper will establish the theoretical connection between shocking events, identity questioning among traumatized groups, and the subsequent rise of nationalistic sentiments seeking to ensure group preservation. To achieve its objective, this paper will employ descriptive, narrative, and chronological methods of analysis to present and discuss its findings.

Keywords: Igbo, pogrom, shock, trauma, nationalism, Biafra

Procedia PDF Downloads 60
27157 An Approach for Pattern Recognition and Prediction of Information Diffusion Model on Twitter

Authors: Amartya Hatua, Trung Nguyen, Andrew Sung

Abstract:

In this paper, we study the information diffusion process on Twitter as a multivariate time series problem. Our model concerns three measures (volume, network influence, and sentiment of tweets) based on 10 features, and we collected 27 million tweets to build our information diffusion time series dataset for analysis. Then, different time series clustering techniques with Dynamic Time Warping (DTW) distance were used to identify different patterns of information diffusion. Finally, we built information diffusion prediction models for new hashtags, which comprise two phases: the first phase recognizes the pattern using k-NN with DTW distance, and the second phase builds the forecasting model using the traditional Autoregressive Integrated Moving Average (ARIMA) model and the non-linear Long Short-Term Memory (LSTM) recurrent neural network. Preliminary results of performance evaluation between different forecasting models show that LSTM with clustering information notably outperforms other models. Therefore, our approach can be applied in real-world applications to analyze and predict the information diffusion characteristics of selected topics or memes (hashtags) on Twitter.
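The pattern-recognition step (k-NN with DTW distance) can be sketched as follows for univariate series with k = 1. The paper works with multivariate series and does not state its value of k or its local cost function, so the absolute-difference cost, the function names, and the toy labelled patterns below are assumptions.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two numeric sequences,
    using absolute difference as the local cost (an assumed choice)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matched the previous b
                cost[i][j - 1],      # b[j-1] also matched the previous a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]

def nearest_pattern(series, labelled_series):
    """1-nearest-neighbour classification under DTW distance."""
    best = min(labelled_series, key=lambda item: dtw_distance(series, item[0]))
    return best[1]

# Hypothetical tweet-volume patterns for known hashtags.
known = [
    ([0, 5, 1, 0], "spike"),
    ([0, 1, 2, 3], "steady growth"),
]
new_hashtag = [0, 0, 4, 1, 0]
print(nearest_pattern(new_hashtag, known))
```

Because DTW allows one point to align with several points in the other series, the new series is matched to the "spike" pattern even though the two differ in length and the peak is shifted.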

Keywords: ARIMA, DTW, information diffusion, LSTM, RNN, time series clustering, time series forecasting, Twitter

Procedia PDF Downloads 385
27156 Emotion Detection in Twitter Messages Using Combination of Long Short-Term Memory and Convolutional Deep Neural Networks

Authors: Bahareh Golchin, Nooshin Riahi

Abstract:

One of the most significant issues to have received attention in recent years is recognizing the sentiments and emotions in social media texts. The analysis of sentiments and emotions is intended to recognize conceptual information such as the opinions, feelings, attitudes, and emotions of people towards products, services, organizations, people, topics, events, and features in written text. This indicates the size of the problem space. In the real world, businesses and organizations are always looking for tools to gather people's ideas, emotions, and orientations regarding their products, services, or related events. This article uses the Twitter social network, one of the most popular social networks with about 420 million active users, to extract data. Using this social network, users can share their information and opinions about personal issues, policies, products, events, etc. Because its data is readily available, it is well suited to the classification of emotional states. In this study, supervised learning and deep neural network algorithms are used to classify the emotional states of Twitter users. Using deep learning methods to increase the learning capacity of the model is an advantage given the large amount of available data. Tweets collected on various topics are classified into four classes using a combination of two Bidirectional Long Short-Term Memory networks and a Convolutional network. The results, with an average accuracy of 93%, show that the proposed framework performs well and improves accuracy compared to previous work.

Keywords: emotion classification, sentiment analysis, social networks, deep neural networks

Procedia PDF Downloads 131
27155 Interpretation of the Russia-Ukraine 2022 War via N-Gram Analysis

Authors: Elcin Timur Cakmak, Ayse Oguzlar

Abstract:

This study presents the results of analysing tweets sent by Twitter users about the Russia-Ukraine war using bigram and trigram methods. On February 24, 2022, Russian President Vladimir Putin declared a military operation against Ukraine, and all eyes turned to this war. Many people living in Russia and Ukraine reacted to and protested the war and expressed their deep concern, as they felt the safety of their families and their futures were at stake. Most people, especially those living in Russia and Ukraine, express their views on the war in different ways, the most popular of which is social media. Many people prefer to convey their feelings using Twitter, one of the most frequently used social media tools. Since the beginning of the war, thousands of tweets about the war have been posted on Twitter from many countries of the world. These tweets are extracted through the Twitter API and analysed with the Python programming language. The aim of the study is to find the word sequences in these tweets with the n-gram method, which is widely used in computational linguistics and natural language processing. The tweets used in the study are in English. The data set consists of data obtained from Twitter between February 24, 2022, and April 24, 2022. The tweets obtained from Twitter using the #ukraine, #russia, #war, #putin, #zelensky hashtags together were captured as raw data, and the tweets remaining after cleaning in the preprocessing stage were included in the analysis stage. In the data analysis part, sentiments are identified to show what messages people send about the war on Twitter. Negative messages make up the majority of all tweets, at 63.6%. Furthermore, the most frequently used bigram and trigram word groups are found. The most frequently used word groups are “he, is”, “I, do”, “I, am” for bigrams and “I, do, not”, “I, am, not”, “I, can, not” for trigrams. In the machine learning phase, classification accuracy is measured with the Classification and Regression Trees (CART) and Naïve Bayes (NB) algorithms. The algorithms are used separately for bigrams and trigrams. For bigrams, the NB algorithm achieved the highest accuracy and F-measure values, and the CART algorithm the highest precision and recall values. For trigrams, the CART algorithm achieved the highest accuracy, precision, and F-measure values, and NB the highest recall.
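Counting the most frequent bigrams and trigrams, as described in the abstract, can be sketched in Python with the standard library alone. The whitespace tokenization, the function names, and the three sample tweets are assumptions for illustration; the study's actual preprocessing pipeline is not specified in the abstract.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return list(zip(*(tokens[i:] for i in range(n))))

def count_ngrams(tweets, n):
    """Count n-grams over lowercased, whitespace-tokenized tweets.

    N-grams are built per tweet, so none spans two different tweets.
    """
    counts = Counter()
    for tweet in tweets:
        counts.update(ngrams(tweet.lower().split(), n))
    return counts

# Hypothetical, already-cleaned tweets.
tweets = [
    "I do not want war",
    "I do not understand",
    "he is wrong",
]
bigram_counts = count_ngrams(tweets, 2)
trigram_counts = count_ngrams(tweets, 3)
print(bigram_counts.most_common(3))
print(trigram_counts.most_common(1))
```

The resulting counts could then serve as features for classifiers such as CART or Naïve Bayes, which is the role n-grams play in the machine learning phase described above.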

Keywords: classification algorithms, machine learning, sentiment analysis, Twitter

Procedia PDF Downloads 66
27154 Mitigating the Unwillingness of e-Forums Members to Engage in Information Exchange

Authors: Dora Triki, Irena Vida, Claude Obadia

Abstract:

Social networks such as e-Forums or dating sites often face the reluctance of key members to participate. Relying on conation theory, this study investigates this phenomenon and proposes solutions to mitigate the issue. We show that highly experienced e-Forum members refuse to share business information in peer-to-peer information exchange forums. However, forum managers can mitigate this behavior by developing a sentiment of belongingness to the network. Furthermore, by selecting only elite forum participants with ample experience, they can reduce the reluctance of key information providers to engage in information exchange. Our hypotheses are tested with PLS structural equation modeling using survey data from members of a French e-Forum dedicated to the exchange of business information about exporting.

Keywords: conation, e-Forum, information exchange, members participation

Procedia PDF Downloads 148
27153 Extraction of Compound Words in Malay Sentences Using Linguistic and Statistical Approaches

Authors: Zamri Abu Bakar Zamri, Normaly Kamal Ismail Normaly, Mohd Izani Mohamed Rawi Izani

Abstract:

Malay noun compounds are phrases that consist of two or more nouns. The key characteristic of noun compounds is their frequent occurrence within text. Therefore, extracting these noun compounds is essential for several research domains such as Information Retrieval, Sentiment Analysis, and Question Answering. Many approaches have been proposed for extracting Malay noun compounds using linguistic and statistical methods. Most existing methods have concentrated on the extraction of bi-gram noun+noun compounds. However, extracting noun+verb, noun+adjective, and noun+prepositional compounds is challenging due to the difficulty of selecting an appropriate method with effective results. Thus, there is still room for improvement in the effectiveness of compound word extraction. Therefore, this study proposes a combination of a linguistic approach and statistical measures to enhance the extraction of compound words. Several preprocessing steps are involved, including normalization, tokenization, and stemming. The linguistic approach used in this study is Part-of-Speech (POS) tagging. In addition, a new linguistic pattern for named entities, based on a list of Malay named entities, is used to enhance noun compound recognition. The proposed statistical measures consist of NC-value, NTC-value, and NLC-value.

Keywords: Compound Word, Noun Compound, Linguistic Approach, Statistical Approach

Procedia PDF Downloads 338
27152 Recognizing Customer Preferences Using Review Documents: A Hybrid Text and Data Mining Approach

Authors: Oshin Anand, Atanu Rakshit

Abstract:

The rapid growth of e-commerce ventures makes this area a prominent research stream. Besides several quantified parameters, the textual content of reviews is a storehouse of information that can educate companies and help them earn profit. This study is an attempt in this direction. The article attempts to categorize data based on a computed metric that quantifies the influencing capacity of reviews, yielding two categories: high- and low-influence reviews. Further, each of these documents is studied to derive several product feature categories. Each of these categories, along with the computed metric, is converted to linguistic identifiers and used in an association mining model. The article makes a novel attempt to combine feature extraction with a quantified metric to categorize review text and finally provide frequent patterns that depict customer preferences. Frequent mentions in highly influential reviews depict customer likes or preferred features in the product, whereas prominent patterns in low-influence reviews highlight what is not important for customers. This is achieved using a hybrid approach of text mining for feature and term extraction, sentiment analysis, a multicriteria decision-making technique and an association mining model.
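The frequent-pattern step can be sketched as follows, assuming each review has already been reduced to a set of linguistic identifiers. The identifier names such as `influence_high` and `battery_positive` are hypothetical. The sketch counts itemsets by brute-force enumeration rather than with a pruned algorithm such as Apriori, and it is not the authors' association mining model.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return itemsets (as sorted tuples) whose support is at least min_support.

    Support is the fraction of transactions that contain the whole itemset.
    """
    n = len(transactions)
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    return {combo: c / n for combo, c in counts.items() if c / n >= min_support}


# Hypothetical identifiers: influence category plus feature-level sentiment.
reviews = [
    ["influence_high", "battery_positive", "camera_positive"],
    ["influence_high", "battery_positive"],
    ["influence_low", "packaging_negative"],
    ["influence_high", "battery_positive", "price_negative"],
]
patterns = frequent_itemsets(reviews, min_support=0.5)
```

In this toy data the pair `("battery_positive", "influence_high")` appears in three of four reviews, which is the kind of pattern the abstract reads as a preferred feature.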

Keywords: association mining, customer preference, frequent pattern, online reviews, text mining

Procedia PDF Downloads 382
27151 Infodemic Detection on Social Media with a Multi-Dimensional Deep Learning Framework

Authors: Raymond Xu, Cindy Jingru Wang

Abstract:

Social media has become a globally connected and influential platform. Social media data, such as tweets, can help predict the spread of pandemics and provide individuals and healthcare providers with early warnings. Public psychological reactions and opinions can be efficiently monitored by AI models that track the progression of dominant topics on Twitter. However, statistics show that as the coronavirus spreads, so does an infodemic of misinformation due to pandemic-related factors such as unemployment and lockdowns. Social media algorithms are often biased toward outrage by promoting content that people have an emotional reaction to and are likely to engage with. This can influence users’ attitudes and cause confusion. Therefore, social media is a double-edged sword. Combating fake news and biased content has become an essential task. This research analyzes the variety of methods used for fake news detection, covering random forest, logistic regression, support vector machines, decision tree, naive Bayes, BoW, TF-IDF, LDA, CNN, RNN, LSTM, DeepFake, and hierarchical attention network. The performance of each method is analyzed. Based on these models’ achievements and limitations, a multi-dimensional AI framework is proposed to achieve higher accuracy in infodemic detection, especially for pandemic-related news. The model is trained on contextual content, images, and news metadata.

Keywords: artificial intelligence, fake news detection, infodemic detection, image recognition, sentiment analysis

Procedia PDF Downloads 233
27150 Improving Subjective Bias Detection Using Bidirectional Encoder Representations from Transformers and Bidirectional Long Short-Term Memory

Authors: Ebipatei Victoria Tunyan, T. A. Cao, Cheol Young Ock

Abstract:

Detecting subjectively biased statements is a vital task. This kind of bias, when present in text or other forms of information dissemination media such as news, social media, scientific texts, and encyclopedias, can weaken trust in the information and stir conflicts amongst consumers. Subjective bias detection is also critical for many Natural Language Processing (NLP) tasks like sentiment analysis, opinion identification, and bias neutralization. Having a system that can adequately detect subjectivity in text will boost research in the above-mentioned areas significantly. It can also come in handy for platforms like Wikipedia, where the use of neutral language is important. The goal of this work is to identify subjectively biased language in text at the sentence level. With machine learning, we can solve complex AI problems, making it a good fit for the problem of subjective bias detection. A key step in this approach is to train a classifier based on BERT (Bidirectional Encoder Representations from Transformers) as the upstream model. BERT by itself can be used as a classifier; however, in this study, we use BERT as a data preprocessor as well as an embedding generator for a Bi-LSTM (Bidirectional Long Short-Term Memory) network with an attention mechanism. This approach produces a deeper and better classifier. We evaluate the effectiveness of our model using the Wiki Neutrality Corpus (WNC), which was compiled from Wikipedia edits that removed various biased instances from sentences, as a benchmark dataset, on which we also compare our model to existing approaches. Experimental analysis indicates improved performance, as our model achieved state-of-the-art accuracy in detecting subjective bias. This study focuses on the English language, but the model can be fine-tuned to accommodate other languages.

Keywords: subjective bias detection, machine learning, BERT–BiLSTM–Attention, text classification, natural language processing

Procedia PDF Downloads 119
27149 Evotrader: Bitcoin Trading Using Evolutionary Algorithms on Technical Analysis and Social Sentiment Data

Authors: Martin Pellon Consunji

Abstract:

Due to the rise in popularity of Bitcoin and other crypto assets as a store of wealth and speculative investment, there is an ever-growing demand for automated trading tools, such as bots, in order to gain an advantage over the market. Traditionally, trading in the stock market was done by professionals with years of training who understood patterns and exploited market opportunities in order to gain a profit. However, nowadays a larger portion of market participants are at minimum aided by market-data processing bots, which can generally generate more stable signals than the average human trader. The rise in trading bot usage can be attributed to the inherent advantages that bots have over humans in terms of processing large amounts of data, lack of emotions of fear or greed, and predicting market prices using past data and artificial intelligence; hence, a growing number of approaches have been brought forward to tackle this task. However, the general limitation of these approaches still comes down to the fact that limited historical data doesn’t always determine the future, and that a lot of market participants are still human emotion-driven traders. Moreover, developing markets such as those of the cryptocurrency space have even less historical data to interpret than most other well-established markets. Due to this, some human traders have gone back to the tried-and-tested traditional technical analysis tools for exploiting market patterns and simplifying the broader spectrum of data that is involved in making market predictions. This paper proposes a method which uses neuro-evolution techniques on both sentiment data and the more traditionally human-consumed technical analysis data in order to gain a more accurate forecast of future market behavior and account for the way both automated bots and human traders affect the market prices of Bitcoin and other cryptocurrencies.
This study’s approach uses evolutionary algorithms to automatically develop increasingly improved populations of bots which, by using the latest inflows of market analysis and sentiment data, evolve to efficiently predict future market price movements. The effectiveness of the approach is validated by testing the system in a simulated historical trading scenario, a real Bitcoin market live trading scenario, and testing its robustness in other cryptocurrency and stock market scenarios. Experimental results during a 30-day period show that this method outperformed the buy and hold strategy by over 260% in terms of net profits, even when taking into consideration standard trading fees.
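The idea of evolving a population of trading bots can be sketched with a very small evolutionary loop. Everything here is an assumption made for illustration: each "bot" is just a weight vector over two hypothetical inputs (a technical indicator and a sentiment score), fitness is profit on a toy return series, and selection is simple truncation with Gaussian mutation. It is a linear stand-in, not the neuro-evolution system described in the abstract, and it ignores trading fees.

```python
import random


def simulate_profit(weights, features, returns):
    """Toy fitness: hold the asset on steps where the weighted signal is positive."""
    profit = 0.0
    for x, r in zip(features, returns):
        signal = sum(w * xi for w, xi in zip(weights, x))
        if signal > 0:
            profit += r
    return profit


def evolve(features, returns, pop_size=20, generations=30, sigma=0.3, seed=0):
    """Evolve weight vectors; returns (best_weights, best_fitness_per_generation)."""
    rng = random.Random(seed)
    n_inputs = len(features[0])
    population = [[rng.gauss(0, 1) for _ in range(n_inputs)] for _ in range(pop_size)]

    def fitness(w):
        return simulate_profit(w, features, returns)

    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        # Truncation selection: the top half survives unchanged (elitism).
        parents = ranked[: pop_size // 2]
        # Children are mutated copies of randomly chosen parents.
        children = [
            [w + rng.gauss(0, sigma) for w in rng.choice(parents)]
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    best = max(population, key=fitness)
    return best, history


# Hypothetical inputs per time step: [technical indicator, sentiment score].
features = [[0.5, 0.2], [-0.3, 0.1], [0.1, -0.4], [0.4, 0.3]]
returns = [0.02, -0.01, -0.03, 0.01]
best_weights, history = evolve(features, returns)
```

Because the top half of each generation is carried over unchanged, the best fitness recorded in `history` never decreases from one generation to the next.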

Keywords: neuro-evolution, Bitcoin, trading bots, artificial neural networks, technical analysis, evolutionary algorithms

Procedia PDF Downloads 113
27148 Convolutional Neural Networks-Optimized Text Recognition with Binary Embeddings for Arabic Expiry Date Recognition

Authors: Mohamed Lotfy, Ghada Soliman

Abstract:

Recognizing Arabic dot-matrix digits is a challenging problem due to the unique characteristics of dot-matrix fonts, such as irregular dot spacing and varying dot sizes. This paper presents an approach for recognizing Arabic digits printed in dot-matrix format. The proposed model is based on Convolutional Neural Networks (CNN) that take the dot matrix as input and generate embeddings that are rounded to generate binary representations of the digits. The binary embeddings are then used to perform Optical Character Recognition (OCR) on the digit images. To overcome the challenge of the limited availability of dotted Arabic expiration date images, we developed a TrueType Font (TTF) for generating synthetic images of Arabic dot-matrix characters. The model was trained on a synthetic dataset of 3287 images and tested on 658 synthetic images, representing realistic expiration dates from 2019 to 2027 in the format yyyy/mm/dd. Our model achieved an accuracy of 98.94% on expiry date recognition in Arabic dot-matrix format, using fewer parameters and less computational resources than traditional CNN-based models. By investigating and presenting our findings comprehensively, we aim to contribute substantially to the field of OCR and pave the way for advancements in Arabic dot-matrix character recognition. Our proposed approach is not limited to Arabic dot-matrix digit recognition but can also be extended to text recognition tasks, such as text classification and sentiment analysis.
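The step of rounding an embedding into a binary representation can be illustrated with a short sketch. The abstract does not say how binary codes map to digits, so the 4-bit codebook below (the plain binary encoding of each digit), the 0.5 threshold and the nearest-code decoding by Hamming distance are all assumptions made for illustration, with the CNN replaced by hand-written embedding values.

```python
def binarize(embedding, threshold=0.5):
    """Round each embedding value to a bit (1 if value >= threshold, else 0)."""
    return tuple(1 if v >= threshold else 0 for v in embedding)


# Assumed codebook: each digit 0-9 is represented by its 4-bit binary encoding.
CODEBOOK = {d: tuple(int(b) for b in format(d, "04b")) for d in range(10)}


def hamming(a, b):
    """Number of positions where two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def decode_digit(embedding):
    """Map an embedding to the digit whose code is closest in Hamming distance."""
    bits = binarize(embedding)
    return min(CODEBOOK, key=lambda d: hamming(CODEBOOK[d], bits))


# Hypothetical network outputs in [0, 1] for two digit images.
digit_a = decode_digit([0.1, 0.9, 0.2, 0.8])  # bits 0101
digit_b = decode_digit([0.9, 0.1, 0.1, 0.6])  # bits 1001
```

Decoding by nearest code rather than exact match means a single flipped bit can still resolve to a valid digit, although with such a dense 4-bit code it may resolve to the wrong one.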

Keywords: computer vision, pattern recognition, optical character recognition, deep learning

Procedia PDF Downloads 76
27147 Pull String to Stop: Public Utility Vehicle Modernization Program

Authors: Frederick Kobe O. Obar, Preity B. Quinzon, Trisha B. Tumbokon, Mario Joshua D. Marron, Kenichi Katsuo Kichiro A. Rimorin

Abstract:

The Public Utility Vehicle Modernization Program (PUVMP) is a program meant to reform the current state of the Philippines’ public transportation sector. This study determined the impact of the Public Utility Vehicle Modernization Program on San Fernando City, La Union's jeepney drivers, interviewing six individuals, three with traditional vehicles and three with modernized units. This study used a descriptive qualitative research design and employed purposive sampling to select the six participants suited for the study, who were then subjected to a semi-structured face-to-face interview. The gathered data was then analyzed through thematic analysis. The findings highlighted evidence that the jeepney drivers experienced abrupt and prevailing changes in their routine and in their everyday work. This study concludes that while the sentiment of the program was appreciated, it has changed the environment for jeepney drivers drastically, provoking many reactions. These changes have, of course, shifted the daily lives of the jeepney drivers significantly, but through adaptability, they found ways. Recommendations include flexible compliance policies, educational initiatives, and support for drivers, providing valuable insights for informed decision-making in the ongoing transportation modernization discussion. This study concluded that while the drivers are not opposed to reform, they are not entirely in approval of the current effects of the program as it is being implemented in their local area.

Keywords: transport reform, transport modernization, public transport, jeepney drivers, PUVMP, urban planning, public utility vehicles

Procedia PDF Downloads 50
27146 Cryptocurrency Realities: Insights from Social and Economic Psychology

Authors: Sarah Marie

Abstract:

In today's dynamic financial landscape, cryptocurrencies represent a paradigm shift characterized by innovation and intense debate. This study probes into their transformative potential and the challenges they present, offering a balanced perspective that recognizes both their promise and pitfalls. Emulating the engaging style of a TED Talk, this research goes beyond academic analysis, serving as a critical bridge to reconcile the perspectives of cryptocurrency skeptics and enthusiasts, fostering a well-informed dialogue. The study employs a mixed-method approach, analyzing current trends, regulatory landscapes, and public perceptions in the cryptocurrency domain. It distinguishes genuine innovators in this field from ostentatious opportunists, echoing the sentiment that real innovation should be separated from mere showmanship. If one is unfamiliar with who is being referenced, they can likely spot them leaning against their Lamborghinis outside "Crypto" conventions, looking greasy. Major findings reveal a complex scenario dominated by regulatory uncertainties, market volatility, and security issues, emphasizing the need for a coherent regulatory framework that balances innovation with risk management and sustainable practices. The study underscores the importance of transparency and consumer protection in fostering responsible growth within the cryptocurrency ecosystem. In conclusion, the research advocates for education, innovation, and ethical governance in the realm of cryptocurrencies. It calls for collaborative efforts to navigate the intricacies of this evolving landscape and to realize its full potential in a responsible, inclusive, and forward-thinking manner.

Keywords: financial landscape, innovation, public perception, transparency

Procedia PDF Downloads 41
27145 The Emotional Experience of Urban Ruins and the Exploration of Urban Memory

Authors: Yan Jia China

Abstract:

Ruins are a kind of historical intention and also a present reality of developing cities. The Zen culture of ancient China has a profound aesthetic emotion; similarly, the West established the concept of an aesthetics of relics along with the Romantics’ (such as Rousseau) sentiment toward historical ruins at the end of the 18th century. Nowadays, with the decline of traditional industrial society and the rise of the post-industrial age, contemporary society must face the problem of ruins and waste left by industrial society. Starting from the perspective of emotion and memory, this paper analyzes the importance of emotional needs and the existing status of several projects, such as the Capital Steelworks in Beijing (industrial devastation), the Shibati old section in Chongqing (urban slums) and the Old Hurva Synagogue in Jerusalem (ruins of war). It emphasizes urban design that starts from emotion, and the sustainable development of city memory through managing urban ruins, which are often criticized, from the perspectives of ecology and art.

Keywords: cultural heritage, urban ruins, ecology, emotion, sustainable urban memory

Procedia PDF Downloads 433
27144 Dirty Martini vs Martini: The Contrasting Duality Between Big Bang and BTS Public Image and Their Latest MVs Analysis

Authors: Patricia Portugal Marques de Carvalho Lourenco

Abstract:

Big Bang is like a dirty martini, embroiled in a stew of individual scandals that have rocked the group’s image and perception, from G-Dragon’s and T.O.P.’s marijuana episodes in 2011 and 2016, respectively, to the illicit entertainment activities in Daesung’s building in 2018, to the Burning Sun scandal that led to the Titanic-like sinking of Big Bang’s youngest member, Seungri, in 2019 and the migration of positive sentiment to the antithetical side. BTS, on the other hand, are like a martini, clear and clean, attracting as many crowds to their performances and online content as the Pope attracts believers to Sunday Mass in the Vatican, as exemplified by their latest MVs. Big Bang’s 2022 Still Life achieved 16.4 million views on YouTube in 24 hours, whilst BTS’s Permission to Dance achieved 68.5 million in the same period of time. The difference is significant when Big Bang’s and BTS’s overall award wins are added: a total of 117 in contrast to 460. Both groups are uniquely talented and exceptional performers that have been contributing greatly to the dissemination of Korean Pop Music on a global scale in their own inimitable ways. Both are exceptional in their own right, and while the artists cannot, ought not, should not be compared, for the grave injustice made in comparing one individual planet with one solar system, a contrast is merited and hence done. The reality, nonetheless, is about disengagement from a group that lives life humanly, learning and evolving with each challenge and mistake without a clean, perfect tag attached to it, demonstrating not only an inability to disassociate the person from the artist and the music but also an inability to understand the difference between a private and public life.

Keywords: K-Pop, big bang, BTS, music, public image, entertainment, korean entertainment

Procedia PDF Downloads 93
27143 A Comprehensive Framework for Fraud Prevention and Customer Feedback Classification in E-Commerce

Authors: Samhita Mummadi, Sree Divya Nagalli, Harshini Vemuri, Saketh Charan Nakka, Sumesh K. J.

Abstract:

One of the most significant challenges faced by people in today’s digital era is an alarming increase in fraudulent activities on online platforms. The appeal of online shopping, which avoids long queues in shopping malls and offers a variety of products and home delivery of goods, has paved the way for a rapid increase in online shopping platforms. This has had a major impact on increasing fraudulent activities as well, as the loop of online shopping and transactions gives fraudulent users the opportunity to commit fraud. For instance, consider a store that orders thousands of products all at once; what is suspicious is the massive number of items purchased, and when the transactions turn out to be fraudulent, the seller suffers a huge loss. Scenarios like these underscore the urgent need to introduce machine learning approaches to combat fraud in online shopping. By leveraging robust algorithms, namely KNN, Decision Trees, and Random Forest, which are highly effective in generating accurate results, this research endeavors to discern patterns indicative of fraudulent behavior within transactional data. The primary motive and main focus is to introduce a comprehensive solution to this problem in order to empower e-commerce administrators in timely fraud detection and prevention. In addition, sentiment analysis is harnessed in the model so that the e-commerce admin can respond to customers’ concerns, feedback, and comments, allowing the admin to improve the user’s experience. The ultimate objective of this study is to strengthen online shopping platforms against fraud and ensure a safer shopping experience. This paper reports a model accuracy of 84%. All the findings and observations that were noted during our work lay the groundwork for future advancements in the development of more resilient and adaptive fraud detection systems, which will become crucial as technologies continue to evolve.
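Of the classifiers named above, KNN is the simplest to show end to end. The sketch below is a plain-Python k-nearest-neighbours vote under stated assumptions: the two features (items per order, order value), the labels and the toy data are hypothetical, features are left unscaled for brevity, and it is not the authors' model or feature set.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the majority label among the k training points nearest to query."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical transactions: (items per order, order value).
train_X = [
    (1, 20.0), (2, 35.0), (1, 15.0),
    (900, 40000.0), (1200, 52000.0), (1000, 45000.0),
]
train_y = ["legit", "legit", "legit", "fraud", "fraud", "fraud"]

bulk_order = knn_predict(train_X, train_y, (1100, 50000.0))
small_order = knn_predict(train_X, train_y, (3, 30.0))
```

In practice the features would need scaling before computing distances, since the order value otherwise dominates the item count.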

Keywords: behavior analysis, feature selection, fraudulent pattern recognition, imbalanced classification, transactional anomalies

Procedia PDF Downloads 13
27142 Decoding Wallstreetbets: Daily Disagreements Among Retail Investors Echo in Trading Volumes

Authors: Farzaneh Ghandehari, Helen Lu, Lina El-Jahel, Dulani Jayasuriya

Abstract:

Disagreement among investors is a fundamental aspect of financial markets, significantly influencing market dynamics. Previous research highlights the challenges of effectively measuring investor disagreement, often relying on traditional proxies like analyst forecast dispersion, which are limited by biases and infrequent updates. Recent movements in social media indicate that retail investors actively seek financial advice online and can influence the stock market. The evolution of the investing landscape, particularly the rise of social media as a hub for financial advice, provides a novel avenue for real-time measurement of investor sentiment and disagreement. Platforms like Reddit offer rich, community-driven discussions that reflect genuine investor opinions. This research explores how social media empowers retail investors and the potential of leveraging textual analysis of social media content to capture daily fluctuations in investor disagreement. This study investigates the relationship between daily investor disagreement and trading volume, focusing on the role of social media platforms in shaping market dynamics, specifically using data from WallStreetBets (WSB) on Reddit. This paper uses data from 2020 to 2023 from WSB and analyses 4,896 firms with enough social media activity in WSB to define stock-day level disagreement measures. Consistent with traditional theories that disagreement induces trading volume, the results show significant evidence supporting this claim through different disagreement measures derived from WSB discussions.
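A stock-day disagreement measure can be sketched as follows. The abstract does not define its measures, so this example assumes one simple option: each post carries a bullish (+1) or bearish (-1) label, and disagreement for a stock on a given day is the population standard deviation of those labels. The tickers, dates and labels are made up for illustration.

```python
from collections import defaultdict
from statistics import pstdev


def daily_disagreement(posts):
    """Map (ticker, day) to the population standard deviation of post labels.

    posts is an iterable of (ticker, day, label) with label +1 or -1.
    A value of 0.0 means all posts agree; 1.0 means an even bull/bear split.
    """
    grouped = defaultdict(list)
    for ticker, day, label in posts:
        grouped[(ticker, day)].append(label)
    return {key: pstdev(labels) for key, labels in grouped.items()}


# Hypothetical labelled posts: (ticker, day, sentiment label).
posts = [
    ("AAA", "2021-01-04", 1),
    ("AAA", "2021-01-04", -1),
    ("BBB", "2021-01-04", 1),
    ("BBB", "2021-01-04", 1),
    ("BBB", "2021-01-04", 1),
]
disagreement = daily_disagreement(posts)
```

A measure like this can then be paired with daily trading volume for the same stock to examine whether higher disagreement coincides with higher volume.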

Keywords: retail investor, social media, disagreement, social finance, reddit, fintech

Procedia PDF Downloads 18