Search results for: Fake reviews

214 Development of Fake News Model Using Machine Learning through Natural Language Processing

Authors: Sajjad Ahmed, Knut Hinkelmann, Flavio Corradini

Abstract:

Fake news detection research is still in the early stage as this is a relatively new phenomenon in the interest raised by society. Machine learning helps to solve complex problems and to build AI systems nowadays and especially in those cases where we have tacit knowledge or the knowledge that is not known. We used machine learning algorithms and for identification of fake news; we applied three classifiers; Passive Aggressive, Naïve Bayes, and Support Vector Machine. Simple classification is not completely correct in fake news detection because classification methods are not specialized for fake news. With the integration of machine learning and text-based processing, we can detect fake news and build classifiers that can classify the news data. Text classification mainly focuses on extracting various features of text and after that incorporating those features into classification. The big challenge in this area is the lack of an efficient way to differentiate between fake and non-fake due to the unavailability of corpora. We applied three different machine learning classifiers on two publicly available datasets. Experimental analysis based on the existing dataset indicates a very encouraging and improved performance.

Keywords: Fake news detection, types of fake news, machine learning, natural language processing, classification techniques.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1514

213 A Framework for Review Spam Detection Research

Authors: Mohammadali Tavakoli, Atefeh Heydari, Zuriati Ismail, Naomie Salim

Abstract:

With the increasing number of people reviewing products online in recent years, opinion sharing websites has become the most important source of customers’ opinions. Unfortunately, spammers generate and post fake reviews in order to promote or demote brands and mislead potential customers. These are notably destructive not only for potential customers, but also for business holders and manufacturers. However, research in this area is not adequate, and many critical problems related to spam detection have not been solved to date. To provide green researchers in the domain with a great aid, in this paper, we have attempted to create a highquality framework to make a clear vision on review spam-detection methods. In addition, this report contains a comprehensive collection of detection metrics used in proposed spam-detection approaches. These metrics are extremely applicable for developing novel detection methods.

Keywords: Fake reviews, Feature collection, Opinion spam, Spam detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2517

212 Program Camouflage: A Systematic Instruction Hiding Method for Protecting Secrets

Authors: Yuichiro Kanzaki, Akito Monden, Masahide Nakamura, Ken-ichi Matsumoto

Abstract:

This paper proposes an easy-to-use instruction hiding method to protect software from malicious reverse engineering attacks. Given a source program (original) to be protected, the proposed method (1) takes its modified version (fake) as an input, (2) differences in assembly code instructions between original and fake are analyzed, and, (3) self-modification routines are introduced so that fake instructions become correct (i.e., original instructions) before they are executed and that they go back to fake ones after they are executed. The proposed method can add a certain amount of security to a program since the fake instructions in the resultant program confuse attackers and it requires significant effort to discover and remove all the fake instructions and self-modification routines. Also, this method is easy to use (with little effort) because all a user (who uses the proposed method) has to do is to prepare a fake source code by modifying the original source code.

Keywords: Copyright protection, program encryption, program obfuscation, self-modification, software protection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1508

211 Fake Account Detection in Twitter Based on Minimum Weighted Feature set

Authors: Ahmed El Azab, Amira M. Idrees, Mahmoud A. Mahmoud, Hesham Hefny

Abstract:

Social networking sites such as Twitter and Facebook attracts over 500 million users across the world, for those users, their social life, even their practical life, has become interrelated. Their interaction with social networking has affected their life forever. Accordingly, social networking sites have become among the main channels that are responsible for vast dissemination of different kinds of information during real time events. This popularity in Social networking has led to different problems including the possibility of exposing incorrect information to their users through fake accounts which results to the spread of malicious content during life events. This situation can result to a huge damage in the real world to the society in general including citizens, business entities, and others. In this paper, we present a classification method for detecting the fake accounts on Twitter. The study determines the minimized set of the main factors that influence the detection of the fake accounts on Twitter, and then the determined factors are applied using different classification techniques. A comparison of the results of these techniques has been performed and the most accurate algorithm is selected according to the accuracy of the results. The study has been compared with different recent researches in the same area; this comparison has proved the accuracy of the proposed study. We claim that this study can be continuously applied on Twitter social network to automatically detect the fake accounts; moreover, the study can be applied on different social network sites such as Facebook with minor changes according to the nature of the social network which are discussed in this paper.

Keywords: Fake accounts detection, classification algorithms, twitter accounts analysis, features based techniques.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5839

210 The Fake News Impact on the Public Policy Cycle: A Systemic Analysis through Documentary Survey

Authors: Aron Miranda Burgos, Ergon Cugler de Moraes Silva

Abstract:

In the present article, it is observed that the constant advancement of issues related to misinformation impacts the guarantee of the public policy cycle. Thus, it is found that the dissemination of false information has a direct influence on each of the component stages of this cycle. Therefore, in order to maintain scientific and theoretical credibility in the qualitative analysis process, it was necessary to logically interpose the concepts of firehosing of falsehood, fake news, public policy cycle, as well as using the epistemological and pragmatic mechanism at the intersection of such academic concepts, such as the scientific method. It was found, through the analysis of official documents and public notes, how the multiple theoretical perspectives evidence the commitment of the provision and elaboration of public policies, verifying the way in which the fake news impact each part of the process in this atmosphere.

Keywords: Firehosing of falsehood, governance, misinformation, post-truth.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 869

209 Generative Adversarial Network Based Fingerprint Anti-Spoofing Limitations

Authors: Yehjune Heo

Abstract:

Fingerprint Anti-Spoofing approaches have been actively developed and applied in real-world applications. One of the main problems for Fingerprint Anti-Spoofing is not robust to unseen samples, especially in real-world scenarios. A possible solution will be to generate artificial, but realistic fingerprint samples and use them for training in order to achieve good generalization. This paper contains experimental and comparative results with currently popular GAN based methods and uses realistic synthesis of fingerprints in training in order to increase the performance. Among various GAN models, the most popular StyleGAN is used for the experiments. The CNN models were first trained with the dataset that did not contain generated fake images and the accuracy along with the mean average error rate were recorded. Then, the fake generated images (fake images of live fingerprints and fake images of spoof fingerprints) were each combined with the original images (real images of live fingerprints and real images of spoof fingerprints), and various CNN models were trained. The best performances for each CNN model, trained with the dataset of generated fake images and each time the accuracy and the mean average error rate, were recorded. We observe that current GAN based approaches need significant improvements for the Anti-Spoofing performance, although the overall quality of the synthesized fingerprints seems to be reasonable. We include the analysis of this performance degradation, especially with a small number of samples. In addition, we suggest several approaches towards improved generalization with a small number of samples, by focusing on what GAN based approaches should learn and should not learn.

Keywords: Anti-spoofing, CNN, fingerprint recognition, GAN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 594

208 Improving Fake News Detection Using K-means and Support Vector Machine Approaches

Authors: Kasra Majbouri Yazdi, Adel Majbouri Yazdi, Saeid Khodayi, Jingyu Hou, Wanlei Zhou, Saeed Saedy

Abstract:

Fake news and false information are big challenges of all types of media, especially social media. There is a lot of false information, fake likes, views and duplicated accounts as big social networks such as Facebook and Twitter admitted. Most information appearing on social media is doubtful and in some cases misleading. They need to be detected as soon as possible to avoid a negative impact on society. The dimensions of the fake news datasets are growing rapidly, so to obtain a better result of detecting false information with less computation time and complexity, the dimensions need to be reduced. One of the best techniques of reducing data size is using feature selection method. The aim of this technique is to choose a feature subset from the original set to improve the classification performance. In this paper, a feature selection method is proposed with the integration of K-means clustering and Support Vector Machine (SVM) approaches which work in four steps. First, the similarities between all features are calculated. Then, features are divided into several clusters. Next, the final feature set is selected from all clusters, and finally, fake news is classified based on the final feature subset using the SVM method. The proposed method was evaluated by comparing its performance with other state-of-the-art methods on several specific benchmark datasets and the outcome showed a better classification of false information for our work. The detection performance was improved in two aspects. On the one hand, the detection runtime process decreased, and on the other hand, the classification accuracy increased because of the elimination of redundant features and the reduction of datasets dimensions.

Keywords: Fake news detection, feature selection, support vector machine, K-means clustering, machine learning, social media.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4526

207 Automatic Extraction of Features and Opinion-Oriented Sentences from Customer Reviews

Authors: Khairullah Khan, Baharum B. Baharudin, Aurangzeb Khan, Fazal_e_Malik

Abstract:

Opinion extraction about products from customer reviews is becoming an interesting area of research. Customer reviews about products are nowadays available from blogs and review sites. Also tools are being developed for extraction of opinion from these reviews to help the user as well merchants to track the most suitable choice of product. Therefore efficient method and techniques are needed to extract opinions from review and blogs. As reviews of products mostly contains discussion about the features, functions and services, therefore, efficient techniques are required to extract user comments about the desired features, functions and services. In this paper we have proposed a novel idea to find features of product from user review in an efficient way. Our focus in this paper is to get the features and opinion-oriented words about products from text through auxiliary verbs (AV) {is, was, are, were, has, have, had}. From the results of our experiments we found that 82% of features and 85% of opinion-oriented sentences include AVs. Thus these AVs are good indicators of features and opinion orientation in customer reviews.

Keywords: Classification, Customer Reviews, Helping Verbs, Opinion Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2097

206 Protecting the Privacy and Trust of VIP Users on Social Network Sites

Authors: Nidal F. Shilbayeh, Sameh T. Khuffash, Mohammad H. Allymoun, Reem Al-Saidi

Abstract:

There is a real threat on the VIPs personal pages on the Social Network Sites (SNS). The real threats to these pages is violation of privacy and theft of identity through creating fake pages that exploit their names and pictures to attract the victims and spread of lies. In this paper, we propose a new secure architecture that improves the trusting and finds an effective solution to reduce fake pages and possibility of recognizing VIP pages on SNS. The proposed architecture works as a third party that is added to Facebook to provide the trust service to personal pages for VIPs. Through this mechanism, it works to ensure the real identity of the applicant through the electronic authentication of personal information by storing this information within content of their website. As a result, the significance of the proposed architecture is that it secures and provides trust to the VIPs personal pages. Furthermore, it can help to discover fake page, protect the privacy, reduce crimes of personality-theft, and increase the sense of trust and satisfaction by friends and admirers in interacting with SNS.

Keywords: Social Network Sites, Online Social Network, Privacy, Trust, Security and Authentication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3780

205 Identification of Most Frequently Occurring Lexis in Winnings-announcing Unsolicited Bulke-mails

Authors: Jatinderkumar R. Saini, Apurva A. Desai

Abstract:

e-mail has become an important means of electronic communication but the viability of its usage is marred by Unsolicited Bulk e-mail (UBE) messages. UBE consists of many types like pornographic, virus infected and 'cry-for-help' messages as well as fake and fraudulent offers for jobs, winnings and medicines. UBE poses technical and socio-economic challenges to usage of e-mails. To meet this challenge and combat this menace, we need to understand UBE. Towards this end, the current paper presents a content-based textual analysis of nearly 3000 winnings-announcing UBE. Technically, this is an application of Text Parsing and Tokenization for an un-structured textual document and we approach it using Bag Of Words (BOW) and Vector Space Document Model techniques. We have attempted to identify the most frequently occurring lexis in the winnings-announcing UBE documents. The analysis of such top 100 lexis is also presented. We exhibit the relationship between occurrence of a word from the identified lexisset in the given UBE and the probability that the given UBE will be the one announcing fake winnings. To the best of our knowledge and survey of related literature, this is the first formal attempt for identification of most frequently occurring lexis in winningsannouncing UBE by its textual analysis. Finally, this is a sincere attempt to bring about alertness against and mitigate the threat of such luring but fake UBE.

Keywords: Lexis, Unsolicited Bulk e-mail (UBE), Vector SpaceDocument Model, Winnings, Lottery

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1540

204 Identification of Most Frequently Occurring Lexis in Body-enhancement Medicinal Unsolicited Bulk e-mails

Authors: Jatinderkumar R. Saini, Apurva A. Desai

Abstract:

e-mail has become an important means of electronic communication but the viability of its usage is marred by Unsolicited Bulk e-mail (UBE) messages. UBE consists of many types like pornographic, virus infected and 'cry-for-help' messages as well as fake and fraudulent offers for jobs, winnings and medicines. UBE poses technical and socio-economic challenges to usage of e-mails. To meet this challenge and combat this menace, we need to understand UBE. Towards this end, the current paper presents a content-based textual analysis of more than 2700 body enhancement medicinal UBE. Technically, this is an application of Text Parsing and Tokenization for an un-structured textual document and we approach it using Bag Of Words (BOW) and Vector Space Document Model techniques. We have attempted to identify the most frequently occurring lexis in the UBE documents that advertise various products for body enhancement. The analysis of such top 100 lexis is also presented. We exhibit the relationship between occurrence of a word from the identified lexis-set in the given UBE and the probability that the given UBE will be the one advertising for fake medicinal product. To the best of our knowledge and survey of related literature, this is the first formal attempt for identification of most frequently occurring lexis in such UBE by its textual analysis. Finally, this is a sincere attempt to bring about alertness against and mitigate the threat of such luring but fake UBE.

Keywords: Body Enhancement, Lexis, Medicinal, Unsolicited Bulk e-mail (UBE), Vector Space Document Model, Viagra

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3509

203 Authenticity of Ecuadorian Commercial Honeys

Authors: Elisabetta Schievano, Valentina Zuccato, Claudia Finotello, Patricia Vit

Abstract:

Control of honey frauds is needed in Ecuador to protect bee keepers and consumers because simple syrups and new syrups with eucalyptus are sold as genuine honeys. Authenticity of Ecuadorian commercial honeys was tested with a vortex emulsion consisting on one volume of honey:water (1:1) dilution, and two volumes of diethyl ether. This method allows a separation of phases in one minute to discriminate genuine honeys that form three phase and fake honeys that form two phases; 34 of the 42 honeys analyzed from five provinces of Ecuador were genuine. This was confirmed with 1H NMR spectra of honey dilutions in deuterated water with an enhanced amino acid region with signals for proline, phenylalanine and tyrosine. Classic quality indicators were also tested with this method (sugars, HMF), indicators of fermentation (ethanol, acetic acid), and residues of citric acid used in the syrup manufacture. One of the honeys gave a false positive for genuine, being an admixture of genuine honey with added syrup, evident for the high sucrose. Sensory analysis was the final confirmation to recognize the honey groups studied here, namely honey produced in combs by Apis mellifera, fake honey, and honey produced in cerumen pots by Geotrigona, Melipona, and Scaptotrigona. Chloroform extractions of honey were also done to search lipophilic additives in NMR spectra. This is a valuable contribution to protect honey consumers, and to develop the beekeeping industry in Ecuador.

Keywords: Fake, genuine, honey, 1H NMR, Ecuador.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2687

202 Aliveness Detection of Fingerprints using Multiple Static Features

Authors: Heeseung Choi, Raechoong Kang, Kyungtaek Choi, Jaihie Kim

Abstract:

Fake finger submission attack is a major problem in fingerprint recognition systems. In this paper, we introduce an aliveness detection method based on multiple static features, which derived from a single fingerprint image. The static features are comprised of individual pore spacing, residual noise and several first order statistics. Specifically, correlation filter is adopted to address individual pore spacing. The multiple static features are useful to reflect the physiological and statistical characteristics of live and fake fingerprint. The classification can be made by calculating the liveness scores from each feature and fusing the scores through a classifier. In our dataset, we compare nine classifiers and the best classification rate at 85% is attained by using a Reduced Multivariate Polynomial classifier. Our approach is faster and more convenient for aliveness check for field applications.

Keywords: Aliveness detection, Fingerprint recognition, individual pore spacing, multiple static features, residual noise.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1926

201 Sentiment Analysis of Fake Health News Using Naive Bayes Classification Models

Authors: Danielle Shackley, Yetunde Folajimi

Abstract:

As more people turn to the internet seeking health related information, there is more risk of finding false, inaccurate, or dangerous information. Sentiment analysis is a natural language processing technique that assigns polarity scores of text, ranging from positive, neutral and negative. In this research, we evaluate the weight of a sentiment analysis feature added to fake health news classification models. The dataset consists of existing reliably labeled health article headlines that were supplemented with health information collected about COVID-19 from social media sources. We started with data preprocessing, tested out various vectorization methods such as Count and TFIDF vectorization. We implemented 3 Naive Bayes classifier models, including Bernoulli, Multinomial and Complement. To test the weight of the sentiment analysis feature on the dataset, we created benchmark Naive Bayes classification models without sentiment analysis, and those same models were reproduced and the feature was added. We evaluated using the precision and accuracy scores. The Bernoulli initial model performed with 90% precision and 75.2% accuracy, while the model supplemented with sentiment labels performed with 90.4% precision and stayed constant at 75.2% accuracy. Our results show that the addition of sentiment analysis did not improve model precision by a wide margin; while there was no evidence of improvement in accuracy, we had a 1.9% improvement margin of the precision score with the Complement model. Future expansion of this work could include replicating the experiment process, and substituting the Naive Bayes for a deep learning neural network model.

Keywords: Sentiment analysis, Naive Bayes model, natural language processing, topic analysis, fake health news classification model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 490

200 Understanding the Influence on Drivers’ Recommendation and Review-Writing Behavior in the P2P Taxi Service

Authors: Liwen Hou

Abstract:

The booming mobile business has been penetrating the taxi industry worldwide with P2P (peer to peer) taxi services, as an emerging business model, transforming the industry. Parallel with other mobile businesses, member recommendations and online reviews are believed to be very effective with regard to acquiring new users for P2P taxi services. Based on an empirical dataset of the taxi industry in China, this study aims to reveal which factors influence users’ recommendations and review-writing behaviors. Differing from the existing literature, this paper takes the taxi driver’s perspective into consideration and hence selects a group of variables related to the drivers. We built two models to reflect the factors that influence the number of recommendations and reviews posted on the platform (i.e., the app). Our models show that all factors, except the driver’s score, significantly influence the recommendation behavior. Likewise, only one factor, passengers’ bad reviews, is insignificant in generating more drivers’ reviews. In the conclusion, we summarize the findings and limitations of the research.

Keywords: Online recommendation, P2P taxi service, review-writing, word of mouth.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1379

199 CRYPTO COPYCAT: A Fashion Centric Blockchain Framework for Eliminating Fashion Infringement

Authors: Magdi Elmessiry, Adel Elmessiry

Abstract:

The fashion industry represents a significant portion of the global gross domestic product, however, it is plagued by cheap imitators that infringe on the trademarks which destroys the fashion industry's hard work and investment. While eventually the copycats would be found and stopped, the damage has already been done, sales are missed and direct and indirect jobs are lost. The infringer thrives on two main facts: the time it takes to discover them and the lack of tracking technologies that can help the consumer distinguish them. Blockchain technology is a new emerging technology that provides a distributed encrypted immutable and fault resistant ledger. Blockchain presents a ripe technology to resolve the infringement epidemic facing the fashion industry. The significance of the study is that a new approach leveraging the state of the art blockchain technology coupled with artificial intelligence is used to create a framework addressing the fashion infringement problem. It transforms the current focus on legal enforcement, which is difficult at best, to consumer awareness that is far more effective. The framework, Crypto CopyCat, creates an immutable digital asset representing the actual product to empower the customer with a near real time query system. This combination emphasizes the consumer's awareness and appreciation of the product's authenticity, while provides real time feedback to the producer regarding the fake replicas. The main findings of this study are that implementing this approach can delay the fake product penetration of the original product market, thus allowing the original product the time to take advantage of the market. The shift in the fake adoption results in reduced returns, which impedes the copycat market and moves the emphasis to the original product innovation.

Keywords: Fashion, infringement, Blockchain, artificial intelligence, textiles supply.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1237

198 Vibration and Operation Technical Consideration before Field Balance of Gas Turbine Utilities (In Iran Power Plants SIEMENS V94.2 Gas Turbines)

Authors: Omid A. Zargar

Abstract:

One of the most challenging times in operation of big industrial plant or utilities is the time that alert lamp of Bently Nevada connection in main board substation turn on and show the alert condition of machine. All of the maintenance groups usually make a lot of discussion with operation and together rather this alert signal is real or fake. This will be more challenging when condition monitoring vibrationdata shows 1X(X=current rotor frequency) in fast Fourier transform(FFT) and vibration phase trends show 90 degree shift between two non-contact probedirections with overall high radial amplitude amounts. In such situations, CM (condition monitoring) groups usually suspicious about unbalance in rotor. In this paper, four critical case histories related to SIEMENS V94.2 Gas Turbines in Iran power industry discussed in details. Furthermore, probe looseness and fake (unreal) trip in gas turbine power plants discussed. In addition, critical operation decision in alert condition in power plants discussed in details.

Keywords: Gas turbine, field balance, turbine compressors, balancing tools, balancing data collectors.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4123

197 Feature-Based Summarizing and Ranking from Customer Reviews

Authors: Dim En Nyaung, Thin Lai Lai Thein

Abstract:

Due to the rapid increase of Internet, web opinion sources dynamically emerge which is useful for both potential customers and product manufacturers for prediction and decision purposes. These are the user generated contents written in natural languages and are unstructured-free-texts scheme. Therefore, opinion mining techniques become popular to automatically process customer reviews for extracting product features and user opinions expressed over them. Since customer reviews may contain both opinionated and factual sentences, a supervised machine learning technique applies for subjectivity classification to improve the mining performance. In this paper, we dedicate our work is the task of opinion summarization. Therefore, product feature and opinion extraction is critical to opinion summarization, because its effectiveness significantly affects the identification of semantic relationships. The polarity and numeric score of all the features are determined by Senti-WordNet Lexicon. The problem of opinion summarization refers how to relate the opinion words with respect to a certain feature. Probabilistic based model of supervised learning will improve the result that is more flexible and effective.

Keywords: Opinion Mining, Opinion Summarization, Sentiment Analysis, Text Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2933

196 A Long Tail Study of eWOM Communities

Authors: M. Olmedilla, M. R. Martinez-Torres, S. L. Toral

Abstract:

Electronic Word-Of-Mouth (eWOM) communities represent today an important source of information in which more and more customers base their purchasing decisions. They include thousands of reviews concerning very different products and services posted by many individuals geographically distributed all over the world. Due to their massive audience, eWOM communities can help users to find the product they are looking for even if they are less popular or rare. This is known as the long tail effect, which leads to a larger number of lower-selling niche products. This paper analyzes the long tail effect in a well-known eWOM community and defines a tool for finding niche products unavailable through conventional channels.

Keywords: eWOM, Online user reviews, Long tail theory, Product categorization, Social Network Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2345

195 1/Sigma Term Weighting Scheme for Sentiment Analysis

Authors: Hanan Alshaher, Jinsheng Xu

Abstract:

Large amounts of data on the web can provide valuable information. For example, product reviews help business owners measure customer satisfaction. Sentiment analysis classifies texts into two polarities: positive and negative. This paper examines movie reviews and tweets using a new term weighting scheme, called one-over-sigma (1/sigma), on benchmark datasets for sentiment classification. The proposed method aims to improve the performance of sentiment classification. The results show that 1/sigma is more accurate than the popular term weighting schemes. In order to verify if the entropy reflects the discriminating power of terms, we report a comparison of entropy values for different term weighting schemes.

Keywords: Sentiment analysis, term weighting scheme, 1/sigma.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 536

194 Development of a Quantitative Material Wastage Management Plan for Effective Waste Reduction in the Building Construction Industry

Authors: Kwok Tak Kit

Abstract:

Combating climate change is becoming a hot topic in various sectors. Building construction and infrastructure sectors contributed a significant proportion of waste and greenhouse gas (GHG) emissions in the environment of different countries and cities. However, there is little research on the micro-level of waste management, “building construction material wastage management,” and fewer reviews about regulatory control in the building construction sector. This paper focuses on the potentialities and importance of material wastage management and reviews the deficiencies of the current standard to take into account the reduction of material wastage in a systematic and quantitative approach.

Keywords: Quantitative measurement, material wastage management plan, waste management, uncalculated waste, circular economy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 691

193 Detecting Fake News: A Natural Language Processing, Reinforcement Learning, and Blockchain Approach

Authors: Ashly Joseph, Jithu Paulose

Abstract:

In an era where misleading information may quickly circulate on digital news channels, it is crucial to have efficient and trustworthy methods to detect and reduce the impact of misinformation. This research proposes an innovative framework that combines Natural Language Processing (NLP), Reinforcement Learning (RL), and Blockchain technologies to precisely detect and minimize the spread of false information in news articles on social media. The framework starts by gathering a variety of news items from different social media sites and performing preprocessing on the data to ensure its quality and uniformity. NLP methods are utilized to extract complete linguistic and semantic characteristics, effectively capturing the subtleties and contextual aspects of the language used. These features are utilized as input for a RL model. This model acquires the most effective tactics for detecting and mitigating the impact of false material by modeling the intricate dynamics of user engagements and incentives on social media platforms. The integration of blockchain technology establishes a decentralized and transparent method for storing and verifying the accuracy of information. The Blockchain component guarantees the unchangeability and safety of verified news records, while encouraging user engagement for detecting and fighting false information through an incentive system based on tokens. The suggested framework seeks to provide a thorough and resilient solution to the problems presented by misinformation in social media articles.

Keywords: Natural Language Processing, Reinforcement Learning, Blockchain, fake news mitigation, misinformation detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 91

192 New Findings on the User’s Preferences about Data Visualization of Online Reviews

Authors: Elizabeth Simão Carvalho, Marcirio Silveira Chaves

Abstract:

The information visualization is still a knowledge field that lacks from a solid theory to support it and there is a myriad of existing methodologies and taxonomies that can be combined and adopted as guidelines. In this context, it is necessary to pre-evaluate as much as possible all the assumptions that are considered for its design and development. We present an exploratory study (n = 123) to detect the graphical preferences of travelers using accommodation portals of Web 2.0 (e.g. tripadvisor.com). We took into account some of the most relevant ground rules applied in the field to map visually data and design end-user interaction. Moreover, the evaluation process was completely data visualization oriented. We found out that people tend to refuse more advanced types of visualization and that a hybrid combination between radial graphs and stacked bars should be explored. In sum, this paper introduces new findings about the visual model and the cognitive response of users of accommodation booking websites.

Keywords: Information visualization, Data visualization, Visualization evaluation, Online reviews, Booking portal, Hotel booking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1720

191 The Importance of Project Post-Implementation Reviews

Authors: Catalin-Teodor Dogaru, Ana-Maria Dogaru

Abstract:

Success means different things for different people. For us, project managers, it becomes even harder to actually find a definition. Many factors have to be included in the evaluation. Moreover, literature is not very helpful, lacking consensus and neutrality. Post-implementation reviews (PIR) can be an efficient tool in evaluating how things worked on a certain project. Despite the visible progress, PIR is not a very detailed subject yet and there is not common understanding in this matter. This may be the reason that some organizations include it in the projects’ lifecycle and some do not. Through this paper, we point out the reasons why all project managers should pay proper attention to this important step and to the elements which can be assessed, beside the already famous triple constraints: cost, budget and time. It is essential to take notice that PIR is not a checklist. It brings the edge in eliminating subjectivity and judging projects based on actual proof. Based on our experience, our success indicator model, presented in this paper, contributes to the success of the project! In the same time, it increases trust among customers who will perceive success more objectively.

Keywords: Project, post-implementation, success, model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4921

190 Enhance the Power of Sentiment Analysis

Authors: Yu Zhang, Pedro Desouza

Abstract:

Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performances of five classifiers with three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduced a couple of innovative models that outperform traditional sentiment classifiers for these data sources, and provide insights on how to further improve the predictive power of sentiment analysis. The modeling and testing work was done in R and Greenplum in-database analytic tools.

Keywords: Sentiment Analysis, Social Media, Twitter, Amazon, Data Mining, Machine Learning, Text Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3518

189 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis

Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan

Abstract:

Tourism is a booming industry with huge future potential for global wealth and employment. There are countless data generated over social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of big data technology into the tourism industry will allow companies to conclude where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centres or hotels, etc., and the tourist can get a clear idea of places before visiting. The technical perspective of natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We have constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The text form of sentences was first classified through VADER and RoBERTa model to get the polarity of the reviews. In this paper, we have conducted study methods for feature extraction, such as Count Vectorization and Term Frequency – Inverse Document Frequency (TFIDF) Vectorization and implemented Convolutional Neural Network (CNN) classifier algorithm for the sentiment analysis to decide if the tourist’s attitude towards the destinations is positive, negative, or simply neutral based on the review text that they posted online. The results demonstrated that from the CNN algorithm, after pre-processing and cleaning the dataset, we received an accuracy of 96.12% for the positive and negative sentiment analysis.

Keywords: Counter vectorization, Convolutional Neural Network, Crawler, data technology, Long Short-Term Memory, LSTM, Web Scraping, sentiment analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 177

188 A Systematic Literature Review on Changing Customer Requirements for Sustainable Design over Time

Authors: Lara F. Horani

Abstract:

Design is one of the most important stages in the process of product development. Product design has experienced significant changes over the years ranging from concentrating on cost and performance to combining economic, environmental and social considerations in customer requirements. Its evolution is in accordance with rapidly changing technology, economic situations, and climate change and environmental issues, as well as social context. Within product design, sustainability is a concept that balances economic, social and environmental aspects. This research aims to express changes in customer requirements over time from the viewpoint of sustainable design. It does so by systematically reviewing a broad scope of sustainable design literature. There is a need for a model to consider the changes that take place in customer requirements over time to build a successful relationship with customers which has been presented. Today’s literature does very little to even mention it, let alone present any progress in it. Systematic literature reviews are conducted primarily to: summarize the existing literature around a subject, highlight commonalities to build consensus, illuminate differences, identify gaps that can be filled, provide a background to position future research, and build a framework that can help designers meet the challenges of sustainable design.

Keywords: Sustainable design, customer requirements for sustainable design, changing customer requirements for sustainable design, systematic literature reviews.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1482

187 Managing Truck Drivers’ Fatigue: A Critical Review of the Literature and Recommended Remedies

Authors: Mozhgan Aliakbari, Sara Moridpour

Abstract:

In recent years, much attention has been given to truck drivers’ fatigue management. Long working hours negatively influence truck drivers’ physiology, health, and safety. However, there is little empirical research in the heavy vehicle transport sector in Australia to identify the influence of working hours’ management on drivers’ fatigue and consequently, on the risk of crashes and injuries. There is no national legislation regulating the number of hours or kilometres travelled by truck drivers. Consequently, it is almost impossible to define a standard number of hours or kilometres for truck drivers in a safety management system. This paper reviews the existing studies concerning safe system interventions such as tachographs in relation to fatigue caused by long working hours. This paper also reviews the literature to identify the influence of frequency of rest breaks on the reduction of work-related road transport accidents involving trucks. A framework is presented to manage truck drivers’ fatigue, which may result in the reduction of injuries and fatalities involving heavy vehicles.

Keywords: Fatigue, time management, trucks, traffic safety.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1452

186 Efficacy of Anti-phishing Measures and Strategies - A Research Analysis

Authors: Gundeep Singh Bindra

Abstract:

Statistics indicate that more than 1000 phishing attacks are launched every month. With 57 million people hit by the fraud so far in America alone, how do we combat phishing?This publication aims to discuss strategies in the war against Phishing. This study is an examination of the analysis and critique found in the ways adopted at various levels to counter the crescendo of phishing attacks and new techniques being adopted for the same. An analysis of the measures taken up by the varied popular Mail servers and popular browsers is done under this study. This work intends to increase the understanding and awareness of the internet user across the globe and even discusses plausible countermeasures at the users as well as the developers end. This conceptual paper will contribute to future research on similar topics.

Keywords: Anti-phishing, countermeasures, effectiveness, fake pages, security analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2768

185 Innovation and Technologies Synthesis of Various Components: A Contribution to the Precision Irrigation Development for Open-Field Fruit Orchards

Authors: P. Chatrabhuti, S. Visessri, T. Charinpanitkul

Abstract:

Precision irrigation (PI) technology has emerged as a solution to optimize water usage in agriculture, aiming to maximize crop yields while minimizing water waste. Developing a PI for commercialization requires developers to research, synthesize, evaluate, and select appropriate technologies and make use of such information to produce innovative products. The objective of this review is to facilitate innovators by providing them with a summary of existing knowledge and the identification of gaps in research linking to the innovative development of PI. This paper reviews and synthesizes technologies and components relevant to precision irrigation, highlighting its potential benefits and challenges. The Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) framework is used for the review. As a result of this review, the different technologies have limitations and may only be suitable for specific orchards or spatial settings. The current technologies are readily available in a range of options, from affordable controllers to high-performance systems that are both reliable and precise. Furthermore, the future prospects for incorporating artificial intelligence and machine learning techniques hold promise for advancing autonomous irrigation systems.

Keywords: Innovation synthesis, technology assessment, precision irrigation technologies, precision irrigation components, new product development.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 33