Enhance the Power of Sentiment Analysis
Authors: Yu Zhang, Pedro Desouza
Abstract:
As powerful tools for handling unstructured data have made big data substantially more accessible and manageable, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, a novel branch of text mining, has over the last decade become increasingly important in marketing analysis, customer risk prediction, and other fields. Researchers have done significant work in creating and improving sentiment models. In this paper, we present an approach to selecting appropriate classifiers based on the features and quality of the data source, by comparing the performance of five classifiers on three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduce a couple of innovative models that outperform traditional sentiment classifiers on these data sources, and provide insights on how to further improve the predictive power of sentiment analysis. The modeling and testing work was done using R and Greenplum in-database analytic tools.
Keywords: Sentiment Analysis, Social Media, Twitter, Amazon, Data Mining, Machine Learning, Text Mining.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1091156