Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques
Sentiment analysis and opinion mining have become active research topics in recent years, but most of the work focuses on English-language data. A comprehensive study that considers multiple languages, machine translation techniques, and different classifiers is therefore needed. This paper presents a comparative analysis of different approaches to multilingual sentiment analysis. These approaches fall into two groups: the first classifies text without language translation, and the second translates the test data into a target language, such as English, before classification. The results help determine whether machine translation should be used for multilingual sentiment analysis or whether building language-specific sentiment classification systems is the better approach. The study also discusses the effects of language translation techniques and features, as well as the accuracy of various classifiers, for multilingual sentiment analysis.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1130529
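The two groups of approaches named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the Naive Bayes classifier, the four-sentence training sets, the Spanish examples, and the word-by-word dictionary `TOY_ES_EN` (a stand-in for a real machine translation system) are all assumptions chosen only to make the contrast concrete.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        scores = {}
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            total = sum(counts.values())
            score = math.log(n_docs / total_docs)
            for word in text.lower().split():
                # Laplace (add-one) smoothing so unseen words do not zero the score.
                score += math.log((counts[word] + 1) / (total + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical toy training data (not from the paper).
ENGLISH_TRAIN = [
    ("a great and wonderful film", "pos"),
    ("I love this movie", "pos"),
    ("a terrible and boring film", "neg"),
    ("I hate this movie", "neg"),
]
SPANISH_TRAIN = [
    ("una película genial y maravillosa", "pos"),
    ("me encanta esta película", "pos"),
    ("una película terrible y aburrida", "neg"),
    ("odio esta película", "neg"),
]

# Assumption: a word-by-word dictionary standing in for a machine translation system.
TOY_ES_EN = {
    "una": "a", "película": "film", "genial": "great", "y": "and",
    "maravillosa": "wonderful", "terrible": "terrible", "aburrida": "boring",
    "me": "I", "encanta": "love", "esta": "this", "odio": "hate",
}


def translate_to_english(text):
    """Hypothetical translator: maps each known Spanish token to an English one."""
    return " ".join(TOY_ES_EN.get(word, word) for word in text.lower().split())


def train(pairs):
    texts, labels = zip(*pairs)
    return NaiveBayes().fit(texts, labels)


spanish_clf = train(SPANISH_TRAIN)   # approach 1: language-specific classifier
english_clf = train(ENGLISH_TRAIN)   # approach 2: English classifier on translated input


def classify_without_translation(text):
    return spanish_clf.predict(text)


def classify_with_translation(text):
    return english_clf.predict(translate_to_english(text))


if __name__ == "__main__":
    for review in ["una película maravillosa", "una película aburrida"]:
        print(review,
              classify_without_translation(review),
              classify_with_translation(review))
```

In a sketch like this, the second approach depends on translation quality: any token the translator leaves unmapped reaches the English classifier as an unseen word, which is one way translation errors can lower classification accuracy.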