Commenced in January 2007
Frequency: Monthly
Edition: International
Mining User-Generated Contents to Detect Service Failures with Topic Model

Authors: Sung Ho Ha, Kyung Bae Park


Online user-generated content (UGC) has significantly changed how customers behave (e.g., how they shop and travel), and handling the overwhelming volume and variety of UGC has become a pressing issue for management. However, current approaches (e.g., sentiment analysis) are often ineffective at leveraging textual information to detect the specific problems an organization faces. In this paper, we apply Latent Dirichlet Allocation (LDA), a text-mining technique, to reviews from a popular online review site dedicated to user complaints. We find that LDA efficiently detects customer complaints, and that further inspection with a visualization technique is effective for categorizing the problems. Management can thus identify the issues at stake and prioritize them in a timely manner, given limited resources. The findings provide managerial insights into how social media analytics can help organizations maintain and improve their reputation. Our interdisciplinary approach also shows how machine learning techniques can be applied in the marketing research domain. On a broader technical note, this paper details how to implement LDA in R from beginning (data collection in R) to end (LDA analysis in R), since such instructions remain largely undocumented. In this regard, it should help lower the barrier for interdisciplinary researchers to conduct related research.

Keywords: Text mining, visualization, Latent Dirichlet Allocation, topic model, R program, user-generated content
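
To make the topic-modeling step in the abstract concrete, the following is a minimal sketch of LDA fitted to a handful of complaint texts. It is not the authors' R implementation: it is written in Python using only the standard library, and it uses collapsed Gibbs sampling, one common inference method for LDA, which may differ from the estimation procedure used in the paper. The toy reviews, the stop-word list, the hyperparameters `alpha` and `beta`, and the function names `tokenize`, `fit_lda`, `top_words`, and `dominant_topic` are all assumptions made for illustration.

```python
import random

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "was", "and", "my", "to", "for", "is", "i", "on", "never", "me"}


def tokenize(text):
    """Lowercase, split on whitespace, strip basic punctuation, drop stop words."""
    tokens = (t.strip(".,!?").lower() for t in text.split())
    return [t for t in tokens if t and t not in STOP_WORDS]


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Returns (vocab, doc_topic, topic_word), where doc_topic[d][k] counts the
    tokens of document d assigned to topic k, and topic_word[k][v] counts how
    often vocabulary word v is assigned to topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word


def top_words(vocab, topic_word, n=3):
    """Return the n most frequently assigned words for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]


def dominant_topic(doc_topic):
    """Return, for each document, the topic holding most of its tokens."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in doc_topic]


# Toy complaints (invented for this sketch, not data from the paper).
reviews = [
    "The delivery was late and the package was damaged",
    "Late delivery, damaged box, package missing",
    "Billing error on my card and the refund was never paid",
    "Refund never paid, billing charged my card twice",
]
docs = [tokenize(r) for r in reviews]
vocab, doc_topic, topic_word = fit_lda(docs, num_topics=2)

for k, words in enumerate(top_words(vocab, topic_word)):
    print(f"topic {k}: {', '.join(words)}")
print("dominant topic per review:", dominant_topic(doc_topic))
```

In a setting like the one the abstract describes, the top words per topic would be read as candidate complaint categories, and the number of reviews dominated by each topic gives one rough signal for prioritizing them. With a corpus this small the topics are unstable, so the output here only illustrates the mechanics.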


