Movie Genre Preference Prediction Using Machine Learning for Customer-Based Information

Authors: Haifeng Wang, Haili Zhang

Abstract:

Most movie recommendation systems are designed to help customers find items of interest. This work instead introduces a predictive model for small and medium-sized enterprises (SMEs) that need a data-driven, analytical approach to stocking suitable movies for local audiences and retaining more customers. We used classification models to extract features from the demographic, behavioral, and social information of thousands of customers and to predict their movie genre preferences. In the implementation, a Gaussian kernel support vector machine (SVM) classification model and a logistic regression model were built to extract features from the sample data, and their in-sample errors were compared. Out-of-sample errors were also compared under different Vapnik–Chervonenkis (VC) dimensions to detect and prevent overfitting. The Gaussian kernel SVM prediction model correctly predicted movie genre preferences in 85% of positive cases. The accuracy of the algorithm increased to 93% with a smaller VC dimension and less overfitting. These findings advance our understanding of how to use a machine learning approach to predict customers’ preferences from a small data set and how to design prediction tools for these enterprises.
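
The comparison of in-sample and out-of-sample error under varying model capacity can be illustrated with a small sketch. It is not the authors' implementation: the data are synthetic, a Gaussian kernel perceptron stands in for the Gaussian kernel SVM so that the example needs only the Python standard library, and the kernel width `gamma` serves as a loose proxy for capacity rather than a VC dimension. All function names, the toy labelling rule, and the parameter values are assumptions made for illustration.

```python
import math
import random


def gaussian_kernel(a, b, gamma):
    """K(a, b) = exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def make_toy_data(n, rng, noise=0.1):
    """Synthetic points in [-1, 1]^2: label +1 inside a circle, -1 outside.

    A fraction `noise` of the labels is flipped, so a perfect fit to the
    training set suggests the model is memorising noise.
    """
    data = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        y = 1 if x[0] ** 2 + x[1] ** 2 < 0.5 else -1
        if rng.random() < noise:
            y = -y
        data.append((x, y))
    return data


def train_kernel_perceptron(train, gamma, epochs=20):
    """Dual-form perceptron: alpha[i] counts the mistakes made on point i."""
    n = len(train)
    # Precompute the Gram matrix once; it is reused in every epoch.
    gram = [[gaussian_kernel(train[i][0], train[j][0], gamma)
             for j in range(n)] for i in range(n)]
    alpha = [0] * n
    for _ in range(epochs):
        mistakes = 0
        for i in range(n):
            score = sum(alpha[j] * train[j][1] * gram[j][i] for j in range(n))
            if train[i][1] * score <= 0:  # misclassified (or undecided)
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # training set is fitted perfectly
            break
    return alpha


def predict(train, alpha, gamma, x):
    """Sign of the kernel expansion over the points that caused mistakes."""
    score = sum(a * y * gaussian_kernel(xi, x, gamma)
                for a, (xi, y) in zip(alpha, train) if a)
    return 1 if score > 0 else -1


def error_rate(train, alpha, gamma, data):
    """Fraction of points in `data` that the trained model misclassifies."""
    wrong = sum(1 for x, y in data if predict(train, alpha, gamma, x) != y)
    return wrong / len(data)


def compare_gammas(gammas=(0.5, 5.0, 500.0), n_train=80, n_test=200, seed=0):
    """Return {gamma: (in-sample error, out-of-sample error)}."""
    rng = random.Random(seed)
    train = make_toy_data(n_train, rng)
    test = make_toy_data(n_test, rng)
    results = {}
    for gamma in gammas:
        alpha = train_kernel_perceptron(train, gamma)
        results[gamma] = (error_rate(train, alpha, gamma, train),
                          error_rate(train, alpha, gamma, test))
    return results


if __name__ == "__main__":
    for gamma, (e_in, e_out) in compare_gammas().items():
        print(f"gamma={gamma:>6}: in-sample error={e_in:.3f}, "
              f"out-of-sample error={e_out:.3f}")
```

A large `gamma` makes the kernel very narrow, which lets the classifier fit individual training points. A gap between a low in-sample error and a higher out-of-sample error is the usual symptom of the overfitting that the abstract describes.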

Keywords: Computational social science, movie preference, machine learning, SVM.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1315909

