Unsupervised Text Mining Approach to Early Warning System
Authors: Ichihan Tai, Bill Olson, Paul Blessner
Abstract:
Traditional early warning systems that raise alarms ahead of a crisis are generally built on structured or numerical data. A system that makes predictions from unstructured text, a data source uncorrelated with those inputs, is therefore a valuable complement to them. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly called the fear index, measures the cost of insurance against a market crash and spikes when a crisis occurs. This study uses news data to predict whether a market-wide crisis is coming by predicting the movement of the fear index, and it presents historical references to similar events in an unsupervised manner. Both the prediction and the representation are based on topic modeling, using daily news from The Wall Street Journal between 1990 and 2015 together with VIX data from the CBOE.
Keywords: Early Warning System, Knowledge Management, Topic Modeling, Market Prediction.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1124099
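The idea in the abstract of retrieving similar historical events and predicting the direction of the fear index can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each day's news has already been reduced to a vector of topic proportions by some topic model, and it swaps in a simple cosine-similarity nearest-neighbour vote as the predictor. The function names, the `history` records, and all numbers are hypothetical and exist only for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length topic-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similar_days(query, history, k=2):
    """Return the k historical days whose topic mix is closest to `query`."""
    ranked = sorted(
        history,
        key=lambda day: cosine(query, day["topics"]),
        reverse=True,
    )
    return ranked[:k]


def predict_vix_up(query, history, k=3):
    """Majority vote over the k most similar days: did the VIX rise afterwards?"""
    neighbours = similar_days(query, history, k)
    ups = sum(1 for day in neighbours if day["vix_up"])
    return ups * 2 > len(neighbours)


# Hypothetical records: per-day topic proportions (as a topic model might
# produce) and whether the VIX moved up afterwards. Not real data.
history = [
    {"date": "day-A", "topics": [0.8, 0.1, 0.1], "vix_up": True},
    {"date": "day-B", "topics": [0.7, 0.2, 0.1], "vix_up": True},
    {"date": "day-C", "topics": [0.1, 0.8, 0.1], "vix_up": False},
    {"date": "day-D", "topics": [0.1, 0.1, 0.8], "vix_up": False},
]
today = [0.75, 0.15, 0.10]  # hypothetical topic mix of today's news

references = similar_days(today, history, k=2)  # historical look-alikes
alarm = predict_vix_up(today, history, k=3)     # predicted VIX direction
```

Here `references` plays the role of the historical references to similar events, and `alarm` is the early warning signal. Neither step needs hand-labeled news: the only labels are the index movements themselves.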