The Application of an Ensemble of Boosted Elman Networks to Time Series Prediction: A Benchmark Study
Authors: Chee Peng Lim, Wei Yee Goh
Abstract:
In this paper, the application of multiple Elman neural networks to time series regression problems is studied. An ensemble of Elman networks is formed by boosting to enhance the performance of the individual networks. A modified version of the AdaBoost algorithm is employed to integrate the predictions from multiple networks. Two benchmark time series data sets, i.e., the Sunspot and Box-Jenkins gas furnace problems, are used to assess the effectiveness of the proposed system. The simulation results reveal that an ensemble of boosted Elman networks achieves better generalization and prediction accuracy than the individual networks. The results are compared with those from other learning systems, and the implications of the performance are discussed.
Keywords: AdaBoost, Elman network, neural network ensemble, time series regression.
Digital Object Identifier (DOI): https://doi.org/10.5281/zenodo.1334357
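Illustrative sketch. The abstract does not spell out the modified AdaBoost variant used, so the following Python sketch pairs a minimal Elman (simple recurrent) network with an AdaBoost.R2-style reweighting loop for regression (Drucker, 1997), which is one plausible reading of the approach rather than the authors' exact algorithm. All class names, hyperparameters, and the combination rule are illustrative assumptions.

import numpy as np

class ElmanNet:
    # One-hidden-layer Elman network for one-step-ahead prediction.
    # Context units hold a copy of the previous hidden activations.
    def __init__(self, n_in, n_hidden, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.Wx = rng.normal(0.0, 0.5, (n_hidden, n_in))      # input -> hidden
        self.Wc = rng.normal(0.0, 0.5, (n_hidden, n_hidden))  # context -> hidden
        self.Wo = rng.normal(0.0, 0.5, n_hidden)              # hidden -> output
        self.lr = lr

    def _step(self, x, c):
        h = np.tanh(self.Wx @ x + self.Wc @ c)
        return h, self.Wo @ h

    def fit(self, X, y, epochs=50, sample_weight=None):
        n = len(y)
        w = np.full(n, 1.0 / n) if sample_weight is None else sample_weight
        for _ in range(epochs):
            c = np.zeros(len(self.Wo))                  # reset context each epoch
            for x_t, y_t, w_t in zip(X, y, w):
                h, y_hat = self._step(x_t, c)
                err = (y_hat - y_t) * w_t * n           # weighted squared-error gradient
                dh = err * self.Wo * (1.0 - h ** 2)     # backprop through tanh only;
                self.Wo -= self.lr * err * h            # the context is treated as a
                self.Wx -= self.lr * np.outer(dh, x_t)  # fixed extra input (Elman, 1990)
                self.Wc -= self.lr * np.outer(dh, c)
                c = h                                   # copy hidden state to context
        return self

    def predict(self, X):
        c, out = np.zeros(len(self.Wo)), []
        for x_t in X:
            c, y_hat = self._step(x_t, c)               # new hidden state becomes context
            out.append(y_hat)
        return np.array(out)

def boosted_elman_ensemble(X, y, n_nets=5, n_hidden=8):
    # AdaBoost.R2-style loop (Drucker, 1997): each network is trained on the
    # current sample weights, which are then concentrated on hard examples.
    n = len(y)
    w = np.full(n, 1.0 / n)
    nets, betas = [], []
    for m in range(n_nets):
        net = ElmanNet(X.shape[1], n_hidden, seed=m).fit(X, y, sample_weight=w)
        L = np.abs(net.predict(X) - y)
        L /= L.max() + 1e-12                            # per-sample loss in [0, 1]
        eps = np.sum(w * L)                             # weighted average loss
        if eps >= 0.5:                                  # learner too weak: stop boosting
            if not nets:
                nets.append(net); betas.append(0.5)     # keep at least one network
            break
        beta = eps / (1.0 - eps)
        w = w * beta ** (1.0 - L)                       # shrink weights of easy samples
        w /= w.sum()
        nets.append(net); betas.append(beta)
    coef = np.log(1.0 / np.array(betas))                # confidence of each network
    def predict(Xq):
        preds = np.array([net.predict(Xq) for net in nets])
        return coef @ preds / coef.sum()                # confidence-weighted average
    return predict

# Toy usage: one-step-ahead prediction of a noisy sine from two lagged values.
t = np.linspace(0.0, 20.0, 400)
s = np.sin(t) + 0.1 * np.random.default_rng(1).normal(size=t.size)
X, y = np.column_stack([s[:-2], s[1:-1]]), s[2:]
predict = boosted_elman_ensemble(X, y)
print("train MSE:", np.mean((predict(X) - y) ** 2))

Note that standard AdaBoost.R2 combines the learners by a weighted median of their outputs; the confidence-weighted average above is a simpler stand-in, and the paper's "modified" integration rule may differ from both.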