Evolutionary Training of Hybrid Systems of Recurrent Neural Networks and Hidden Markov Models
Authors: Rohitash Chandra, Christian W. Omlin
Abstract:
We present a hybrid architecture of recurrent neural networks (RNNs) inspired by hidden Markov models (HMMs) and train it using genetic algorithms to learn and represent dynamical systems. We train the architecture on a set of strings from deterministic finite-state automata and observe its generalization performance on a new set of strings that were not present in the training data. In this way, we show that the hybrid HMM-RNN system can learn and represent deterministic finite-state automata. We ran experiments with different population sizes in the genetic algorithm, and further experiments to determine which weight initializations were best for training the architecture. The results show that the hybrid architecture of recurrent neural networks inspired by hidden Markov models can learn and represent dynamical systems. The best training and generalization performance is achieved when the architecture is initialized with random real-valued weights in the range -15 to 15.
Keywords: Deterministic finite-state automata, genetic algorithms, hidden Markov models, hybrid systems, recurrent neural networks.
Digital Object Identifier (DOI): https://doi.org/10.5281/zenodo.1079498
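The abstract gives no implementation details, so the following Python sketch is only an illustration of the training regime it describes: a small first-order recurrent network whose real-valued weights are evolved by a genetic algorithm on strings drawn from a deterministic finite-state automaton, with weights initialized uniformly in [-15, 15], the range the abstract reports as best. The choice of DFA (the Tomita-1 language), the network size, and all genetic-algorithm parameters (population size, selection, crossover, and mutation settings) are assumptions for illustration, not the authors' settings.

import math
import random

# Illustrative DFA: the Tomita-1 language (strings over {0,1} consisting of
# 1s only) stands in for the automata used in the paper, which the abstract
# does not specify.
def dfa_accepts(s):
    return all(c == "1" for c in s)

def make_dataset(n, max_len=10):
    # Alternate guaranteed-positive and random strings so both classes appear.
    data = []
    for k in range(n):
        length = random.randint(1, max_len)
        s = "1" * length if k % 2 == 0 else "".join(random.choice("01") for _ in range(length))
        data.append((s, dfa_accepts(s)))
    return data

HIDDEN = 4  # hidden-state size: an assumption, not the paper's setting

def n_weights():
    # input->hidden, hidden->hidden (recurrent), hidden bias,
    # hidden->output, output bias
    return HIDDEN + HIDDEN * HIDDEN + HIDDEN + HIDDEN + 1

def rnn_accept_prob(w, s):
    # Run a first-order recurrent network over the string; the sigmoid
    # output is read as the probability that the string is accepted.
    i = 0
    w_in = w[i:i + HIDDEN]; i += HIDDEN
    w_rec = [w[i + j * HIDDEN:i + (j + 1) * HIDDEN] for j in range(HIDDEN)]; i += HIDDEN * HIDDEN
    b_h = w[i:i + HIDDEN]; i += HIDDEN
    w_out = w[i:i + HIDDEN]; i += HIDDEN
    b_out = w[i]
    h = [0.0] * HIDDEN
    for c in s:
        x = float(c)
        h = [math.tanh(w_in[j] * x + sum(w_rec[j][k] * h[k] for k in range(HIDDEN)) + b_h[j])
             for j in range(HIDDEN)]
    z = sum(w_out[j] * h[j] for j in range(HIDDEN)) + b_out
    return 1.0 / (1.0 + math.exp(-z))

def accuracy(w, data):
    # Fraction of strings classified correctly.
    return sum((rnn_accept_prob(w, s) > 0.5) == y for s, y in data) / len(data)

def evolve(data, pop_size=40, generations=100, init_range=15.0,
           mut_rate=0.1, mut_std=1.0):
    # Weights initialized uniformly in [-15, 15], the range the abstract
    # reports as giving the best training and generalization performance.
    pop = [[random.uniform(-init_range, init_range) for _ in range(n_weights())]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: accuracy(w, data), reverse=True)
        elite = pop[:pop_size // 2]  # truncation selection
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = random.sample(elite, 2)
            child = [x if random.random() < 0.5 else y for x, y in zip(a, b)]  # uniform crossover
            child = [g + random.gauss(0.0, mut_std) if random.random() < mut_rate else g
                     for g in child]  # Gaussian mutation
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda w: accuracy(w, data))

if __name__ == "__main__":
    train, test = make_dataset(100), make_dataset(50)
    best = evolve(train)
    print("train accuracy:", accuracy(best, train))
    print("test accuracy:", accuracy(best, test))

In practice one would balance accepted and rejected strings more carefully and sweep the population size, as the paper's experiments do; the sketch shows only the shape of the evolutionary loop and the role of the weight-initialization range.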