An Improved Conjugate Gradient Based Learning Algorithm for Back Propagation Neural Networks
Authors: N. M. Nawi, R. S. Ransing, M. R. Ransing
Abstract:
The conjugate gradient optimization algorithm is combined with a modified back-propagation algorithm to yield a computationally efficient algorithm (CGFR/AG) for training multilayer perceptron (MLP) networks. Computational efficiency is improved by adaptively modifying the initial search direction in three steps: (1) the standard back-propagation algorithm is modified by introducing a gain variation term in the activation function; (2) the gradient of the error with respect to the weights and gain values is calculated; and (3) a new search direction is determined using the information calculated in step (2). The performance of the proposed method is demonstrated by comparing its accuracy and computation time with those of the conjugate gradient algorithm in the MATLAB Neural Network Toolbox. The results show that the proposed method is computationally more efficient than the standard conjugate gradient algorithm.
Keywords: Adaptive gain variation, back-propagation, activation function, conjugate gradient, search direction.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1075778
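The three steps listed in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes a single logistic neuron with gain c, a squared-error loss, and the Fletcher-Reeves update for the conjugate direction (suggested by the "CGFR" label). The function names sigmoid, gradients, and fletcher_reeves_direction are hypothetical and chosen only for this illustration.

```python
import math


def sigmoid(net, gain):
    """Step (1): logistic activation with a gain term, 1 / (1 + exp(-gain * net))."""
    return 1.0 / (1.0 + math.exp(-gain * net))


def gradients(w, b, gain, x, target):
    """Step (2): gradient of E = 0.5 * (target - out)^2 for one neuron.

    Returns [dE/dw_1, ..., dE/dw_n, dE/db, dE/dgain].
    Assumption: a single neuron stands in for the full MLP.
    """
    net = sum(wi * xi for wi, xi in zip(w, x)) + b
    out = sigmoid(net, gain)
    # dE/d(gain * net); the chain rule then splits it between weights and gain.
    delta = -(target - out) * out * (1.0 - out)
    grad_w = [delta * gain * xi for xi in x]
    grad_b = delta * gain
    grad_gain = delta * net
    return grad_w + [grad_b, grad_gain]


def fletcher_reeves_direction(grad, prev_grad=None, prev_dir=None):
    """Step (3): search direction from the current gradient.

    The first iteration uses steepest descent; later iterations use
    d = -g + beta * d_prev with the Fletcher-Reeves beta (an assumption here).
    """
    if prev_grad is None or prev_dir is None:
        return [-g for g in grad]
    denom = sum(g * g for g in prev_grad)
    beta = sum(g * g for g in grad) / denom if denom > 0.0 else 0.0
    return [-g + beta * d for g, d in zip(grad, prev_dir)]


# Usage: one gradient evaluation and the initial search direction.
g0 = gradients([0.3, -0.2], 0.1, 1.5, [1.0, 2.0], 1.0)
d0 = fletcher_reeves_direction(g0)
```

Because the gain enters the gradient vector alongside the weights and bias, the initial direction d0 already depends on the gain value, which is the effect the abstract describes. A full implementation would also need a line search along each direction, which is omitted here.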