Margin-Based Feed-Forward Neural Network Classifiers

Authors: Han Xiao, Xiaoyan Zhu

Abstract:

The margin-based principle was proposed long ago and has been shown, both theoretically and empirically, to reduce structural risk and improve performance. Meanwhile, the feed-forward neural network is a traditional classifier that is currently very popular in its deeper architectures. However, the training algorithm for feed-forward neural networks derives from the Widrow-Hoff principle, which minimizes the squared error. In this paper, we propose a new training algorithm for feed-forward neural networks based on the margin-based principle, which can effectively improve the accuracy and generalization ability of neural network classifiers with fewer labelled samples and a flexible network. We conducted experiments on four UCI open datasets and achieved good results, as expected. In conclusion, our model can handle datasets with sparser labels and higher dimensionality at high accuracy, while switching from a conventional ANN method to our method is easy and requires almost no extra work.
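
The contrast between a squared-error objective and a margin-based objective can be illustrated with a minimal sketch. The abstract does not state the exact loss used in the paper, so the code below assumes a generic multiclass hinge loss as a stand-in; the function names `squared_error` and `multiclass_hinge`, the `margin` parameter, and the sample scores are all hypothetical and chosen only for illustration.

```python
def squared_error(scores, target):
    """Widrow-Hoff style loss: squared distance to a one-hot target."""
    return sum((s - (1.0 if k == target else 0.0)) ** 2
               for k, s in enumerate(scores))


def multiclass_hinge(scores, target, margin=1.0):
    """Generic margin loss (an assumption, not the paper's formula):
    penalize every wrong class whose score comes within `margin`
    of the true class score."""
    true_score = scores[target]
    return sum(max(0.0, margin - (true_score - s))
               for k, s in enumerate(scores) if k != target)


# Hypothetical output-layer scores for one sample with three classes.
# Class 0 is the correct label in both cases.
confident = [3.0, 0.5, -1.0]   # correct class wins by a wide margin
borderline = [1.0, 0.8, -1.0]  # correct class wins only narrowly

for name, scores in (("confident", confident), ("borderline", borderline)):
    print(name, squared_error(scores, 0), multiclass_hinge(scores, 0))
```

In this sketch the squared error is larger for the confidently correct sample (5.25) than for the borderline one (about 1.64), because it penalizes any deviation from the one-hot target. The hinge loss is zero for the confident sample and about 0.8 for the borderline one, so training effort goes to the samples that sit close to the decision boundary.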

Keywords: Max-Margin Principle, Feed-Forward Neural Network, Classifier.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1106373

