**Commenced**in January 2007

**Frequency:**Monthly

**Edition:**International

**Paper Count:**30382

##### On Speeding Up Support Vector Machines: Proximity Graphs Versus Random Sampling for Pre-Selection Condensation

**Authors:**
Xiaohua Liu,
Juan F. Beltran,
Nishant Mohanchandra,
Godfried T. Toussaint

**Abstract:**

**Keywords:**
Data Mining,
Machine Learning,
Support Vector Machines,
random sampling,
proximity graphs,
relative-neighborhood graphs,
k-nearestneighbor graphs,
training data condensation

**Digital Object Identifier (DOI):**
doi.org/10.5281/zenodo.1061950

**References:**

[1] M.B. Almeida, A.P. Braga, and J.P. Braga, "SVM-KM: speeding SVMs learning with a priori cluster selection and k-means," In: Proceedings of the 6th Brazilian Symposium on Neural Networks, pp. 162-167, 2000.

[2] F. Angiulli, "Fast nearest neighbor condensation for large data sets classification," IEEE Transactions on Knowledge and Data Engineering, vol. 19, no. 11, pp. 1450-1464, Nov. 2007.

[3] F. Angiulli and A. Astorino, "Scaling up support vector machines using nearest neighbor condensation," IEEE Transactions on Neural Networks, vol. 21, no. 2, pp. 351-357, February 2010.

[4] B. Bhattacharya, K. Mukherjee and G.T. Toussaint, "Geometric decision rules for instance-based learning algorithms," Proc. Pattern Recognition and Machine Intelligence: First International Conference, S. K. Pal et al., (Eds.): LNCS 3776, Kolkata, India, pp. 60-69, December 20-22, 2005.

[5] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, pp. 123- 140, 1996.

[6] D. Bryant and V. Moulton, "NeighborNet: An agglomerative algorithm for the construction of phylogenetic networks," Molecular Biology and Evolution, vol. 21, no. 2, pp. 255-265, 2004.

[7] C.J.C. Burges and B. Scholkopf, "Improving the accuracy and speed of support vector learning machines," In Proceedings of the 9th NIPS Conference, pages 375-381, 1997.

[8] O. Chapelle, V. Vapnik, O. Bousquet, and S. Mukherjee, "Choosing multiple parameters for support vector machines," Machine Learning, vol. 46, pp. 131-159, 2002.

[9] J. Chen and C.-L. Liu, "Fast multi-class sample reduction for speeding up support vector machines," Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing, Beijing, China, September 18-21, 2011.

[10] T.M. Cover and P.E. Hart, "Nearest neighbor pattern classification," IEEE Transactions on Information Theory, vol. 13, pp. 21-27, 1967.

[11] B.V. Dasarathi, "Tandem fusion of nearest neighbor editing and condensing algorithms - data dimensionality effects," Proceedings of the 15th International Conference on Pattern Recognition, vol. 2, pp. 692- 695, 2000.

[12] T. Downs, K.E. Gates, and A. Masters, "Exact simplification of support vector solutions," Journal of Machine Learning Research, vol. 2, pp. 293-297, 2001.

[13] S. Fine and K. Scheinberg, "Ecient SVM training using low-rank kernel representation, Journal of Machine Learning Research, pp. 243- 264, 2009.

[14] O. Gascuel, "BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data," Molecular Biology and Evolution, vol. 14, no. 7, pp. 685-695, 1997.

[15] E. Gomez and P. Herrera, "Comparative analysis of music recordings from Western and Non-Western traditions by automatic tonal feature extraction," Empirical Musicology Review, vol. 3, pp. 140-156, 2008.

[16] D. Han, C. Han, Y. Yang, Y. Liu, and W. Mao, "Pre-extracting method for SVM classification based on the non-parametric K-NN rule," Proceedings of the 19th International Conference on Pattern Recognition, pp. 1-4, Dec. 8-11, 2008.

[17] P.E. Hart, "The condensed nearest neighbor rule," IEEE Transactions on Information Theory, vol. 14, pp. 515-516, 1968.

[18] D.H. Huson, "SplitsTree: A program for analyzing and visualizing evolutionary data," Bioinformatics, vol. 14, no. 10, pp. 68-73, 1998.

[19] J.W. Jaromczyk and G.T. Toussaint, "Relative neighborhood graphs and their relatives," Proc. of the IEEE, vol. 80, no. 9, September, pp. 1502- 1517, 1992.

[20] T. Joachims, "Making large-scale SVM learning practical," Advances, In: Kernel Methods - Support Vector Learning, MIT-Press, Cambridge, MA, 1999.

[21] S.S. Keerthi, O. Chapelle, and D. DeCoste, "Building support vector machines with reduced classifier complexity," Journal of Machine Learning Research, vol. 7, pp. 1493-1515, 2006.

[22] S. S. Keerthi, S. K. Shevade, C. Bhattacharyya, and K. R. K. Murthy, "Improvements to Platt-s SMO algorithm for SVM classifier design," Neural Computation, vol. 13, pp. 637-649, 2001.

[23] R. Koggalage and S. Halgamuge, "Reducing the number of training samples for fast support vector machine classification," Neural Information Processing - Letters and Reviews, vol. 2, no. 3, pp. 57-65, March 2004.

[24] Y.J. Lee and O.L. Mangasarian, "RSVM: Reduced support vector machines," In Proceedings of the First SIAM International Conference on Data Mining, SIAM, Chicago, April 5-7, 2001.

[25] D. Li and S. Simske, "Training set compression by incremental clustering," Journal of Pattern Recognition Research, vol. 1, pp. 56-64, 2011.

[26] X. Liang, R.-C. Chen, and X. Guo, "Pruning support vector machines without altering performances," IEEE Transactions on Neural Networks, vol. 19, no. 10, pp. 1792-1803, October 2008.

[27] M.E. Mavroforakis and S. Theodoridis, "A geometric approach to support vector machine (SVM) classification," IEEE Transactions on Neural Networks, vol. 17, no. 3, pp. 671-682, May 2006.

[28] E. Mayr, "Cladistic analysis or cladistic classification?" Zeitschrift fuer Zoologische Systematik und Evolutionsforschung, vol. 12, pp. 94-128, 1974.

[29] D.H. Nghi and L.C. Mai, "Training data selection for support vector machines model," Proceedings of the International Conference on Information and Electronics Engineering, vol. 6, Press, Singapore, 2011.

[30] N. Panda, E. Y. Chang, and G. Wu, "Concept boundary detection for speeding up SVMs," Proceedings of the 23 International Conference on Machine Learning, Pittsburgh, PA, 2006.

[31] J.C. Platt, "Fast training of support vector machines using sequential minimal optimization," In B. Scholkopf, C. J. C. Burges, and A. J. Smola, editors, Advances in Kernel Methods - Support Vector Learning, Cambridge, MA, MIT Press, 1998.

[32] T. Thies and F. Weber, "Optimal reduced-set vectors for support vector machines with a quadratic kernel," Neural Computation, vol. 16, pp. 1769-1777, 2004.

[33] G.T. Toussaint, B. K. Bhattacharya, and R. S. Poulsen, "The application of Voronoi diagrams to nonparametric decision rules," Proc. Computer Science and Statistics: 16th Symposium on the Interface, Atlanta, Georgia, March 14-16,1984, Published by North-Holland in 1985, Amsterdam, L. Billard, Ed., pp. 97-108.

[34] G.T. Toussaint and C. Berzan, "Proximity-graph instance-based learning, support vector machines, and high dimensionality: An empirical comparison." Proceedings of the Eighth International Conference on Machine Learning and Data Mining, July 16-19, 2012, Berlin, Germany. P. Perner (Ed.): MLDM 2012, LNAI 7376, pp. 222- 236, 2012. Springer-Verlag Berlin Heidelberg.

[35] G.T. Toussaint, "Geometric proximity graphs for improving nearest neighbor methods in instance-based learning and data mining," International Journal of Computational Geometry and Applications, vol. 15, April 2005, pp. 101-150.

[36] G.T. Toussaint, "The relative neighborhood graph of a finite planar set," Pattern Recognition, vol. 12, pp. 261-268, 1980.

[37] I. W. Tsang, J. T. Kwok, and P.-M. Cheung, "Core vector machines: Fast SVM training on very large data sets," Journal of Machine Learning Research, vol. 6, pp. 363-392, 2005.

[38] V. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, NY, 1995.

[39] J. Wang, P. Neskovic, L. N. Cooper, "Selecting data for fast support vector machine training," Studies in Computational Intelligence (SCI), vol. 35, pp. 61-84, 2007.

[40] Y. Wang, C.G. Zhou, Y.X. Huang, Y.C. Liang, and X.W. Yang, "A boundary method to speed up training support vector machines," In: G. R. Liu et al. (eds), Computational Methods, Springer, Printed in the Netherlands, pp. 1209-1213, 2006.

[41] D.L. Wilson, "Asymptotic properties of nearest neighbor rules using edited-data," IEEE Transactions on Systems, Man, and Cybernetics, vol. 2, pp. 408-421, 1973.

[42] M.H. Yang and N. Ahuja, "A geometric approach to train support vector machines," In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2000 (CVPR 2000), June, Hilton Head Island, pp. 430-437, 2000.

[43] W. Zhang and I. King, "Locating support vectors via β-skeleton technique," In: Proceedings of the International Conference on Neural Information Processing (ICONIP), pp. 1423-1427, 2002.

[44] W. Zhang and I. King, "A study of the relationship between support vector machine and Gabriel graph." In: Proceedings of the IEEE International Joint Conference on Neural Networks, IJCNN, Honolulu, vol. 1, pp. 239-244, 2002.