A Family of Distributions on Learnable Problems without Uniform Convergence
Authors: César Garza
Abstract:
In supervised binary classification and regression problems, it is well-known that learnability is equivalent to uniform convergence of the hypothesis class, and if a problem is learnable, it is learnable by empirical risk minimization. For the general learning setting of unsupervised learning tasks, there are non-trivial learning problems where uniform convergence does not hold. We present here the task of learning centers of mass with an extra feature that “activates” some of the coordinates over the unit ball in a Hilbert space. We show that the learning problem is learnable under a stable RLM rule. We introduce a family of distributions over the domain space with some mild restrictions for which the sample complexity of uniform convergence for these problems must grow logarithmically with the dimension of the Hilbert space. If we take this dimension to infinity, we obtain a learnable problem for which the uniform convergence property fails for a vast family of distributions.
Keywords: Statistical learning theory, learnability, uniform convergence, stability, regularized loss minimization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 365References:
[1] V. N. Vapnik and A. Y. Chervonenkis, “On the uniform convergence of relative frequencies of events to their probabilities,” in Measures of complexity, pp. 11–30, Springer, Cham, 2015. Reprint of Theor. Probability Appl. 16 (1971), 264–280.
[2] M. J. Kearns, R. E. Schapire, and L. M. Sellie, “Toward efficient agnostic learning,” Machine Learning, vol. 17, pp. 115–141, 1994.
[3] B. K. Natarajan, “On learning sets and functions,” Machine Learning, vol. 4, pp. 67–97, 2004.
[4] S. Shalev-Shwartz, O. Shamir, N. Srebro, and K. Sridharan, “Learnability, stability and uniform convergence,” J. Mach. Learn. Res., vol. 11, pp. 2635–2670, 2010.
[5] S. Shalev-Shwartz and S. Ben-David, Understanding Machine Learning - From Theory to Algorithms. Cambridge University Press, 2014.
[6] N. Alon, S. Ben-David, N. Cesa-Bianchi, and D. Haussler, “Scale-sensitive dimensions, uniform convergence, and learnability,” J. ACM, vol. 44, no. 4, pp. 615–631, 1997.