Neural networks with quadratic VC dimension. (English) Zbl 0869.68089

Summary: This paper shows that neural networks using continuous activation functions have VC dimension at least as large as the square of the number of weights \(w\). This result settles a long-standing open question, namely whether the well-known \(O(w\log w)\) bound, known for hard-threshold nets, also holds for more general sigmoidal nets. Implications for the number of samples needed for valid generalization are discussed.
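The notion of shattering behind the VC dimension can be made concrete with a small sketch (not from the paper; points, parameter grid, and function names here are illustrative assumptions): a single hard-threshold unit on the plane has three parameters, and it realizes all \(2^3 = 8\) labelings of three points in general position. Bounds such as the \(O(w\log w)\) one for threshold nets, and the \(w^2\) lower bound shown here for continuous activations, count how large such a shatterable point set can be as a function of the number of weights.

```python
import itertools

# Three points in general position in R^2.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

def predict(w1, w2, b, x, y):
    """Output of a single hard-threshold (Heaviside) unit."""
    return 1 if w1 * x + w2 * y + b > 0 else 0

# A coarse parameter grid suffices to witness every dichotomy of these points.
grid = [i / 2 for i in range(-6, 7)]  # -3.0, -2.5, ..., 3.0

shattered = True
for labeling in itertools.product([0, 1], repeat=3):
    realized = any(
        all(predict(w1, w2, b, *p) == lab for p, lab in zip(points, labeling))
        for w1 in grid for w2 in grid for b in grid
    )
    if not realized:
        shattered = False

# All 8 labelings are realizable, so these 3 points are shattered.
print("all 8 dichotomies realized:", shattered)
```

The same unit cannot shatter any 4 points in the plane, so its VC dimension is 3 (the number of parameters plus zero here, i.e. \(d+1\) for inputs in \(\mathbb{R}^d\)); the paper's point is that replacing the hard threshold by a continuous activation can inflate this capacity to order \(w^2\).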

MSC:

68T05 Learning and adaptive systems in artificial intelligence
