
On radial basis function nets and kernel regression: Statistical consistency, convergence rates, and receptive field size. (English) Zbl 0817.62031

Summary: Useful connections between radial basis function (RBF) nets and kernel regression estimators (KRE) are established. By using existing theoretical results obtained for KRE as tools, we obtain a number of interesting theoretical results for RBF nets. Upper bounds are presented for convergence rates of the approximation error with respect to the number of hidden units. The existence of a consistent estimator for RBF nets is proven constructively. Upper bounds are also provided for the pointwise and \(L_2\) convergence rates of the best consistent estimator for RBF nets as both the number of samples and the number of hidden units tend to infinity.
Moreover, the problem of selecting an appropriate receptive field size for the radial basis functions is investigated theoretically, and the ways in which various factors influence this selection are elaborated. In addition, some results are given for the convergence of the empirical error obtained by the least squares estimator for RBF nets.
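The KRE–RBF connection the summary builds on can be sketched concretely: a Nadaraya-Watson kernel regression estimate with a Gaussian kernel coincides with a normalized RBF net whose units are centred at the sample points and whose output weights are the observed responses. The sketch below is illustrative only; the target function, sample sizes, and bandwidth are assumptions, not taken from the paper.

```python
import numpy as np

def nadaraya_watson(x, X, Y, h):
    """Nadaraya-Watson kernel regression estimate with a Gaussian kernel.

    The estimate is a weighted average of the responses Y, with weights
    given by a kernel centred at each sample X[i]; the bandwidth h plays
    the role of the receptive field size in RBF terminology.
    """
    w = np.exp(-0.5 * ((x - X) / h) ** 2)  # Gaussian kernel weights
    return np.sum(w * Y) / np.sum(w)       # normalised weighted average

# Illustrative data: noisy samples of a smooth target function.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 1.0, 200))
Y = np.sin(2 * np.pi * X) + 0.1 * rng.normal(size=200)

# A normalised RBF net with one Gaussian unit per sample, centres at the
# samples, and output weights equal to the responses evaluates to exactly
# this same function, which is the correspondence exploited above.
est = nadaraya_watson(0.25, X, Y, h=0.05)
```

Shrinking `h` drives the estimate toward interpolation of the noisy samples, while enlarging it oversmooths; this bias/variance trade-off is what makes the receptive field size selection studied in the paper non-trivial.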

MSC:

62G07 Density estimation
62G20 Asymptotic properties of nonparametric inference
68T05 Learning and adaptive systems in artificial intelligence

References:

[1] Barron, A. R., Approximation and estimation bounds for artificial neural networks, (Proceedings of 4th Annual Workshop on Computational Learning Theory (1991), Morgan Kaufmann: Morgan Kaufmann San Mateo, CA), 243-249
[2] Bennett, G., Probability inequalities for the sums of independent random variables, Journal of the American Statistical Association, 57, 33-45 (1962) · Zbl 0104.11905
[3] Botros, S. M.; Atkeson, C. G., Generalization properties of radial basis functions, (Lippmann, R. P.; Moody, J. E.; Touretzky, D. S., Advances in neural information processing systems 3 (1991), Morgan Kaufmann: Morgan Kaufmann San Mateo), 707-713
[4] Broomhead, D. S.; Lowe, D., Multivariable functional interpolation and adaptive networks, Complex Systems, 2, 321-323 (1988) · Zbl 0657.68085
[5] Chen, S.; Cowan, C. F.N.; Grant, P. M., Orthogonal least squares learning algorithm for radial basis function networks, IEEE Transactions on Neural Networks, 2, 302-309 (1991)
[6] Chow, Y. S.; Teicher, H., Probability theory, independence, interchangeability, martingale (1978), Springer: Springer New York · Zbl 0399.60001
[7] Corradi, V.; White, H., Regularized neural networks: Some convergence rate results (1992), University of California, Department of Economics: University of California, Department of Economics San Diego, Unpublished manuscript
[8] Devroye, L., A course in density estimation (1987), Birkhauser: Birkhauser Boston · Zbl 0617.62043
[9] Devroye, L., On the almost everywhere convergence of non-parametric regression function estimates, The Annals of Statistics, 9, 1310-1319 (1981) · Zbl 0477.62025
[10] Devroye, L.; Krzyzak, A., An equivalence theorem for \(L_1\) convergence of the kernel regression estimate, Journal of Statistical Planning and Inference, 23, 71-82 (1989) · Zbl 0686.62027
[11] Geman, S.; Bienenstock, E.; Doursat, R., Neural networks and the bias/variance dilemma, Neural Computation, 4, 1-58 (1992)
[12] Girosi, F.; Anzellotti, G., Convergence rates of approximation by translates, (MIT AI Memo, No. 1288 (1992), MIT: MIT Cambridge)
[13] Girosi, F.; Poggio, T., Networks and the best approximation property, (MIT AI Memo, No. 1164 (1989), MIT: MIT Cambridge) · Zbl 0714.94029
[14] Girosi, F.; Poggio, T., Networks and the best approximation property, Biological Cybernetics, 63, 169-176 (1990) · Zbl 0714.94029
[15] Greblicki, W.; Krzyzak, A.; Pawlak, M., Distribution-free consistency of kernel regression estimate, The Annals of Statistics, 12, 1570-1575 (1984) · Zbl 0551.62025
[16] Hartman, E. J.; Keeler, J. D.; Kowalski, J. M., Layered neural networks with Gaussian hidden units as universal approximations, Neural Computation, 2, 210-215 (1990)
[17] Hoeffding, W., Probability inequalities for sums of bounded random variables, Journal of the American Statistical Association, 58, 13-30 (1963) · Zbl 0127.10602
[18] Hornik, K., Approximation capabilities of multilayer feedforward networks, Neural Networks, 4, 251-257 (1991)
[19] Hornik, K.; Stinchcombe, M.; White, H., Multilayer feedforward networks are universal approximators, Neural Networks, 2, 359-366 (1989) · Zbl 1383.92015
[20] Jones, R. D.; Lee, Y. C.; Barnes, C. W., Information theoretic derivation of network architecture and learning algorithms, (Proceedings of International Joint Conference on Neural Networks, Vol. 1 (1991)), 473-478, 1991. Seattle
[21] Kadirkamanathan, V.; Niranjan, M.; Fallside, F., Sequential adaptation of radial basis function neural networks and its application to time-series prediction, (Lippmann, R. P.; Moody, J. E.; Touretzky, D. S., Advances in neural information processing systems 3 (1991), Morgan Kaufmann: Morgan Kaufmann San Mateo), 721-727
[22] Kraaijveld, M. A.; Duin, R. P.W., Generalization capabilities of minimal kernel-based networks, (Proceedings of International Joint Conference on Neural Networks, Vol. 1 (1991)), 843-848, 1991. Seattle
[23] Krzyżak, A., The rates of convergence of kernel regression estimates and classification rules, IEEE Transactions on Information Theory, 32, 668-679 (1986) · Zbl 0614.62050
[24] Krzyżak, A., On exponential bounds on the Bayes risk of the kernel classification rule, IEEE Transactions on Information Theory, 37, 490-499 (1991) · Zbl 0742.62006
[25] Krzyżak, A.; Pawlak, M., The pointwise rate of convergence of the kernel regression estimate, Journal of Statistical Planning and Inference, 16, 159-166 (1987) · Zbl 0616.62050
[26] Lugosi, G.; Zeger, K., Nonparametric estimation via empirical risk minimization (1993), Manuscript submitted for publication
[27] Mel, B. W.; Omohundro, S. M., How receptive field parameters affect neural learning, (Lippmann, R. P.; Moody, J. E.; Touretzky, D. S., Advances in neural information processing systems 3 (1991), Morgan Kaufmann: Morgan Kaufmann San Mateo), 757-763
[28] Moody, J.; Darken, C., Fast learning in networks of locally-tuned processing units, Neural Computation, 1, 281-294 (1989)
[29] Nowlan, S. J., Max likelihood competition in RBF networks, (Tech. Rep. CRG-TR-90-2 (1990), University of Toronto, Department of Computer Science)
[30] Park, J.; Sandberg, I. W., Universal approximation using radial-basis-function networks, Neural Computation, 3, 246-257 (1991)
[31] Park, J.; Sandberg, I. W., Universal approximation using radial-basis-function networks, Neural Computation, 5, 305-316 (1993)
[32] Platt, J. C., Learning by combining memorization and gradient descent, (Lippmann, R. P.; Moody, J. E.; Touretzky, D. S., Advances in neural information processing systems 3 (1991), Morgan Kaufmann: Morgan Kaufmann San Mateo), 714-720
[33] Poggio, T.; Girosi, F., A theory of networks for approximation and learning, (MIT AI Memo. No. 1140 (1989), MIT: MIT Cambridge) · Zbl 1226.92005
[34] Poggio, T.; Girosi, F., Networks for approximation and learning, (Proceedings of the IEEE, 78 (1990)), 1481-1497 · Zbl 1226.92005
[35] Powell, M. J.D., Radial basis functions for multivariable interpolation: A review, (Mason, J. C.; Cox, M. G., Algorithms for approximation (1987), Clarendon Press: Clarendon Press Oxford) · Zbl 0638.41001
[36] Renals, S.; Rohwer, R., Phoneme classification experiments using radial basis functions, (Proceedings of International Joint Conference on Neural Networks, Vol. 1 (1989)), 462-467, 1989. Washington, DC
[37] Specht, D. F., Probabilistic neural networks, Neural Networks, 3, 109-118 (1990)
[38] Stokbro, K.; Umberger, D. K.; Hertz, J. A., Exploiting neurons with localized receptive fields to learn chaos (1990), Nordita: Nordita Copenhagen, Denmark, Preprint 90/28 S · Zbl 0717.92004
[39] Weymaere, N.; Martens, J., A fast robust learning algorithm for feed-forward neural networks, Neural Networks, 4, 361-369 (1991)
[40] Wheeden, R. L.; Zygmund, A., Measure and integral (1977), Marcel Dekker: Marcel Dekker New York
[41] White, H., Connectionist nonparametric regression: Multilayer feedforward networks that can learn arbitrary mappings, Neural Networks, 3, 535-549 (1990)
[42] Xu, L.; Klasa, S.; Yuille, A. L., Recent advances on techniques of static feedforward networks with supervised learning, International Journal of Neural Systems, 3, 3, 253-290 (1992)
[43] Xu, L.; Krzyżak, A.; Oja, E., Rival penalized competitive learning for clustering analysis, RBF net and curve detection, IEEE Transactions on Neural Networks, 4, 636-649 (1993)
[44] Xu, L.; Krzyżak, A.; Yuille, A. L., On radial basis function nets and kernel regression: Approximation ability, convergence rate and receptive field size, (Tech. Rep. No. 92-4 (1992), Harvard University, Harvard Robotics Laboratory) · Zbl 0817.62031
[45] Yuille, A. L.; Grzywacz, N. M., A mathematical analysis of the motion coherence theory, International Journal of Computer Vision, 3, 155-175 (1989)