
Fully Bayesian analysis of the relevance vector machine with an extended hierarchical prior structure. (English) Zbl 1213.62043

Summary: This paper proposes an extended hierarchical hyperprior structure for kernel regression, with the goal of solving the so-called Neyman-Scott problem [J. Neyman and E. L. Scott, Econometrica 16, 1–32 (1948; Zbl 0034.07602)] inherent in the now very popular relevance vector machine (RVM). We conjecture that the proposed prior helps achieve consistent estimates of the quantities of interest, thereby overcoming a limitation of the original RVM, whose corresponding estimates are shown to be inconsistent. Unlike the majority of other authors in this area, who typically use an empirical Bayes approach for the RVM, we adopt a fully Bayesian approach. Our consistency claim remains at this stage only a conjecture, to be proved theoretically in a subsequent paper; here, however, we use a computational argument to demonstrate the merits of the proposed solution.
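
For orientation, a minimal sketch of the standard RVM hierarchy of M. E. Tipping [Journal of Machine Learning Research 1, 211-244 (2001; Zbl 0997.68109)], on which the proposed extension builds (the exact form of the additional hyperprior levels is specified in the paper and is not reproduced here):
\[
y_i \mid \mathbf{w}, \sigma^2 \sim \mathcal{N}\bigl(\boldsymbol{\phi}(x_i)^{\top}\mathbf{w},\ \sigma^2\bigr), \qquad
w_j \mid \alpha_j \sim \mathcal{N}\bigl(0,\ \alpha_j^{-1}\bigr), \qquad
\alpha_j \sim \mathrm{Gamma}(a, b), \qquad
\sigma^{-2} \sim \mathrm{Gamma}(c, d),
\]
where \(\boldsymbol{\phi}(x_i) = (1, K(x_i, x_1), \ldots, K(x_i, x_n))^{\top}\) collects the kernel basis functions evaluated at the training inputs. In the empirical Bayes treatment the \(\alpha_j\) and \(\sigma^2\) are set by maximizing the marginal likelihood; since there is one \(\alpha_j\) per weight, the number of hyperparameters grows with the sample size, which is the incidental-parameter feature behind the Neyman-Scott problem referred to above. The fully Bayesian approach advocated here instead retains the full (extended) hyperprior hierarchy and integrates over all levels.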

MSC:

62F15 Bayesian inference
62G08 Nonparametric regression and quantile regression
68T05 Learning and adaptive systems in artificial intelligence
65C60 Computational problems in statistics (MSC2010)

Citations:

Zbl 0034.07602

References:

[1] Barron, A.; Schervish, M. J.; Wasserman, L., The consistency of posterior distributions in nonparametric problems, The Annals of Statistics, 27, 2, 536-561 (1999) · Zbl 0980.62039
[2] G. Camps-Valls, et al., Relevance vector machines for sparse learning of biophysical parameters, in: Proceedings of SPIE, the International Society of Optical Engineering, Image and Signal Processing for Remote Sensing, vol. 5982, 2005, pp. 59820Z.1-59820Z.12.
[3] Camps-Valls, G., Retrieval of oceanic chlorophyll concentration with relevance vector machines, Remote Sensing of Environment, 105, 1, 23-33 (2006)
[4] Chen, S., The relevance vector machine technique for channel equalization application, IEEE Transactions on Neural Networks, 12, 6, 1529-1532 (2001)
[5] T. Choi, R.V. Ramamoorthi, Remarks on consistency of posterior distributions, in: IMS Collections: Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh, vol. 3, IMS, 2008, pp. 170-186, doi:10.1214/074921708000000138
[6] N. Dasgupta, et al., Relevance vector machine quantization and density function estimation: application to HMM-based multi-aspect text classification, Technical Report, Duke University, 2007.
[7] Diaconis, P.; Freedman, D., On the Consistency of Bayes Estimates, The Annals of Statistics, 14, 1, 1-26 (1986) · Zbl 0595.62022
[8] Diaconis, P. W.; Freedman, D., Consistency of Bayes estimates for nonparametric regression: normal theory, Bernoulli, 4, 4, 411-444 (1998) · Zbl 1037.62031
[9] D’Souza, A., The Bayesian backfitting relevance vector machine, (Proceedings of the 21st International Conference on Machine Learning (2004), Banff, Canada)
[10] Figueiredo, M. A. T., Adaptive sparseness for supervised learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, 25, 1150-1159 (2003)
[11] J.P. Florens, A. Simoni, Regularized posteriors in linear ill-posed inverse problems, Technical Report, Toulouse School of Economics, Toulouse, France, 2008. · Zbl 1246.62039
[12] Fokoué, E., Estimation of atom prevalence for optimal prediction, Contemporary Mathematics, 443, 103-129 (2008) · Zbl 1136.62026
[13] E. Fokoué, P. Goel, An optimal experimental design perspective on radial basis function regression, Technical Report, 2010. http://hdl.handle.net/1850/11694
[14] Ghosal, S.; Ghosh, J. K.; van der Vaart, Aad W., Convergence rates of posterior distributions, The Annals of Statistics, 28, 2, 500-531 (2000) · Zbl 1105.62315
[15] Kleijn, B. J.K.; van der Vaart, A. W., Misspecification in infinite-dimensional Bayesian statistics, The Annals of Statistics, 34, 2, 837-877 (2006) · Zbl 1095.62031
[16] Neyman, J.; Scott, E. L., Consistent estimates based on partially consistent observations, Econometrica, 16, 1-32 (1948) · Zbl 0034.07602
[17] W. Ploberger, P.C.B. Phillips, Best empirical models when the parameter space is infinite dimensional, Technical Report, University of Rochester, Rochester, New York, USA, 2008.
[18] Shen, X.; Wasserman, L., Rates of convergence of posterior distributions, The Annals of Statistics, 29, 3, 687-714 (2001) · Zbl 1041.62022
[19] C. Silva, B. Ribeiro, Combining active learning and relevance vector machines for text classification, in: Proceedings of the IEEE International Conference on Machine Learning Applications, 2007, pp. 130-135.
[20] Silva, C.; Ribeiro, B., RVM ensemble for text classification, International Journal of Computational Intelligence Research, 3, 1, 31-35 (2007)
[21] Silva, C.; Ribeiro, B., Towards expanding relevance vector machines to large scale datasets, International Journal of Neural Systems, 18, 1, 45-58 (2008), WSPC
[22] A. Thayananthan, et al., Multivariate relevance vector machines for tracking, Technical Report, Cambridge University, 2008.
[23] Tipping, M. E., Sparse Bayesian learning and the relevance vector machine, Journal of Machine Learning Research, 1, 211-244 (2001) · Zbl 0997.68109
[24] Tripathi, S.; Govindaraju, R. S., On Selection of kernel parameters in relevance vector machines for hydrologic applications, Stochastic Environmental Research and Risk Assessment, 21, 747-764 (2007) · Zbl 1231.62197
[25] Vapnik, V. N., The Nature of Statistical Learning Theory (1995), Springer-Verlag, Berlin · Zbl 0833.62008
[26] G. Wahba, An introduction to model building with reproducing kernel Hilbert spaces, Technical Report No. 1020, Department of Statistics, University of Wisconsin, 1210 West Dayton St., Madison, WI 53706, USA, April 18, 2000.
[27] Wei, L., Relevance vector machine for automatic detection of clustered microcalcifications, IEEE Transactions on Medical Imaging, 24, 10, 1278-1285 (2005), http://www.ncbi.nlm.nih.gov/pubmed/16229415
[28] R.J. Weiss, D.P.W. Ellis, Estimating single channel source separation masks: relevance vector machine classifiers vs. pitch-based masking, Technical Report, Dept. of Elec. Eng., Columbia University, New York, NY 10027, USA, 2005.
[29] Wipf, D.; Nagarajan, S., Beamforming using the relevance vector machine, (Proceedings of the 24th International Conference on Machine Learning (2007), Corvallis, OR, USA)
[30] W.S. Wong, et al., Using a sparse learning relevance vector machine in facial expression recognition, Technical Report, Man-Machine Interaction Group, Delft University of Technology, The Netherlands, 2005.
[31] L. Yinhai, et al., Establishing glucose- and ABA-regulated transcription networks in Arabidopsis by microarray analysis and promoter classification using a Relevance Vector Machine, Genome Research (2006), online.
[32] Yuan, J., Integrating relevance vector machines and genetic algorithms for optimization of seed-separating process, Engineering Applications of Artificial Intelligence, 20, 970-979 (2007)
[33] Z. Zhang, M.I. Jordan, D. Yeung, Posterior consistency of the Silverman g-prior in Bayesian model choice, Technical Report, Number xx, Department of Electrical Engineering and Computer Science, University of California, Berkeley, California, USA, 2008.