Multiscale likelihood analysis and complexity penalized estimation. (English) Zbl 1048.62036

Summary: We describe here a framework for a certain class of multiscale likelihood factorizations wherein, in analogy to a wavelet decomposition of an \(L^2\) function, a given likelihood function has an alternative representation as a product of conditional densities reflecting information in both the data and the parameter vector localized in position and scale. The framework is developed as a set of sufficient conditions for the existence of such factorizations, formulated in analogy to those underlying a standard multiresolution analysis for wavelets, and hence can be viewed as a multiresolution analysis for likelihoods. We then consider the use of these factorizations in the task of nonparametric, complexity penalized likelihood estimation.
We study the risk properties of certain thresholding and partitioning estimators, and demonstrate their adaptivity and near-optimality, in a minimax sense over a broad range of function spaces, based on squared Hellinger distance as a loss function. In particular, our results provide an illustration of how properties of classical wavelet-based estimators can be obtained in a single, unified framework that includes models for continuous, count and categorical data types.
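As a concrete illustration of the kind of factorization the summary describes, consider the familiar Poisson case: the joint likelihood of independent Poisson counts can be rewritten as a Poisson term for the grand total times a product of binomial "splitting" terms, one per parent-child pair on a dyadic tree, each localized in position and scale. The sketch below checks this identity numerically. It is only a minimal example under stated assumptions (independent Poisson counts, a recursive dyadic partition, a power-of-two number of bins); the function names are hypothetical and this is not claimed to be the paper's general construction.

```python
import math


def poisson_logpmf(k, lam):
    """Log of the Poisson(lam) probability mass at count k (lam > 0)."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def binom_logpmf(k, n, p):
    """Log of the Binomial(n, p) probability mass at k, for 0 < p < 1."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log1p(-p)


def direct_loglik(counts, rates):
    """Standard log-likelihood: a sum of independent Poisson terms."""
    return sum(poisson_logpmf(x, lam) for x, lam in zip(counts, rates))


def multiscale_loglik(counts, rates):
    """Same log-likelihood, accumulated scale by scale on a dyadic tree.

    Assumes len(counts) is a power of two (an assumption of this sketch).
    At each scale, the left child's count given the parent total is
    Binomial(parent total, left rate / parent rate); the coarsest total
    is Poisson with the summed rate.
    """
    x, lam = list(counts), list(rates)
    total = 0.0
    while len(x) > 1:
        next_x, next_lam = [], []
        for i in range(0, len(x), 2):
            parent_x = x[i] + x[i + 1]
            parent_lam = lam[i] + lam[i + 1]
            total += binom_logpmf(x[i], parent_x, lam[i] / parent_lam)
            next_x.append(parent_x)
            next_lam.append(parent_lam)
        x, lam = next_x, next_lam
    # Coarsest scale: the grand total is Poisson with the total rate.
    total += poisson_logpmf(x[0], lam[0])
    return total


# Illustrative (made-up) counts and rates on 8 bins.
counts = [3, 0, 2, 5, 1, 4, 0, 2]
rates = [1.5, 0.5, 2.0, 4.0, 1.0, 3.0, 0.2, 2.5]
direct = direct_loglik(counts, rates)
multiscale = multiscale_loglik(counts, rates)
```

The two values agree up to floating-point error, which reflects that the reparametrization (total rate plus splitting proportions) is one-to-one. In a complexity penalized setting of the sort the summary mentions, one would presumably act on the individual splitting terms, for example by thresholding or pruning them, rather than on the raw bin rates.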

MSC:

62G05 Nonparametric estimation
62C20 Minimax procedures in statistical decision theory
42C40 Nontrigonometric harmonic analysis involving wavelets and other special systems
60E05 Probability distributions: general theory
