Goodness-of-fit tests via phi-divergences. (English) Zbl 1126.62030

Summary: A unified family of goodness-of-fit tests based on \(\varphi\)-divergences is introduced and studied. The new family of test statistics \(S_n(s)\) includes both the supremum version of the Anderson-Darling statistic and the test statistic of R. H. Berk and D. H. Jones [Z. Wahrscheinlichkeitstheor. Verw. Geb. 47, 47–59 (1979; Zbl 0379.62026)] as special cases (\(s=2\) and \(s=1\), respectively). We also introduce integral versions of the new statistics. We show that the asymptotic null distribution theory of R. H. Berk and D. H. Jones [loc. cit.] and J. A. Wellner and V. Koltchinskii [J. Hoffmann-Jørgensen et al., High Dimensional Probability III, 321–332 (2003)] for the Berk-Jones statistic applies to the whole family of statistics \(S_n(s)\) with \(s \in [-1, 2]\). Regarding power, we study the test statistics under fixed alternatives and give extensions of the “Poisson boundary” phenomena noted by Berk and Jones for their statistic. We also extend the results of D. Donoho and J. Jin [Ann. Stat. 32, No. 3, 962–994 (2004; Zbl 1092.62051)] by showing that all our new tests for \(s \in [-1, 2]\) have the same “optimal detection boundary” for normal shift mixture alternatives as Tukey’s “higher-criticism” statistic and the Berk-Jones statistic.
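
To make the family \(S_n(s)\) concrete, here is a minimal Python sketch. The summary does not state the formulas, so everything in it is an assumption: it takes \(\varphi_s\) to be the usual power-divergence family, sets \(K_s(u,v) = v\,\varphi_s(u/v) + (1-v)\,\varphi_s((1-u)/(1-v))\), and takes \(S_n(s)\) to be the supremum of \(K_s(F_n(x), F_0(x))\), restricted to \(X_{(1)} \le x < X_{(n)}\) when \(s < 1\). The function names `phi_s`, `K_s` and `S_n` are hypothetical, and the input is assumed to be already transformed by the null distribution function, so that the null hypothesis is Uniform(0, 1). Under these assumptions \(K_2(u,v) = (u-v)^2 / (2v(1-v))\), a supremum Anderson-Darling type quantity, and \(K_1\) is the Berk-Jones Kullback-Leibler form.

```python
import math


def phi_s(s, x):
    """Assumed power-divergence generator phi_s(x), for x >= 0."""
    if s == 1:
        # Limit as s -> 1: x*log(x) - x + 1, with value 1 at x = 0.
        return x * math.log(x) - x + 1 if x > 0 else 1.0
    if s == 0:
        # Limit as s -> 0: x - log(x) - 1, infinite at x = 0.
        return x - math.log(x) - 1 if x > 0 else math.inf
    if x == 0 and s < 0:
        return math.inf  # x**s diverges at 0 for negative s
    return (1 - s + s * x - x ** s) / (s * (1 - s))


def K_s(s, u, v):
    """Assumed divergence between Bernoulli(u) and Bernoulli(v), 0 < v < 1."""
    if not 0.0 < v < 1.0:
        raise ValueError("v must lie strictly between 0 and 1")
    return v * phi_s(s, u / v) + (1 - v) * phi_s(s, (1 - u) / (1 - v))


def S_n(sample, s):
    """Supremum-type statistic for a sample assumed Uniform(0, 1) under the null.

    The empirical distribution function is a step function, so the supremum
    is searched at each order statistic using both the left limit (i - 1)/n
    and the value i/n. For s < 1 the end values 0 and 1 are skipped, which
    corresponds to restricting x to [X_(1), X_(n)).
    """
    v = sorted(sample)
    n = len(v)
    best = 0.0
    for i, vi in enumerate(v, start=1):
        for u in ((i - 1) / n, i / n):
            if s < 1 and (u == 0.0 or u == 1.0):
                continue
            best = max(best, K_s(s, u, vi))
    return best


if __name__ == "__main__":
    data = [0.05, 0.21, 0.33, 0.58, 0.62, 0.90]  # illustrative values only
    for s in (-1, 0, 0.5, 1, 2):
        print(s, S_n(data, s))
```

Critical values are not computed here; the sketch only evaluates the statistic for a given \(s\).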

MSC:

62G10 Nonparametric hypothesis testing
62E20 Asymptotic distribution theory in statistics
60F15 Strong limit theorems
62G20 Asymptotic properties of nonparametric inference
62G15 Nonparametric tolerance and confidence regions

References:

[1] Abrahamson, I. G. (1967). Exact Bahadur efficiencies for the Kolmogorov–Smirnov and Kuiper one- and two-sample statistics. Ann. Math. Statist. 38 1475–1490. · Zbl 0157.48003 · doi:10.1214/aoms/1177698702
[2] Ali, S. M. and Silvey, S. D. (1966). A general class of coefficients of divergence of one distribution from another. J. Roy. Statist. Soc. Ser. B 28 131–142. · Zbl 0203.19902
[3] Anderson, T. W. and Darling, D. A. (1952). Asymptotic theory of certain “goodness of fit” criteria based on stochastic processes. Ann. Math. Statist. 23 193–212. · Zbl 0048.11301 · doi:10.1214/aoms/1177729437
[4] Berk, R. H. and Jones, D. H. (1978). Relatively optimal combinations of test statistics. Scand. J. Statist. 5 158–162. · Zbl 0403.62021
[5] Berk, R. H. and Jones, D. H. (1979). Goodness-of-fit test statistics that dominate the Kolmogorov statistics. Z. Wahrsch. Verw. Gebiete 47 47–59. · Zbl 0379.62026 · doi:10.1007/BF00533250
[6] Bickel, P. J. and Rosenblatt, M. (1973). On some global measures of the deviations of density function estimates. Ann. Statist. 1 1071–1095. · Zbl 0275.62033 · doi:10.1214/aos/1176342558
[7] Cai, T. T., Jin, J. and Low, M. G. (2005). Estimation and confidence sets for sparse normal mixtures. Ann. Statist. · Zbl 1360.62113 · doi:10.1214/009053607000000334
[8] Cayón, L., Jin, J. and Treaster, A. (2005). Higher criticism statistic: Detecting and identifying non-Gaussianity in the WMAP first-year data. Monthly Notices Royal Astronomical Soc. 362 826–832.
[9] Chernoff, H. (1952). A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Statist. 23 493–507. · Zbl 0048.11804 · doi:10.1214/aoms/1177729330
[10] Cressie, N. and Read, T. R. C. (1984). Multinomial goodness-of-fit tests. J. Roy. Statist. Soc. Ser. B 46 440–464. · Zbl 0571.62017
[11] Csiszár, I. (1963). Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten. Magyar Tud. Akad. Mat. Kutató Int. Közl. 8 85–108. · Zbl 0124.08703
[12] Csiszár, I. (1967). Information-type measures of difference of probability distributions and indirect observations. Studia Sci. Math. Hungar. 2 299–318. · Zbl 0157.25802
[13] D’Agostino, R. B. and Stephens, M. A. (1986). Goodness-of-fit Techniques. Dekker, New York. · Zbl 0597.62030
[14] Darling, D. A. and Erdős, P. (1956). A limit theorem for the maximum of normalized sums of independent random variables. Duke Math. J. 23 143–155. · Zbl 0070.13806 · doi:10.1215/S0012-7094-56-02313-4
[15] Donoho, D. and Jin, J. (2004). Higher criticism for detecting sparse heterogeneous mixtures. Ann. Statist. 32 962–994. · Zbl 1092.62051 · doi:10.1214/009053604000000265
[16] Durbin, J., Knott, M. and Taylor, C. C. (1975). Components of Cramér–von Mises statistics. II. J. Roy. Statist. Soc. Ser. B 37 216–237. · Zbl 0335.62032
[17] Eicker, F. (1979). The asymptotic distribution of the suprema of the standardized empirical processes. Ann. Statist. 7 116–138. · Zbl 0398.62014 · doi:10.1214/aos/1176344559
[18] Einmahl, J. H. J. and McKeague, I. W. (2003). Empirical likelihood based hypothesis testing. Bernoulli 9 267–290. · Zbl 1015.62048 · doi:10.3150/bj/1068128978
[19] Groeneboom, P. and Shorack, G. R. (1981). Large deviations of goodness of fit statistics and linear combinations of order statistics. Ann. Probab. 9 971–987. · Zbl 0473.60035 · doi:10.1214/aop/1176994268
[20] Ingster, Y. I. (1997). Some problems of hypothesis testing leading to infinitely divisible distributions. Math. Methods Statist. 6 47–69. · Zbl 0878.62005
[21] Ingster, Y. I. (1998). Minimax detection of a signal for \(l^n\)-balls. Math. Methods Statist. 7 401–428. · Zbl 1103.62312
[22] Jaeschke, D. (1979). The asymptotic distribution of the supremum of the standardized empirical distribution function on subintervals. Ann. Statist. 7 108–115. · Zbl 0398.62013 · doi:10.1214/aos/1176344558
[23] Jager, L. (2006). Goodness-of-fit statistics based on phi-divergences. Technical report, Dept. Statistics, Univ. Washington.
[24] Jager, L. and Wellner, J. A. (2004). A new goodness of fit test: The reversed Berk–Jones statistic. Technical report, Dept. Statistics, Univ. Washington. · doi:10.1214/lnms/1196285400
[25] Jager, L. and Wellner, J. A. (2004). On the “Poisson boundaries” of the family of weighted Kolmogorov statistics. In A Festschrift for Herman Rubin (A. DasGupta, ed.) 319–331. IMS, Beachwood, OH. · Zbl 1268.62043 · doi:10.1214/lnms/1196285400
[26] Jager, L. and Wellner, J. A. (2006). Goodness-of-fit tests via phi-divergences. Technical report, Dept. Statistics, Univ. Washington. · Zbl 1126.62030 · doi:10.1214/0009053607000000244
[27] Janssen, A. (2000). Global power functions of goodness of fit tests. Ann. Statist. 28 239–253. · Zbl 1106.62329 · doi:10.1214/aos/1016120371
[28] Jin, J. (2004). Detecting a target in very noisy data from multiple looks. In A Festschrift for Herman Rubin (A. DasGupta, ed.) 255–286. IMS, Beachwood, OH. · Zbl 1268.94013 · doi:10.1214/lnms/1196285396
[29] Kallenberg, O. (1997). Foundations of Modern Probability. Springer, New York. · Zbl 0892.60001 · doi:10.1007/b98838
[30] Khmaladze, E. V. (1998). Goodness of fit tests for “chimeric” alternatives. Statist. Neerlandica 52 90–111. · Zbl 0953.62042 · doi:10.1111/1467-9574.00070
[31] Khmaladze, E. and Shinjikashvili, E. (2001). Calculation of noncrossing probabilities for Poisson processes and its corollaries. Adv. in Appl. Probab. 33 702–716. · Zbl 1158.60365 · doi:10.1239/aap/1005091361
[32] Liese, F. and Vajda, I. (1987). Convex Statistical Distances. Teubner, Leipzig. · Zbl 0656.62004
[33] Meinshausen, N. and Rice, J. (2006). Estimating the proportion of false null hypotheses among a large number of independently tested hypotheses. Ann. Statist. 34 373–393. · Zbl 1091.62059 · doi:10.1214/009053605000000741
[34] Nikitin, Y. (1995). Asymptotic Efficiency of Nonparametric Tests. Cambridge Univ. Press. · Zbl 0879.62045 · doi:10.1017/CBO9780511530081
[35] Noé, M. (1972). The calculation of distributions of two-sided Kolmogorov–Smirnov type statistics. Ann. Math. Statist. 43 58–64. · Zbl 0238.62047 · doi:10.1214/aoms/1177692700
[36] Owen, A. B. (1995). Nonparametric likelihood confidence bands for a distribution function. J. Amer. Statist. Assoc. 90 516–521. · Zbl 0925.62170 · doi:10.2307/2291062
[37] Révész, P. (1982/83). A joint study of the Kolmogorov–Smirnov and the Eicker–Jaeschke statistics. Statist. Decisions 1 57–65. · Zbl 0567.62013
[38] Shorack, G. R. and Wellner, J. A. (1986). Empirical Processes with Applications to Statistics. Wiley, New York. · Zbl 1170.62365
[39] Vajda, I. (1989). Theory of Statistical Inference and Information. Kluwer, Dordrecht. · Zbl 0711.62002
[40] Wellner, J. A. (1977). Distributions related to linear bounds for the empirical distribution function. Ann. Statist. 5 1003–1016. · Zbl 0368.62027 · doi:10.1214/aos/1176343955
[41] Wellner, J. A. (1977). A Glivenko–Cantelli theorem and strong laws of large numbers for functions of order statistics. Ann. Statist. 5 473–480. · Zbl 0365.62045 · doi:10.1214/aos/1176343844
[42] Wellner, J. A. (1978). Limit theorems for the ratio of the empirical distribution function to the true distribution function. Z. Wahrsch. Verw. Gebiete 45 73–88. · Zbl 0382.60031 · doi:10.1007/BF00635964
[43] Wellner, J. A. and Koltchinskii, V. (2003). A note on the asymptotic distribution of Berk–Jones type statistics under the null hypothesis. In High Dimensional Probability III (J. Hoffmann-Jørgensen, M. B. Marcus and J. A. Wellner, eds.) 321–332. Birkhäuser, Basel. · Zbl 1042.62009