Hall, Peter; Titterington, D. M. On smoothing sparse multinomial data. (English) Zbl 0628.62039 Aust. J. Stat. 29, 19-37 (1987).

Let \(p(i)\), \(i=1,\dots,m\), be multinomial cell probabilities for a given model, and \(\hat p(i)\) the corresponding unsmoothed cell proportion estimators. A smoothed kernel estimator can be defined by employing information from neighbouring cells. The multinomial is sparse if (in asymptotic arguments) \(\sup_{i} p(i) \leq C\delta\) and \[ \sup_{i} \Bigl| p(i+j) - \sum_{k=0}^{t-1} \binom{j}{k} \Delta^{k} p(i) \Bigr| \leq C\delta\, |j\delta|^{t} \] for some constant \(C\) and all \(j\), where \(\Delta\) is the forward difference operator, \(\Delta p(i) = p(i+1) - p(i)\); these conditions are used to define a smoothness class \(\mathcal{P}_{t}\). Estimation procedures are judged by their mean summed square error, e.g. \(\sum_{i=1}^{m} E\{\hat p(i) - p(i)\}^{2}\), and a procedure is optimal if it minimizes this quantity. If the data are not too sparse (\(n^{1/(2t+1)}\delta\) is bounded away from 0), then the optimal rate of convergence is that achieved by the unsmoothed cell proportions, namely \(O(n^{-1})\). Otherwise, the rate can be improved by smoothing. Explicit results, including formulae for the optimal smoothing parameter, are presented for a kernel-type estimator. The smoothing parameter is estimated from the data by "least-squares cross-validation", in which one sample observation is omitted at a time, and the resulting procedure is shown to be asymptotically optimal.

Reviewer: R. Mentz

Cited in 1 review and 24 documents.

MSC: 62G05 Nonparametric estimation; 62H99 Multivariate analysis

Keywords: multinomial cell probabilities; cell proportion estimators; smoothed kernel estimator; smoothness class; minimizing mean summed square error; optimal rate of convergence; unsmoothed cell proportions; optimal smoothing parameter; least-squares cross-validation; asymptotically optimal
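
The kernel smoothing and leave-one-out cross-validation described in the review can be illustrated with a minimal Python sketch. It is not the authors' estimator: the quadratic kernel, the column normalisation used near the boundary cells, the particular form of the cross-validation criterion (a commonly used least-squares version), the candidate grid, and all function names (`weight_matrix`, `smooth`, `cv_score`, `select_bandwidth`) are assumptions made for illustration.

```python
import math
import random


def weight_matrix(m, h):
    """W[i][j] is the share of cell j's mass moved to cell i (columns sum to 1).

    Assumed kernel: K(u) = max(0, 1 - u^2). For h <= 1 no smoothing occurs.
    """
    K = [[max(0.0, 1.0 - ((i - j) / h) ** 2) for j in range(m)] for i in range(m)]
    col = [sum(K[i][j] for i in range(m)) for j in range(m)]  # each is >= 1
    return [[K[i][j] / col[j] for j in range(m)] for i in range(m)]


def smooth(counts, h):
    """Kernel-smoothed cell probabilities computed from raw cell counts."""
    n, m = sum(counts), len(counts)
    W = weight_matrix(m, h)
    return [sum(W[i][j] * counts[j] for j in range(m)) / n for i in range(m)]


def cv_score(counts, h):
    """Assumed least-squares cross-validation criterion.

    CV(h) = sum_i p~(i)^2 - (2/n) * sum_i N_i * p~_{-i}(i), where p~_{-i}(i)
    is the smoothed estimate at cell i after deleting one observation from it.
    """
    n, m = sum(counts), len(counts)
    W = weight_matrix(m, h)
    p = [sum(W[i][j] * counts[j] for j in range(m)) / n for i in range(m)]
    # Closed form of the leave-one-out estimate; avoids refitting n times.
    loo = [(n * p[i] - W[i][i]) / (n - 1) for i in range(m)]
    return sum(v * v for v in p) - 2.0 / n * sum(counts[i] * loo[i] for i in range(m))


def select_bandwidth(counts, grid):
    """Return the candidate smoothing parameter with the smallest CV score."""
    return min(grid, key=lambda h: cv_score(counts, h))


# Hypothetical sparse example: 60 observations spread over 40 ordered cells.
rng = random.Random(0)
m, n = 40, 60
true_p = [math.exp(-0.5 * ((i - 20) / 6.0) ** 2) for i in range(m)]
total = sum(true_p)
true_p = [v / total for v in true_p]

counts = [0] * m
for cell in rng.choices(range(m), weights=true_p, k=n):
    counts[cell] += 1

grid = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 12.0]
h_cv = select_bandwidth(counts, grid)
p_smooth = smooth(counts, h_cv)
```

With `h = 1.0` the weights collapse to the identity, so `smooth` returns the unsmoothed cell proportions; larger values borrow more strength from neighbouring cells. Normalising the columns rather than the rows keeps the smoothed estimate a probability vector, at the cost of slightly different behaviour near the first and last cells.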