
Hybrid Dirichlet mixture models for functional data. (English) Zbl 1248.62079

Summary: In functional data analysis, curves or surfaces are observed, up to measurement error, at a finite set of locations, for, say, a sample of \(n\) individuals. Often, the curves are homogeneous, except perhaps for individual-specific regions that provide heterogeneous behaviour (e.g., "damaged" areas of irregular shape on an otherwise smooth surface). Motivated by applications with functional data of this nature, we propose a Bayesian mixture model, with the aim of dimension reduction, by representing the sample of \(n\) curves through a smaller set of canonical curves. We propose a novel prior on the space of probability measures for a random curve which extends the popular Dirichlet priors by allowing local clustering: non-homogeneous portions of a curve can be allocated to different clusters, and the \(n\) individual curves can be represented as recombinations (hybrids) of a few canonical curves. More precisely, the proposed prior envisions a conceptual hidden factor with \(k\) levels that acts locally on each curve. We discuss several models incorporating this prior and illustrate its performance with simulated and real data sets. We examine theoretical properties of the proposed finite hybrid Dirichlet mixtures, specifically, their behaviour as the number of mixture components goes to \(\infty\) and their connection with Dirichlet process mixtures.
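The idea of curves as local recombinations of a few canonical curves can be illustrated with a toy simulation. The sketch below is not the prior proposed in the paper: it assumes sinusoidal canonical curves, a fixed partition of the domain into equal blocks, and block labels drawn independently from symmetric Dirichlet weights. The function name `simulate_hybrid_curves` and all of its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_hybrid_curves(n=5, k=3, m=60, n_blocks=4, noise_sd=0.05, seed=0):
    """Toy simulation of n curves built as local hybrids of k canonical curves.

    Assumptions (not from the paper): sinusoidal canonical curves, equal-width
    blocks, and independent block labels given Dirichlet mixture weights.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, m)  # observation locations

    # k canonical curves evaluated on the grid, shape (k, m)
    canonical = np.stack([np.sin(2 * np.pi * (j + 1) * t) for j in range(k)])

    # Mixture weights over the k levels of the hidden factor
    weights = rng.dirichlet(np.ones(k))

    # Block index of each location; the last point is clipped into the final block
    block_of = np.minimum((t * n_blocks).astype(int), n_blocks - 1)

    # Hidden factor: one label per individual and block, expanded to every location
    block_labels = rng.choice(k, size=(n, n_blocks), p=weights)
    labels = block_labels[:, block_of]  # shape (n, m)

    # Each curve follows the canonical curve selected locally, plus measurement error
    clean = canonical[labels, np.arange(m)]  # shape (n, m)
    curves = clean + rng.normal(0.0, noise_sd, size=(n, m))
    return t, canonical, labels, curves


if __name__ == "__main__":
    t, canonical, labels, curves = simulate_hybrid_curves()
    print(curves.shape)
    print(labels[:, ::15])  # one label per block for each individual
```

In this sketch two individuals share a cluster only locally: they may follow the same canonical curve on one block and different ones on another, which is the kind of behaviour a single global cluster label per curve cannot express.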

MSC:

62G99 Nonparametric inference
62F15 Bayesian inference

Software:

fda (R)
