An $\ell_{1}$-oracle inequality for the lasso in finite mixture Gaussian regression models. (English)

ESAIM, Probab. Stat. 17, 650-671 (2013).

Summary: We consider a finite mixture of Gaussian regression models for high-dimensional heterogeneous data where the number of covariates may be much larger than the sample size. We propose to estimate the unknown conditional mixture density by an $\ell_{1}$-penalized maximum likelihood estimator. We provide an $\ell_{1}$-oracle inequality satisfied by this Lasso estimator with the Kullback-Leibler loss. In particular, we give a condition on the regularization parameter of the Lasso under which such an oracle inequality holds. Our aim is twofold: to extend the $\ell_{1}$-oracle inequality established by Massart and Meynet [12] in the homogeneous Gaussian linear regression case, and to present a complementary result to Städler et al. [18], by studying the Lasso for its $\ell_{1}$-regularization properties rather than considering it as a variable selection procedure. Our oracle inequality is deduced from a finite mixture Gaussian regression model selection theorem for $\ell_{1}$-penalized maximum likelihood conditional density estimation, which is inspired by Vapnik’s method of structural risk minimization [23] and by the theory of model selection for maximum likelihood estimators developed by Massart in [11].
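
As a rough sketch of the setting, in notation that is assumed here for illustration and not taken from the paper: with $K$ mixture components, mixing proportions $\pi_{k}$, regression vectors $\mu_{k} \in \mathbb{R}^{p}$, variances $\sigma_{k}^{2}$, and observations $(x_{i}, y_{i})_{1 \le i \le n}$, the conditional mixture density and its $\ell_{1}$-penalized maximum likelihood estimator could be written as

$$
s_{\theta}(y \mid x) = \sum_{k=1}^{K} \frac{\pi_{k}}{\sqrt{2\pi}\,\sigma_{k}} \exp\left( -\frac{(y - \mu_{k}^{\top} x)^{2}}{2\sigma_{k}^{2}} \right),
\qquad
\hat{\theta}(\lambda) \in \operatorname*{arg\,min}_{\theta} \left\{ -\frac{1}{n} \sum_{i=1}^{n} \ln s_{\theta}(y_{i} \mid x_{i}) + \lambda \sum_{k=1}^{K} \|\mu_{k}\|_{1} \right\}.
$$

Penalizing only the regression coefficients is an assumption of this sketch. An $\ell_{1}$-oracle inequality of the kind announced would then typically take the following shape for $\lambda$ above some threshold, where $s_{0}$ denotes the true conditional density and the constant $C$ and the remainder term $R_{n}$ are deliberately left unspecified:

$$
\mathbb{E}\left[ \mathrm{KL}\left(s_{0}, s_{\hat{\theta}(\lambda)}\right) \right] \le C \inf_{\theta} \left\{ \mathrm{KL}(s_{0}, s_{\theta}) + \lambda \sum_{k=1}^{K} \|\mu_{k}\|_{1} \right\} + R_{n}.
$$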