×

Convergence of a stochastic approximation version of the EM algorithm. (English) Zbl 0932.62094

Summary: The expectation-maximization (EM) algorithm is a powerful computational technique for locating maxima of functions. It is widely used in statistics for maximum likelihood or maximum a posteriori estimation in incomplete data models. In certain situations, however, this method is not applicable because the expectation step cannot be performed in closed form.
To deal with these problems, a novel method is introduced, the stochastic approximation EM (SAEM), which replaces the expectation step of the EM algorithm by one iteration of a stochastic approximation procedure. The convergence of the SAEM algorithm is established under conditions that are applicable to many practical situations. Moreover, it is proved that, under mild additional conditions, the attractive stationary points of the SAEM algorithm correspond to the local maxima of the function, presented to support our findings.

MSC:

62L20 Stochastic approximation
65C60 Computational problems in statistics (MSC2010)
62F10 Point estimation
62M30 Inference from spatial processes
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Andradottir, S. (1995). A stochastic approximation algorithm with varying bounds. Oper. Res. 1037-1048. JSTOR: · Zbl 0852.90115
[2] Brandiere, O. and Duflo, M. (1996). Les algorithmes stochastiques contournent-ils les pi eges ? Ann. Inst. H. Poincaré 32 395-427. · Zbl 0849.62043
[3] Brocker, T. (1975). Differentiable Germs and Catastrophes. Cambridge Univ. Press.
[4] Celeux, G. and Diebolt, J. (1988). A probabilistic teacher algorithm for iterative maximum likelihood estimation. In Classification and Related Methods of Data Analysis (H. H. Bock, ed.) 617-623. North-Holland, Amsterdam.
[5] Celeux, G. and Diebolt, J. (1992). A stochastic aprroximation type EM algorithm for the mixture problem. Stochastics Stochastics Rep. 41 127-146. · Zbl 0766.62050
[6] Chen, H. F., Guo, L. and Gao, A. J. (1988). Convergence and robustness of the Robbins-Monro algorithm truncated at randomly varying bounds. Stoch. Process Appl. 27 217-231. · Zbl 0632.62082
[7] Chickin, D. O. and Poznyak, A. S. (1984). On the asymptotical properties of the stochastic approximation procedure under dependent noise. Automat. Remote Control 44.
[8] Chickin, D. O. (1988). On the convergence of the stochastic approximation procedure under dependent noise. Automat. Remote Control 48.
[9] de Jong, P. and Shephard, N. (1995). The simulation smoother for time series model. Biometrika 82 339-350. JSTOR: · Zbl 0823.62072
[10] Delyon, B. (1996). General results on stochastic approximation. IEEE Trans. Automat. Control. · Zbl 0867.93075
[11] Delyon, B. and Juditski, A. (1992). Stochastic approximation with averaging of trajectories. Stochastics Stochastics Rep. 39 107-118. · Zbl 0765.93081
[12] Doukhan, P. (1994). Mixing: Properties and Examples. Lecture Notes in Statist. Springer, Berlin. · Zbl 0801.60027
[13] Diebolt, J. and Celeux, G. (1996). Asymptotic properties of a stochastic EM algorithm for estimating mixture proportions. Stochastic Models 9 599-613. · Zbl 0783.62021
[14] Diebolt, J. and Ip, E. H. S. (1996). A stochastic EM algorithm for approximating the maximum likelihood estimate. In Markov Chain Monte Carlo in Practice (W. R. Gilks, S. T. Richardson, D. J. Spiegelhalter, eds.). Chapman and Hall, London. · Zbl 0840.62025
[15] Duflo, M. (1997). Random Iterative Models. Springer, Berlin. · Zbl 0868.62069
[16] Dempster, A., Laird, N. and Rubin, D. (1977). Maximum-likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. Ser. B 39 1-38. JSTOR: · Zbl 0364.62022
[17] Fort, J. C. and Pag es, G. (1996). Convergence of stochastic algorithms: from Kushner-Clark theorem to the Lyapounov functional method. Adv. in Appl. Probab. 28 1072-1094. JSTOR: · Zbl 0881.62085
[18] Geyer, C. J. and Thompson, E. A. (1992). Constrained Monte-Carlo maximum likelihood for dependent data. J. Roy. Statist. Soc. Ser. B 54 657-699. JSTOR:
[19] Geyer, C. J. (1994). On the convergence of Monte-Carlo maximum likelihoods calculations J. Roy. Statist. Soc. Ser. B 56 261-274. JSTOR: · Zbl 0784.62019
[20] Geyer, C. J. (1996). Likelihood inference for spatial point processes. In Current Trends in Stochastic Geometry and its Applications (W. S. Kendall, O. E. Barndorff-Nielsen and M. C. van Lieshout, eds.) Chapman and Hall, London. · Zbl 0809.62089
[21] Hall, P. and Heyde, C. C. (1980). Limit Theory and Its Applications. Academic Press, New York. · Zbl 0462.60045
[22] Horn, R. and Johnson, C. (1985). Matrix Analysis. Cambridge Univ. Press. · Zbl 0576.15001
[23] Ibragimov, I. and Has’minski, R. (1981). Statistical Estimation: Asymptotic Theory. Springer, New York.
[24] Kushner, H. and Clark, D. (1978). Stochastic Approximation for Constrained and Unconstrained Systems. Springer, New York. · Zbl 0381.60004
[25] Kushner, H. and Yin, G. (1997). Stochastic Approximation Algorithms and Applications. Springer, Berlin. · Zbl 0914.60006
[26] Lange, K. (1995). A gradient algorithm locally equivalent to the EM algorithm. J. Roy. Statist. Soc. Ser. B 75 425-437. JSTOR: · Zbl 0813.62021
[27] Liu, C. and Rubin, D. (1994). The ECME algorithm: a simple extension of EM and ECM with faster monotone convergence. Biometrika 81 633-648. JSTOR: · Zbl 0812.62028
[28] Little, R. J. and Rubin, D. B. (1987). Statistical Analysis with Missing Data. Wiley, New York. · Zbl 0665.62004
[29] Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. J. Roy. Statist. Soc. Ser. B 44 226-233. JSTOR: · Zbl 0488.62018
[30] MacDonald, I. L. and Zucchini, W. (1997). Hidden Markov and Other Models for Discrete-valued Time-series. Chapman and Hall, London. · Zbl 0868.60036
[31] Meilijson, I. (1989). A fast improvement to the EM algorithm on its own terms. J. Roy. Statist. Soc. Ser. B 51 127-138. JSTOR: · Zbl 0674.65118
[32] Meng, X. and Rubin, D. (1993). Maximum likelihood via the ECM algorithm: a general framework. Biometrika 80 267-278. JSTOR: · Zbl 0778.62022
[33] Meng, X. (1994). On the rate of convergence of the ECM algorithm. Ann. Statist. 22 326-339. · Zbl 0803.65146
[34] Murray, G. (1977). Discussion on P. Dempster et al., Maximum-likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. 27-28. JSTOR:
[35] Polyak, B. T. (1990). New stochastic approximation type procedures. Automatica i Telemekh. 98-107. (English translation in Automat. Remote Control 51.) · Zbl 0737.93080
[36] Polyak, B. T. and Juditski, A. B. (1992). Acceleration of stochastic approximation by averaging. SIAM J. Control Optim. 30 838-855. · Zbl 0762.62022
[37] Rao, C. (1965). Linear Statistical Inference and Its Applications. Wiley, New York. · Zbl 0137.36203
[38] Shephard, N. (1994). Partial non-Gaussian state space. Biometrika 81 115-131. JSTOR: · Zbl 0796.62079
[39] Tanner, M. (1993). Tools for Statistical Inference: Methods for Exploration of Posterior Distributions and Likelihood Functions. Springer, Berlin. · Zbl 0777.62003
[40] Titterington, D. M., Smith, A. F. M. and Makov, U. E. (1985). Statistical Analysis of Finite Mixture Distribution. Wiley, New York. · Zbl 0646.62013
[41] Wei, G. and Tanner, M. (1990). A Monte-Carlo implementation of the EM algorithm and the Poor’s Man’s data augmentation algorithm. J. Amer. Statist. Assoc. 85 699-704.
[42] Wu, C. (1983). On the convergence property of the EM algorithm. Ann. Statist. 11 95-103. · Zbl 0517.62035
[43] Younes, L. (1989). Parameter estimation for imperfectly observed Gibbsian fields. Probab. Theory and Related Fields 82 625-645. · Zbl 0659.62115
[44] Younes, L. (1992). Parameter estimation for imperfectly observed Gibbs fields and some comments on Chalmond’s EM Gibbsian algorithm. Stochastic Models, Statistical Methods and Algorithms in Image Analysis. Lecture Notes in Statistics 74. Springer, Berlin. · Zbl 0773.62064
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.