Discretization procedures for adaptive Markov control processes. (English) Zbl 0677.93073

Summary: This paper presents finite-state discretization procedures for discrete-time, infinite-horizon, adaptive Markov control processes which depend on unknown parameters. The discretizations are combined with a consistent parameter estimation scheme to obtain uniform approximations to the optimal value function and asymptotically optimal adaptive control policies. The results include adaptive control systems with unknown disturbance distribution.
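
The general idea in the summary, a finite-state discretization solved with the current parameter estimate plugged in, can be illustrated with a small sketch. It is not the paper's procedure. The dynamics, reward, action set, noise values, discount factor and least-squares estimator below are all hypothetical choices made for illustration, and the estimator ignores clipping at the boundary of the state space, so it is not claimed to be consistent.

```python
import random

ACTIONS = (-0.2, 0.0, 0.2)   # hypothetical finite action set
NOISE = (-0.1, 0.0, 0.1)     # hypothetical disturbance values, equally likely
BETA = 0.9                   # hypothetical discount factor


def reward(x, a):
    # Hypothetical one-stage reward: stay near 0.5 with small control effort.
    return -(x - 0.5) ** 2 - 0.1 * a ** 2


def step(x, a, w, theta):
    # Hypothetical dynamics on the state space [0, 1]; theta is the unknown parameter.
    return min(1.0, max(0.0, theta * x + a + w))


def solve_discretized(theta, n=21, sweeps=50):
    """Value iteration on an n-point grid of [0, 1] for a fixed parameter value."""
    grid = [i / (n - 1) for i in range(n)]
    v = [0.0] * n
    policy = [0.0] * n
    for _ in range(sweeps):
        new_v = []
        for i, x in enumerate(grid):
            q = {}
            for a in ACTIONS:
                # Project each next state onto the nearest grid point.
                nxt = [round(step(x, a, w, theta) * (n - 1)) for w in NOISE]
                q[a] = reward(x, a) + BETA * sum(v[j] for j in nxt) / len(NOISE)
            policy[i] = max(q, key=q.get)
            new_v.append(q[policy[i]])
        v = new_v
    return grid, v, policy


def estimate_theta(history, default=0.5):
    """Least-squares estimate of theta from (x, a, x_next) triples; ignores clipping."""
    num = sum(x * (y - a) for x, a, y in history)
    den = sum(x * x for x, a, y in history)
    return num / den if den > 0 else default


def run_adaptive(true_theta=0.8, steps=100, n=21, seed=0):
    """Re-estimate theta each step, re-solve the discretized model, act greedily."""
    rng = random.Random(seed)
    x, history = 0.5, []
    for _ in range(steps):
        theta_hat = estimate_theta(history)
        _, _, policy = solve_discretized(theta_hat, n=n)
        a = policy[round(x * (n - 1))]
        y = step(x, a, rng.choice(NOISE), true_theta)
        history.append((x, a, y))
        x = y
    return estimate_theta(history)
```

Calling `run_adaptive()` returns the final estimate of the parameter after the simulated run. Refining the grid (larger `n`) and running longer are the two knobs that correspond, loosely, to the discretization and estimation parts of the summary.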

MSC:

93E25 Computational methods in stochastic control (MSC2010)
93C40 Adaptive control/observation systems
60J05 Discrete-time Markov processes on general state spaces
