×

A decomposition algorithm for limiting average Markov decision problems. (English) Zbl 1052.90097

Summary: We consider a Markov Decision Process (MDP) under average reward criterion. We investigate the decomposition of such MDP into smaller MDPs by using the strongly connected classes in the associated graph. Then, by introducing the associated levels, we construct an aggregation-disaggregation algorithm for the computation of an optimal strategy for the original MDP.

MSC:

90C40 Markov and semi-Markov decision processes
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Abbad, M.; Daoui, C., Algorithms for aggregated limiting average Markov decision problems, Math. Methods Oper. Res., 53, 3, 451-463 (2001) · Zbl 1036.90073
[2] Avsar, Z. M.; Baykal-Gursoy, M., A decomposition approach for undiscounted two person zero-sum stochastic games, Math. Methods Oper. Res., 49, 3, 483-500 (1999) · Zbl 0942.91012
[3] Blackwell, D., Discrete dynamic programming, Ann. Math. Statist., 33, 719-726 (1962) · Zbl 0133.12906
[4] Derman, C., Finite State Markovian Decision Processes (1970), Academic Press: Academic Press New York · Zbl 0262.90001
[5] Filar, J. A.; Schultz, T. A., Communicating MDPsequivalence and properties, Oper. Res. Lett., 7, 6, 303-307 (1988) · Zbl 0659.90095
[6] M. Gondran, M. Minoux, Graphes et Algorithmes, 2nd Edition, Eyrolles, Paris, 1990.; M. Gondran, M. Minoux, Graphes et Algorithmes, 2nd Edition, Eyrolles, Paris, 1990.
[7] Haviv, M.; Puterman, M. L., An improved algorithm for solving communicating average reward Markov decision, Ann. Oper. Res., 28, 229-242 (1991) · Zbl 0717.90084
[8] Howard, R. A., Dynamic Programming and Markov Processes (1960), Wiley: Wiley New York · Zbl 0091.16001
[9] L.C.M. Kallenberg, Linear Programming and Finite Markovian Control Problems, Mathematical Centre Tracts, Vol. 148, Amsterdam, 1983.; L.C.M. Kallenberg, Linear Programming and Finite Markovian Control Problems, Mathematical Centre Tracts, Vol. 148, Amsterdam, 1983. · Zbl 0503.90061
[10] Ross, K. W.; Varadarajan, R., Multichain Markov decision processes with a sample path constrainta decomposition approach, Math. Oper. Res., 16, 195-207 (1991) · Zbl 0734.90111
[11] White, D. J., Real applications of Markov decision processes, Interfaces, 15, 6, 73-83 (1985)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.