Cavazos-Cadena, Rolando; Montes-De-Oca, Raúl
Nearly optimal stationary policies in negative dynamic programming. (English) Zbl 0937.90114
Math. Methods Oper. Res. 49, No. 3, 441-456 (1999).

Summary: This work concerns controlled Markov chains with denumerable state space and discrete time parameter. The reward function is assumed to be \(\leq 0\) and the performance of a control policy is measured by the expected total-reward criterion. Within this context, sufficient conditions are given so that the existence of a stationary policy which is \(\varepsilon\)-optimal at every state is guaranteed.

MSC:
90C40 Markov and semi-Markov decision processes
90C39 Dynamic programming

Keywords: Markov decision processes; expected total-reward criterion; negative rewards; uniformly \(\varepsilon\)-optimal stationary policies