\input zb-basic \input zb-ioport \iteman{io-port 02159039} \itemau{Cao, Xi-Ren} \itemti{A sensitivity view of Markov decision processes and reinforcement learning.} \itemso{Gong, Weibo (ed.) et al., Modeling, control and optimization of complex systems. In honor of Professor Yu-Chi Ho. Papers of the symposium, Cambridge, MA, June 23--24, 2001. Foreword by Christos G. Cassandras. Boston, MA: Kluwer Academic Publishers (ISBN 1-4020-7208-2/hbk). The Kluwer International Series on Discrete Event Dynamic Systems 14, 261--283 (2003).} \itemab Summary: Perturbation analysis (PA), Markov decision processes (MDPs), and reinforcement learning (RL) share a common goal: to make decisions that improve system performance based on information obtained by analyzing the current system behavior. In this paper, we study the relations among these closely related fields. We show that MDP solutions can be derived naturally from the performance sensitivity analysis provided by PA. The performance potential plays an important role in both PA and MDPs; it also offers a clear intuitive interpretation for many results. Reinforcement learning, $\text{TD}(\lambda)$, neuro-dynamic programming, etc., are efficient ways of estimating the performance potentials and related quantities based on sample paths. This new view of PA, MDPs and RL leads to the gradient-based policy iteration method, which can be applied to some nonstandard optimization problems such as those with correlated actions. Sample path-based approaches are also discussed. \itemrv{~} \itemcc{} \itemut{potentials; Poisson equations; gradient-based policy iteration; $Q$-learning; $\text{TD}(\lambda)$} \itemli{} \end