Markov decision process
A Markov decision process (MDP) is a discrete-time stochastic control process characterized by a set of states, a set of actions, and transition probability matrices that depend on the action chosen in a given state. A reward function, which assigns a payoff to each state-action pair, completes the model and defines the quantity to be optimized. MDPs are useful for studying a wide range of optimization problems solved via dynamic programming and reinforcement learning.
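As a rough illustration of how such a process can be solved by dynamic programming, the following sketch runs value iteration on a tiny MDP. Everything concrete here is an assumption made for the example and does not come from the text above: the two states, the two actions, the transition matrices `P`, the reward table `R`, the discount factor `GAMMA`, and the function names `q_values` and `value_iteration`.

```python
# Hypothetical two-state, two-action MDP, invented purely for illustration.
# P[a][s][t]: probability of moving from state s to state t under action a.
P = [
    [[0.9, 0.1], [0.4, 0.6]],  # action 0
    [[0.2, 0.8], [0.1, 0.9]],  # action 1
]
# R[a][s]: expected immediate reward for taking action a in state s (assumed).
R = [
    [1.0, 0.0],  # action 0
    [0.0, 2.0],  # action 1
]
GAMMA = 0.9  # discount factor (assumed)


def q_values(P, R, gamma, V):
    """Return Q[s][a] = R[a][s] + gamma * sum_t P[a][s][t] * V[t]."""
    n_actions, n_states = len(P), len(P[0])
    return [
        [
            R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
            for a in range(n_actions)
        ]
        for s in range(n_states)
    ]


def value_iteration(P, R, gamma, tol=1e-8, max_iter=10_000):
    """Repeat the Bellman optimality update until the values stop changing."""
    V = [0.0] * len(P[0])
    for _ in range(max_iter):
        new_V = [max(row) for row in q_values(P, R, gamma, V)]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < tol:
            break
    # Greedy policy with respect to the final value estimate.
    Q = q_values(P, R, gamma, V)
    policy = [max(range(len(P)), key=lambda a: Q[s][a]) for s in range(len(V))]
    return V, policy


V, policy = value_iteration(P, R, GAMMA)
print("values:", V)
print("policy:", policy)
```

The nested-list layout mirrors the description of one transition matrix per action. A library such as NumPy would make the update more compact, but plain lists keep the sketch free of dependencies.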
References
Bellman, R. E. Dynamic Programming. Princeton University Press, Princeton, NJ.
Puterman, M. L. Markov Decision Processes. Wiley, 1994.
External links
MDP Toolbox for Matlab - An excellent tutorial and Matlab toolbox for working with MDPs.