A Markov Decision Process (MDP) is a Markov process whose evolution is driven partly by an agent's decisions and partly by stochasticity.

An MDP has:

  • $\mathcal{S}$: set of states; the state space
  • $\mathcal{A}$: set of actions; the action space
  • $P(s' \mid s, a)$: probability that a transition happens from state $s$ to state $s'$ due to action $a$
  • $R(s, a, s')$: immediate reward/payoff/value due to the state transition

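To make the four pieces above concrete, here is a minimal sketch of an MDP as plain Python data structures. The state names, transition probabilities, and reward values are illustrative assumptions, not taken from the text; the point is only to show how $\mathcal{S}$, $\mathcal{A}$, $P$, and $R$ fit together and how a single stochastic transition is sampled.

```python
import random

# State space S and action space A (hypothetical example)
states = ["s0", "s1", "s2"]
actions = ["stay", "move"]

# Transition model P(s' | s, a): for each (s, a), a list of (probability, next_state)
P = {
    ("s0", "stay"): [(1.0, "s0")],
    ("s0", "move"): [(0.8, "s1"), (0.2, "s0")],
    ("s1", "stay"): [(1.0, "s1")],
    ("s1", "move"): [(0.9, "s2"), (0.1, "s1")],
    ("s2", "stay"): [(1.0, "s2")],
    ("s2", "move"): [(1.0, "s2")],
}

# Reward R(s, a, s'): immediate payoff for a state transition; unspecified transitions pay 0
R = {
    ("s1", "move", "s2"): 10.0,
}

def step(state, action):
    """Sample a next state from P(. | state, action) and return (next_state, reward)."""
    outcomes = P[(state, action)]
    probs = [p for p, _ in outcomes]
    next_states = [s for _, s in outcomes]
    next_state = random.choices(next_states, weights=probs, k=1)[0]
    reward = R.get((state, action, next_state), 0.0)
    return next_state, reward

# One decision ("move") followed by a stochastic outcome
print(step("s0", "move"))
```

The split in `step` mirrors the definition: the agent chooses the action (decision), while the next state is drawn from $P(\cdot \mid s, a)$ (stochasticity).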