Markov Decision Processes
The Markov Decision Process (MDP) is the formalism underlying modern value-function based reinforcement learning.