Competition Between Reinforcement Learning Methods in a Predator-Prey Grid World (2008)
Tabular and linear function approximation variants of Monte Carlo, temporal-difference, and eligibility-trace learning methods are compared in a simple predator-prey grid world from which the prey can escape. The methods are compared both by how well they lead a prey agent to escape randomly moving predators and by how well they compete with each other when one agent controls the prey and each predator is controlled by a different type of agent. Results show that tabular methods, which must use a partial state representation because the full state space is too large, perform surprisingly well against linear function approximation methods, which can use the full state representation and generalize their behavior across states.
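To make the tabular versus linear distinction concrete, here is a minimal sketch of a single TD(lambda) update with accumulating eligibility traces on a linear value estimate. It is not taken from the report: the function names, the step size, discount, and trace-decay values, and the three-state example are all assumptions chosen for illustration. With one-hot features the update behaves like a tabular method, while overlapping features would let an update generalize across states.

```python
def one_hot(index, size):
    """Hypothetical tabular encoding: one feature per state."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def td_lambda_step(w, z, phi, phi_next, reward,
                   alpha=0.1, gamma=0.9, lam=0.8):
    """One TD(lambda) update on a linear value estimate v(s) = w . phi(s).

    w        -- weight vector (list of floats)
    z        -- eligibility trace vector, same length as w
    phi      -- feature vector of the current state
    phi_next -- feature vector of the next state (all zeros if terminal)
    All parameter values are illustrative assumptions.
    """
    v = sum(wi * xi for wi, xi in zip(w, phi))
    v_next = sum(wi * xi for wi, xi in zip(w, phi_next))
    delta = reward + gamma * v_next - v                    # TD error
    z = [gamma * lam * zi + xi for zi, xi in zip(z, phi)]  # decay, then accumulate
    w = [wi + alpha * delta * zi for wi, zi in zip(w, z)]  # credit traced features
    return w, z


# Tabular special case: three states, transition from state 0 to state 1.
w, z = [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]
w, z = td_lambda_step(w, z, one_hot(0, 3), one_hot(1, 3), reward=1.0)
```

Setting `lam=0` reduces this to a one-step temporal-difference update, while values near 1 spread credit over longer stretches of the episode, closer to a Monte Carlo update.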
Citation:
Technical Report AI08-9, The University of Texas at Austin, Department of Computer Sciences, November 2008.
Jacob Schrum Ph.D. Alumni schrum2 [at] southwestern edu