Generalized Model Learning for Reinforcement Learning on a Humanoid Robot (2010)

Todd Hester and Michael Quinlan and Peter Stone

Reinforcement learning (RL) algorithms have long been promising methods for enabling an autonomous robot to improve its behavior on sequential decision-making tasks. The obvious enticement is that the robot should be able to improve its own behavior without the need for detailed step-by-step programming. However, for RL to reach its full potential, the algorithms must be sample efficient: they must learn competent behavior from very few real-world trials. From this perspective, model-based methods, which use experiential data more efficiently than model-free approaches, are appealing. But they often require exhaustive exploration to learn an accurate model of the domain. In this paper, we present an algorithm, Reinforcement Learning with Decision Trees (RL-DT), that uses decision trees to learn the model by generalizing the relative effect of actions across states. The agent explores the environment until it believes it has a reasonable policy. The combination of the learning approach with the targeted exploration policy enables fast learning of the model. We compare RL-DT against standard model-free and model-based learning methods, and demonstrate its effectiveness on an Aldebaran Nao humanoid robot scoring goals in a penalty kick scenario.
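
The idea of generalizing the relative effect of actions across states can be illustrated with a minimal sketch. This is not the authors' RL-DT implementation: the one-dimensional corridor domain, the function names (`build_tree`, `predict`, `predict_next`), and the small Gini-based tree learner are all assumptions made for illustration. The tree is trained to predict the change in state (next position minus current position) from the current position and action, so a state the agent has never visited can still receive a sensible prediction.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels):
    """Grow a small classification tree (hypothetical helper, not RL-DT itself).

    rows   -- list of feature tuples, here (position, action)
    labels -- list of relative effects, here next_position - position
    """
    if len(set(labels)) == 1:
        return ("leaf", labels[0])
    best = None
    for f in range(len(rows[0])):
        # Candidate thresholds: every distinct value except the largest,
        # so both sides of the split are always non-empty.
        for t in sorted(set(r[f] for r in rows))[:-1]:
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # identical features, conflicting labels: majority vote
        return ("leaf", Counter(labels).most_common(1)[0][0])
    _, f, t, left, right = best
    return ("node", f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left]),
            build_tree([rows[i] for i in right], [labels[i] for i in right]))

def predict(tree, row):
    """Return the predicted relative effect for one (position, action) row."""
    while tree[0] == "node":
        _, f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree[1]

def predict_next(tree, position, action):
    """Predicted next position = current position + predicted relative effect."""
    return position + predict(tree, (position, action))

# Assumed toy domain: corridor with positions 0..9, walls at both ends.
# Action 0 moves left (-1), action 1 moves right (+1); bumping a wall gives 0.
LEFT, RIGHT = 0, 1
visited = [0, 1, 4, 8, 9]  # positions 2, 3, 5, 6, 7 are never visited
rows, labels = [], []
for x in visited:
    for a in (LEFT, RIGHT):
        nxt = min(9, max(0, x + (1 if a == RIGHT else -1)))
        rows.append((x, a))
        labels.append(nxt - x)  # relative effect, not the absolute next state

tree = build_tree(rows, labels)
print(predict_next(tree, 6, RIGHT))  # position 6 was never visited
```

Because the target is the relative effect rather than the absolute next state, the experiences gathered at positions 1, 4, and 8 all carry the same label for a given action, and the tree only needs to separate out the wall cases. A table indexed by state would instead have no entry at all for the unvisited positions.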

Citation:

In International Conference on Robotics and Automation, 2010.

People

Todd Hester		todd [at] cs utexas edu
Michael Quinlan		mquinlan [at] cs utexas edu
Peter Stone		pstone [at] cs utexas edu

Projects

TEXPLORE: Real-Time Sample Efficient Reinforcement Learning

Since 2009

Demos

TEXPLORE: Real-Time Sample Efficient Reinforcement Learning

Todd Hester

2012

Areas of Interest

Robotics