Learning Exploration Strategies in Model-Based Reinforcement Learning (2013)
Todd Hester, Manuel Lopes, and Peter Stone
Reinforcement learning (RL) is a paradigm for learning sequential decision-making tasks. Typically, however, the user must hand-tune exploration parameters for each domain and algorithm they use. In this work, we present an algorithm called LEO for learning these exploration strategies on-line. The algorithm uses bandit-type algorithms to adaptively select exploration strategies based on the rewards received when following them. We show empirically that this method performs well across a set of five domains. In contrast, for a given algorithm, no single set of parameters is best across all domains. Our results demonstrate that the LEO algorithm successfully learns the best exploration strategies on-line, increasing the received reward over static parameterizations of exploration and reducing the need for hand-tuning exploration parameters.
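The idea in the abstract, treating each candidate exploration strategy as a bandit arm and choosing among them by the reward they produce, can be illustrated with a minimal sketch. This is not the LEO algorithm from the paper: it assumes a standard UCB1 bandit, returns scaled to [0, 1], and three hypothetical strategy names whose simulated mean returns are invented purely for illustration. In a real agent, `run_episode` would run the model-based RL agent for one episode under the chosen strategy and report the return it received.

```python
import math
import random


class UCB1Selector:
    """UCB1 bandit over a fixed set of named exploration strategies.

    Assumption: episode returns are scaled to lie in [0, 1].
    """

    def __init__(self, strategies):
        self.strategies = list(strategies)
        self.counts = {s: 0 for s in self.strategies}
        self.means = {s: 0.0 for s in self.strategies}
        self.total = 0

    def select(self):
        # Try every strategy once before relying on confidence bounds.
        for s in self.strategies:
            if self.counts[s] == 0:
                return s

        def upper_bound(s):
            bonus = math.sqrt(2.0 * math.log(self.total) / self.counts[s])
            return self.means[s] + bonus

        return max(self.strategies, key=upper_bound)

    def update(self, strategy, episode_return):
        # Incremental update of the running mean return for this strategy.
        self.counts[strategy] += 1
        self.total += 1
        n = self.counts[strategy]
        self.means[strategy] += (episode_return - self.means[strategy]) / n


# Hypothetical strategies with made-up mean returns (illustration only).
SIMULATED_MEAN_RETURN = {
    "epsilon_greedy": 0.2,
    "boltzmann": 0.5,
    "optimism_bonus": 0.8,
}


def run_episode(strategy, rng):
    """Stand-in for one episode of an RL agent using `strategy`."""
    noisy = rng.gauss(SIMULATED_MEAN_RETURN[strategy], 0.1)
    return min(1.0, max(0.0, noisy))  # clip to the assumed [0, 1] range


def simulate(num_episodes, seed=0):
    rng = random.Random(seed)
    selector = UCB1Selector(SIMULATED_MEAN_RETURN)
    for _ in range(num_episodes):
        strategy = selector.select()
        selector.update(strategy, run_episode(strategy, rng))
    return selector


if __name__ == "__main__":
    result = simulate(2000)
    for name in result.strategies:
        print(name, result.counts[name], round(result.means[name], 3))
```

In this toy setting the selector concentrates its episodes on whichever strategy yields the highest return, without that choice being fixed in advance. The sketch assumes stationary returns; in an actual learning agent the value of an exploration strategy is likely to change as the agent's model improves, so a bandit method designed for non-stationary rewards may be a better fit.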
Citation:
In The Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), May 2013.
Bibtex:
@inproceedings{AAMAS13-hester,
  title={Learning Exploration Strategies in Model-Based Reinforcement Learning},
  author={Todd Hester and Manuel Lopes and Peter Stone},
  booktitle={The Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},
  month={May},
  url="http://nn.cs.utexas.edu/?AAMAS13-hester",
  year={2013}
}
People
Todd Hester
todd [at] cs utexas edu
Peter Stone
pstone [at] cs utexas edu
Areas of Interest
Reinforcement Learning