The Open-Source TEXPLORE Code Release for Reinforcement Learning on Robots (2013)
The use of robots in society could be expanded by using reinforcement learning (RL) to allow robots to learn and adapt to new situations on-line. RL is a paradigm for learning sequential decision-making tasks, usually formulated as a Markov Decision Process (MDP). For an RL algorithm to be practical for robotic control tasks, it must learn in very few samples, while continually taking actions in real time. In addition, the algorithm must learn efficiently in the face of noise, sensor/actuator delays, and continuous state features. In this paper, we present the TEXPLORE ROS code release, which contains TEXPLORE, the first algorithm to address all of these challenges together. We demonstrate TEXPLORE learning to control the velocity of an autonomous vehicle in real time. TEXPLORE has been released as an open-source ROS repository, enabling learning on a variety of robot tasks.
In Sven Behnke, Arnoud Visser, Rong Xiong, and Manuela Veloso, editors, RoboCup-2013: Robot Soccer World Cup XVII, 2013. Springer Verlag.

Todd Hester todd [at] cs utexas edu
Peter Stone pstone [at] cs utexas edu