Evolving Keepaway Players


Complex control tasks can often be solved by decomposing them into hierarchies of manageable subtasks. Such decompositions require designers to decide how much human knowledge should be used to help learn the resulting components. On one hand, encoding human knowledge requires manual effort and may constrain the learner's hypothesis space, even to the extent that the best solutions are eliminated. On the other hand, these same constraints may make learning easier and enable the learner to tackle more complex tasks.

Our work examines the impact of this trade-off in tasks of varying difficulty. Hence, we explore a space laid out by two dimensions: 1) degree of constraint and 2) task difficulty. In particular, we enhance the neuroevolution learning algorithm with three different methods for learning the components that result from a task decomposition. The first method, coevolution, is mostly unconstrained by human knowledge. The second method, layered learning, is highly constrained. The third method, concurrent layered learning, is a novel combination of the first two that attempts to exploit human knowledge while retaining some of coevolution's flexibility.

This page presents the results of applying these three approaches to two versions of a complex task, robot soccer keepaway, that differ in learning difficulty. These results confirm that, given a suitable task decomposition, neuroevolution can master difficult tasks. Furthermore, they demonstrate that the appropriate level of constraint depends critically on the difficulty of the problem.


The following sets of videos show the performance of different methods in two versions of the keepaway task. In keepaway, three agents (called "keepers") attempt to keep the ball away from a fourth agent (called the "taker") for as long as possible. A game of keepaway starts with the keepers in possession of the ball and the taker in the middle of the circular field. The game ends when the taker touches the ball, the ball leaves the field, or time expires.
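The episode rules above can be sketched as a simple termination check. The field radius, time limit, and touch radius below are illustrative assumptions, not the parameters of the actual simulator:

```python
# Hypothetical sketch of the keepaway episode termination rules described
# above; geometry and constants are assumed, not taken from the experiments.
import math

FIELD_RADIUS = 10.0   # assumed radius of the circular field
TIME_LIMIT = 30.0     # assumed episode time limit (seconds)
TOUCH_RADIUS = 0.5    # assumed distance at which the taker "touches" the ball

def episode_over(taker_pos, ball_pos, elapsed):
    """Return True if any of the three end conditions holds."""
    taker_touches = math.dist(taker_pos, ball_pos) < TOUCH_RADIUS
    ball_out = math.hypot(*ball_pos) > FIELD_RADIUS  # field centered at origin
    timed_out = elapsed >= TIME_LIMIT
    return taker_touches or ball_out or timed_out
```

The keepers' fitness in such a setup would simply be the elapsed time when `episode_over` first becomes true.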

Tabula Rasa (blank slate) Learning

The most obvious way to approach the problem of training keepaway players is to simply learn a mapping from inputs to outputs. Unfortunately, current learning algorithms (in this case, neuroevolution) are not strong enough to learn much with this strategy:
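Tabula rasa neuroevolution amounts to evolving the weights of one monolithic controller mapping inputs directly to outputs. The following is a minimal runnable sketch of that idea; the weight count, mutation scheme, and especially the fitness function are simplified stand-ins (real fitness would be average hold time over several keepaway games), not the system from the paper:

```python
# Minimal sketch of tabula rasa neuroevolution: evolve a flat weight vector
# for a single monolithic controller. All constants are illustrative.
import random

N_WEIGHTS = 40     # assumed size of the monolithic controller
POP_SIZE = 20
GENERATIONS = 50

def fitness(weights):
    # Stand-in objective so the loop runs end to end; in keepaway this
    # would be the average time the keepers hold the ball.
    return -sum((w - 0.5) ** 2 for w in weights)

def mutate(weights, sigma=0.1):
    return [w + random.gauss(0, sigma) for w in weights]

random.seed(0)
population = [[random.uniform(-1, 1) for _ in range(N_WEIGHTS)]
              for _ in range(POP_SIZE)]
init_best = max(fitness(p) for p in population)

for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    elite = population[: POP_SIZE // 2]              # keep the better half
    population = elite + [mutate(p) for p in elite]  # refill with mutants

best = max(population, key=fitness)
```

Because the elites are preserved each generation, the best fitness never decreases; the point of the tabula rasa results above is that even so, this monolithic search makes little progress on the full keepaway task.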

Before training After training

Learning with a task decomposition

One way to introduce human expertise into an otherwise unlearnable task is to provide a task decomposition, and have the learning algorithm learn pieces of the task rather than the whole task. Coevolution, Layered Learning, and Concurrent Layered Learning are three methods for learning the components of a task decomposition. The following series of videos depicts the before-and-after performance of these three methods against an opponent moving at full speed.
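The essential difference between the three regimes is when each component's parameters are frozen. The toy sketch below makes that schedule concrete on a two-component problem; "evolving" a component here is just a scalar hill-climbing step standing in for neuroevolution, and all fitness functions are illustrative assumptions:

```python
# Hedged sketch contrasting the three training regimes. The point is WHEN
# each component stops (or never stops) evolving, not the search itself.
import random

def subtask_fitness(x):        # toy fitness for the lower-level skill
    return -(x - 1.0) ** 2

def whole_task_fitness(x, y):  # toy fitness for the full task
    return -(x - 1.0) ** 2 - (y - 2.0) ** 2

def evolve(param, fit, steps=200, sigma=0.2):
    # (1+1) hill climber standing in for neuroevolution
    for _ in range(steps):
        cand = param + random.gauss(0, sigma)
        if fit(cand) > fit(param):
            param = cand
    return param

random.seed(1)

# Layered learning: train the subtask to completion first, then FREEZE it.
x = evolve(0.0, subtask_fitness)
y = evolve(0.0, lambda y: whole_task_fitness(x, y))

# Coevolution: all components evolve on whole-task fitness from the start.
cx, cy = 0.0, 0.0
for _ in range(200):
    cx = evolve(cx, lambda x: whole_task_fitness(x, cy), steps=1)
    cy = evolve(cy, lambda y: whole_task_fitness(cx, y), steps=1)

# Concurrent layered learning: seed with partial subtask training, but keep
# BOTH components evolving on the whole task.
kx = evolve(0.0, subtask_fitness, steps=50)
ky = 0.0
for _ in range(200):
    kx = evolve(kx, lambda x: whole_task_fitness(x, ky), steps=1)
    ky = evolve(ky, lambda y: whole_task_fitness(kx, y), steps=1)
```

Freezing the lower layer (layered learning) is the strongest constraint; coevolution imposes none; concurrent layered learning keeps the layered seeding while leaving every component free to keep adapting.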

Coevolution: Before training After training

Layered Learning: Before training After training

Concurrent Layered Learning: Before training After training

All three methods perform better than the Tabula Rasa approach. Coevolution and Concurrent Layered Learning are particularly effective, compared to Layered Learning, because they do not overly constrain the learning process. In this version of the keepaway task, Coevolution requires the least human effort to implement and provides the best results.

Learning a more difficult task with a task decomposition

The task of playing keepaway can be made more difficult by requiring the learning algorithm not only to learn the pieces of the task, but also to learn how to assemble those pieces into a single strategy. To elucidate the differences between the three methods on this harder task, the following three sets of videos show the performance of each method against an opponent moving at half speed.
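In this harder setting the learner must also evolve a selector that decides which learned sub-behavior to invoke in each state. The sketch below illustrates that extra layer; the behavior names, the two-feature state, and the linear scoring rule are all hypothetical:

```python
# Hypothetical sketch of assembling learned pieces into a strategy: a
# selector scores each sub-behavior from state features and invokes the
# winner. The selector weights are what evolution must discover.
def intercept(state):   # stand-ins for learned sub-behaviors
    return "move toward ball"

def pass_ball(state):
    return "kick to teammate"

def get_open(state):
    return "move to open space"

BEHAVIORS = [intercept, pass_ball, get_open]

def select_and_act(selector_weights, state):
    """Score each behavior linearly from the state and run the best one."""
    scores = [sum(w * f for w, f in zip(ws, state)) for ws in selector_weights]
    best = max(range(len(BEHAVIORS)), key=lambda i: scores[i])
    return BEHAVIORS[best](state)
```

It is this additional selection problem that shifts the balance among the three methods in the results below.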

Coevolution: Before training After training

Layered Learning: Before training After training

Concurrent Layered Learning: Before training After training

In this more difficult version of the task, the relative performance of the three methods changes significantly. Coevolution, which performed so well previously, now has trouble learning even the simplest strategy. Concurrent Layered Learning, which strikes a balance between constraint and flexibility, now yields the best results.


For more information, see our paper Evolving Keepaway Soccer Players through Task Decomposition (2003).