Learning Strategic Behavior in Sequential Decision Tasks
Active from 2009 - 2012
Many routine tasks in the real world can be seen as sequential decision tasks. For instance, navigating a robot through a complex environment, driving a car in congested traffic, and routing packets in a computer network all require making a sequence of decisions that together minimize the time and resources used. It would be desirable to automate these tasks, yet doing so is difficult because the optimal decisions are generally not known. Approximating them by finite-state machines or learning them based on reinforcement leads to reactive behaviors that perform well in the short term, but do not amount to intelligent high-level behavior in the long term. The goal of this project is to develop the technology that makes learning such strategic high-level behavior possible.

The main technical challenge is to devise a method that extends sequential decision learning from reactive to strategic behaviors. Such a method needs to be able to (1) retain information from past states, (2) learn multimodal behavior, (3) choose between the different behaviors based on crucial details, and (4) implement a sequential high-level strategy based on those behaviors. The neuroevolution methods developed in prior work solve the first problem by evolving (through genetic algorithms) recurrent neural networks to represent the behavior. To solve the remaining problems, these methods will be extended with multi-objective optimization, local nodes with cascaded structure, and evolution of modules and their combinations. Preliminary results indicate that this approach is indeed feasible. In this project, the approach will first be characterized fully in supervised learning tasks as well as in synthetic sequential decision tasks. It will then be scaled up to a robotic soccer simulation in OpenNERO and evaluated in two ways: in an objective comparison with other hand-coded and learned soccer teams, and through a subjective analysis (by human evaluators) of the learned strategies. The end result will be a systematic approach to learning strategic high-level behavior in sequential decision tasks.
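As a rough illustration of the multi-objective optimization mentioned above, the sketch below shows Pareto dominance, a common way to compare candidates when several objectives must be balanced at once. It is a generic example, not the project's actual method: the function names `dominates` and `pareto_front` and the two soccer-style objectives are assumptions made for illustration only.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is at least as good as b on every objective
    and strictly better on at least one (all objectives are maximized)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical scores for four evolved controllers:
# (goals scored, negated goals conceded), both to be maximized.
scores = [(3.0, -2.0), (1.0, 0.0), (2.0, -3.0), (3.0, -1.0)]
front = pareto_front(scores)
print(front)
```

In this toy population, the second and fourth controllers remain on the front: one concedes nothing, the other scores the most, and neither is better than the other on both objectives. A multi-objective evolutionary algorithm would typically favor such non-dominated candidates when selecting parents.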

In the long term, the technology should make it possible to build robust sequential decision systems for real-world tasks. It should lead to safer and more efficient vehicle, traffic, and robot control, improved process and manufacturing optimization, and more efficient computer and communication systems. It should also make the next generation of video games possible, with characters that exhibit realistic, strategic behaviors. Such technology should lead to more effective educational and training games in the future.

This research is supported by the National Science Foundation under grant IIS-0915038.

Risto Miikkulainen Professor risto@cs.utexas.edu
Jacob Schrum Ph.D. Student schrum2@cs.utexas.edu
Nate Kohl Ph.D. Student (Alumni) nate@cs.utexas.edu
Vinod Valsalam Ph.D. Student (Alumni) vkv@alumni.utexas.net
Chern Han Yong Masters Student (Alumni) cherny@nus.edu.sg
Padmini Rajagopalan Ph.D. Student padmini@cs.utexas.edu
Aditya Rawal Ph.D. Student aditya@cs.utexas.edu
Bryan Silverthorn Ph.D. Student (Alumni) bsilvert@cs.utexas.edu
Alan J Lockett Ph.D. Student (Alumni) alan.lockett@gmail.com
Constructing Controllers for Physical Multilegged Robots using the ENSO Neuroevolution Approach Vinod K. Valsalam, Jonathan Hiller, Robert MacCurdy, Hod Lipson and Risto Miikkulainen To Appear In Evolutionary Intelligence, 5(1):1--12, 2012. 2012

Measure Theoretic Evolutionary Annealing Alan J. Lockett and Risto Miikkulainen In Proceedings of the 2011 IEEE Congress on Evolutionary Computation, 2011. 2011

Real-Space Evolutionary Annealing Alan J Lockett and Risto Miikkulainen In Proceedings of the 2011 Genetic and Evolutionary Computation Conference (GECCO-2011), 2011... 2011

Constructing Competitive and Cooperative Agent Behavior Using Coevolution Aditya Rawal, Padmini Rajagopalan and Risto Miikkulainen In IEEE Conference on Computational Intelligence and Games (CIG 2010), Copenhagen, Denmark, A... 2010

Evolving Agent Behavior In Multiobjective Domains Using Fitness-Based Shaping Jacob Schrum and Risto Miikkulainen In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2010), 439--446,... 2010

Latent Class Models for Algorithm Portfolio Methods Bryan Silverthorn and Risto Miikkulainen In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010. 2010

Neuroevolution Risto Miikkulainen In Encyclopedia of Machine Learning, New York, 2010. Springer. 2010

Utilizing Symmetry in Evolutionary Design Vinod Valsalam PhD Thesis, Department of Computer Sciences, The University of Texas at Austin, Austin, TX, 2010. Te... 2010

Evolving Neural Networks for Strategic Decision-Making Problems Nate Kohl and Risto Miikkulainen In Neural Networks, Special issue on Goal-Directed Neural Systems, 2009. 2009

Evolving Symmetric and Modular Neural Network Controllers for Multilegged Robots Vinod K. Valsalam and Risto Miikkulainen In Exploring New Horizons in Evolutionary Design of Robots: Workshop at the 2009 IEEE/RSJ Internat... 2009

Evolving Symmetric and Modular Neural Networks for Distributed Control Vinod K. Valsalam and Risto Miikkulainen In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), 731--738, 200... 2009

Learning in Fractured Problems for Constructive Neural Network Algorithms Nate Kohl PhD Thesis, Department of Computer Sciences, University of Texas at Austin, Austin, TX, 2009. 2009

Temporal Convolution Machines for Sequence Learning Alan J Lockett and Risto Miikkulainen Technical Report AI-09-04, Department of Computer Sciences, the University of Texas at Austin, 2009. 2009

ENSO This package contains software implementing the ENSO approach for evolving symmetric modular neural networks. It also in... 2010

NEAT C++ The NEAT package contains source code implementing the NeuroEvolution of Augmenting Topologies method. The source code i... 2010

OpenNERO OpenNERO is a general research and education platform for artificial intelligence. The platform is based on a simulatio... 2010

Sorting Networks This package contains software utilizing an approach based on symmetry and evolution to minimize the number of comparato... 2010

rtNEAT C++ The rtNEAT package contains source code implementing the real-time NeuroEvolution of Augmenting Topologies method. In ad... 2006

SANE-C The SANE-C package contains the source code for the Hierarchical SANE system, written in C. This package has been rewrit... 1997