Evolving Deep LSTM-based Memory networks using an Information Maximization Objective (2016)
Reinforcement Learning agents with memory are constructed in this paper by extending the neuroevolutionary algorithm NEAT to incorporate LSTM cells, i.e., special memory units with gating logic. Initial evaluation on POMDP tasks indicated that memory solutions obtained by evolving LSTMs outperform traditional RNNs. Scaling neuroevolution of LSTMs to deep memory problems is challenging because (1) the fitness landscape is deceptive, and (2) a large number of associated parameters need to be optimized. To overcome these challenges, a new secondary optimization objective is introduced that maximizes the information (Info-max) stored in the LSTM network. The network training is split into two phases. In the first phase (unsupervised phase), independent memory modules are evolved by optimizing for the info-max objective. In the second phase, the networks are trained by optimizing the task fitness. Results on two different memory tasks indicate that neuroevolution can discover powerful LSTM-based memory solutions that outperform traditional RNNs.
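For readers unfamiliar with the "gating logic" mentioned in the abstract, the following is a minimal sketch of one time step of a single scalar LSTM unit. It assumes the common textbook formulation (input, forget, and output gates, no peephole connections), which may differ from the exact cell variant evolved in the paper. The function name `lstm_step`, the weight layout, and the `HOLD` weights are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Logistic function, used to squash gate activations into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """Advance one scalar LSTM unit by a single time step.

    x      : current input
    h_prev : previous output (hidden state)
    c_prev : previous cell (memory) state
    w      : dict mapping a gate name to (input weight, recurrent weight, bias);
             this layout is an assumption made for the sketch
    """
    def pre(gate):
        w_x, w_h, b = w[gate]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("input"))        # how much new content to write
    f = sigmoid(pre("forget"))       # how much old memory to keep
    o = sigmoid(pre("output"))       # how much memory to expose
    g = math.tanh(pre("candidate"))  # proposed new content

    c = f * c_prev + i * g           # updated memory state
    h = o * math.tanh(c)             # gated output
    return h, c


# Hypothetical weights: input gate closed, forget and output gates open,
# so the unit holds its stored value regardless of the input.
HOLD = {
    "input": (0.0, 0.0, -20.0),
    "forget": (0.0, 0.0, 20.0),
    "output": (0.0, 0.0, 20.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = lstm_step(x=3.0, h_prev=0.0, c_prev=0.5, w=HOLD)
```

With the `HOLD` weights the cell state stays very close to its previous value even though the input is nonzero, which is the retention behavior that makes such units useful for memory tasks.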
To Appear In Genetic and Evolutionary Computation Conference (GECCO 2016), Colorado, USA, 2016.

Risto Miikkulainen Faculty risto [at] cs utexas edu
Aditya Rawal Ph.D. Alumni aditya [at] cs utexas edu
Evolving Deep LSTM, Aditya Rawal, 2015