Maintenance of a Long Running Distributed Genetic Programming System for Solving Problems Requiring Big Data (2014)
Babak Hodjat, Erik Hemberg, Hormoz Shahrzad, Una-May O’Reilly
We describe a system, ECStar, that outstrips extant genetic programming systems in many aspects of scale. One instance in the domain of financial strategies has executed for extended durations (months to years) on nodes distributed around the globe. ECStar system instances are almost never stopped and restarted, though they are resource elastic. Instead, they are interactively redirected to different parts of the problem space and updated with up-to-date learning. Their complexity makes them non-reproducible (i.e., each run is a single “play of the tape” process), which makes them similar to real biological systems. In this contribution we focus on how ECStar introduces a provocative, important, new paradigm for GP by its sheer size and complexity. ECStar’s scale, volunteer compute nodes and distributed hub-and-spoke design have implications for how a multi-node instance is managed. We describe the setup, deployment, operation and update of an instance of such a large, distributed and long-running system. Moreover, we outline how ECStar is designed to allow manual guidance and re-alignment of its evolutionary search trajectory.
In Riolo, R., Moore, J., Kotanchek, M., editors, Genetic Programming Theory and Practice XI, University of Michigan, Ann Arbor, USA, May 2014. Springer, New York, NY.

Hormoz Shahrzad, Masters Student (hormoz [at] cognizant com)