Imprison Ms. Pac-Man is a variant of the classic arcade game in which threat ghosts are confined to the lair whenever edible ghosts are present. This small change transforms a domain with blended tasks (threat and edible ghosts present at the same time) into one with interleaved tasks. Several types of modular neural networks are evolved, and two sensor configurations are evaluated: split sensors (separate sensors for edible and threat ghosts) and conflict sensors (generic ghost sensors plus an additional sensor indicating whether ghosts are currently edible).
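The difference between the two sensor configurations can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Ghost` record, the `NO_GHOST` sentinel, and the choice of a single nearest-distance reading are hypothetical and do not reproduce the sensors actually used in the experiments.

```python
from dataclasses import dataclass
from typing import List

NO_GHOST = -1.0  # hypothetical sentinel meaning "no ghost of this kind sensed"


@dataclass
class Ghost:
    distance: float  # illustrative distance from Ms. Pac-Man
    edible: bool     # True while the ghost can be eaten


def nearest(distances: List[float]) -> float:
    """Smallest distance in the list, or the sentinel if the list is empty."""
    return min(distances, default=NO_GHOST)


def split_sensors(ghosts: List[Ghost]) -> List[float]:
    """Separate readings: [nearest threat ghost, nearest edible ghost]."""
    threat = nearest([g.distance for g in ghosts if not g.edible])
    edible = nearest([g.distance for g in ghosts if g.edible])
    return [threat, edible]


def conflict_sensors(ghosts: List[Ghost]) -> List[float]:
    """Generic reading plus a flag: [nearest ghost of any kind, edible flag]."""
    any_ghost = nearest([g.distance for g in ghosts])
    edible_flag = 1.0 if any(g.edible for g in ghosts) else 0.0
    return [any_ghost, edible_flag]
```

With conflict sensors the same distance input means "flee" or "chase" depending on the flag, so the network has to resolve that conflict itself. With split sensors the two meanings arrive on different inputs.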
Conflict: One Module: Indecisive Behavior
This video shows the behavior of a network with one output module using conflict sensors. The agent has trouble behaving intelligently after eating power pills and does a poor job of eating ghosts. It also does a poor job of avoiding ghosts, which leads to its death at the end of the simulation.
Conflict: Two Modules: Threat/Edible Split
This two-module network discovered a task division in which one module is used when ghosts are edible (green trails), and the other is used otherwise. Evolution discovered this task division without any human intervention. Although the threat/edible split is the obvious one, performance is merely good, not great. The edible module makes Ms. Pac-Man chase ghosts, but she often fails to eat all of them because they are far away at the moment they become edible.
Conflict: Two Modules: Luring/Surrounded Module
This two-module network evolved an unexpected task division. One module activates when Ms. Pac-Man is nearly surrounded by ghosts at a junction (light blue trails). This module leads Ms. Pac-Man either to safety (as at the end of the video) or toward a power pill, which helps her eat power pills at opportune times so that she can eat as many ghosts as possible. All other situations are handled by the other module. This luring behavior leads to very high scores.
Split: Two Modules: Luring/Surrounded Module
Even with split sensors, dedicating one module to luring still results in the best scores. Like the best conflict sensor champions, this champion develops a luring module, but it also has an easier time chasing ghosts with its other module, because threat and edible ghosts are sensed using different sensors.
Conflict: Module Mutation Duplicate: Confusing Two Module Usage
Module Mutation Duplicate (MM(D)) lets evolution discover how many modules a network should have. MM(D) is a structural mutation operator that creates a new output module by copying an old one. Sometimes sensible divisions are learned, but sometimes the module usage pattern is confusing. This network switches between two modules (red and green trails), but it is not always clear why both are needed. The green module is used when ghosts are edible, but in many other cases as well. Regardless of how these modules split up the task, the learned behavior is highly skilled: Ms. Pac-Man clears all four levels with a high score.
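The core idea of MM(D), creating a new output module by copying an existing one, can be sketched as follows. This is only an illustration under stated assumptions: representing a module as a dictionary of weights, choosing the source module at random, and the function name `mm_duplicate` are all hypothetical, and the sketch omits whatever else the actual operator does to the network.

```python
import copy
import random
from typing import Dict, List

# Hypothetical representation: a module is a dict holding its output weights.
Module = Dict[str, List[List[float]]]


def mm_duplicate(modules: List[Module], rng: random.Random) -> List[Module]:
    """Return a new module list with one extra module copied from an old one.

    The input list is left unchanged, and the copy is deep, so later weight
    mutations can make the new module diverge without altering its source.
    """
    source = rng.choice(modules)  # assumption: source module chosen uniformly
    return modules + [copy.deepcopy(source)]
```

Because the new module starts as an exact copy, adding it does not immediately change any module's outputs. Specialization can only come from subsequent mutations.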
Conflict: Module Mutation Duplicate: Confusing Four Module Usage
This is another example of how MM(D) can learn intelligent behavior despite a confusing module usage pattern. This network is one of the few champions that use four modules. The agent switches between the red and magenta modules very rapidly. Another module (green trails) is also used frequently. However, one must be patient to see the fourth module (light blue), which is used only in the third maze. Its purpose is to navigate the most dangerous situations in that maze, which has very long corridors with few safe outlets.
Conflict: Multitask: Threat/Edible Split
Unlike the other task divisions here, Multitask's division is programmed rather than learned. There are two modules: one for edible ghosts (light blue), and another used otherwise. Evolution still has to learn what to do with these modules, and the discovered behavior is pretty good, albeit not as good as the luring behavior learned when there is no human-specified task division.
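A programmed division of this kind amounts to a fixed rule that picks the module from the game state. The sketch below is a hedged illustration only: the module indices, the assumption that each module outputs one preference value per movement direction, and the function names are hypothetical rather than taken from the experiments.

```python
from typing import List

# Hypothetical module indices for the hard-coded division.
EDIBLE_MODULE = 0
THREAT_MODULE = 1


def multitask_select(module_outputs: List[List[float]],
                     ghosts_edible: bool) -> List[float]:
    """Pick a module's outputs by a fixed rule, not a learned one."""
    index = EDIBLE_MODULE if ghosts_edible else THREAT_MODULE
    return module_outputs[index]


def choose_direction(direction_values: List[float]) -> int:
    """Index of the highest-valued direction (first one wins on ties)."""
    return max(range(len(direction_values)), key=lambda i: direction_values[i])
```

For example, with `outputs = [[0.1, 0.9, 0.0, 0.2], [0.7, 0.1, 0.3, 0.0]]`, the rule selects direction 1 while ghosts are edible and direction 0 otherwise. Evolution shapes the values inside each module but cannot change which situations each module handles.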
Split: Three Modules: Luring and Chasing Edible Ghosts in Same Module
This network uses two of its three modules. One is for luring (light blue). However, the same module is also responsible for chasing edible ghosts. This combination of two behavioral modes in one module is made easier by split sensors: threat and edible ghosts are sensed separately, so sensors of only one type affect the module at a given time. Threat ghosts make this module lure, and edible ghosts make it chase.
Split: Multitask: Threat/Edible Split
All multitask networks use the threat (green)/edible (light blue) division, regardless of sensor configuration. This network uses split sensors, making this mandatory module division even more superfluous. The human-specified task division prevents evolution from discovering the better luring module.