One Life Ms. Pac-Man is the classic arcade game modified so that Ms. Pac-Man only has one life. As a result, small mistakes have large negative consequences. This game blends the tasks of dealing with threat and edible ghosts, because it is possible for both types to be present at the same time. Several types of modular neural networks are evolved. Both split sensors (separate sensors for edible and threat ghosts) and conflict sensors (generic ghost sensors plus extra sensors indicating whether each ghost is edible) are evaluated.
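The difference between the two sensor schemes can be illustrated with a minimal sketch. Everything here is hypothetical: the `Ghost` record, the `MAX_DIST` placeholder value, and the use of a single distance reading per ghost are assumptions made for illustration, not the sensor set actually used in the experiments.

```python
from dataclasses import dataclass
from typing import List

MAX_DIST = 200.0  # hypothetical placeholder reading for "no ghost of this type"


@dataclass
class Ghost:
    distance: float  # hypothetical: distance from Ms. Pac-Man to this ghost
    edible: bool     # True if the ghost can currently be eaten


def conflict_sensors(ghosts: List[Ghost]) -> List[float]:
    """Generic ghost sensors plus one extra flag per ghost saying whether it is edible."""
    readings = []
    for g in ghosts:
        readings.append(g.distance)
        readings.append(1.0 if g.edible else 0.0)
    return readings


def split_sensors(ghosts: List[Ghost]) -> List[float]:
    """Separate sensor banks: one only sees threat ghosts, the other only edible ghosts."""
    threat = [g.distance if not g.edible else MAX_DIST for g in ghosts]
    edible = [g.distance if g.edible else MAX_DIST for g in ghosts]
    return threat + edible
```

With conflict sensors the network has to combine a distance with an edibility flag to interpret a ghost, whereas with split sensors that distinction is already made by which input carries the reading.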
Conflict: Two Modules: Threat/Edible Split
A neural network with two modules can still learn, through evolution, to approximate a threat/edible task division, even though it is possible for ghosts of both types to be present at the same time. Whenever Ms. Pac-Man leaves green trails behind, she is using a module that has adapted to eat edible ghosts. However, this module typically gets switched off after three ghosts have been eaten, or if threat ghosts get really close while the remaining edible ghosts are far away.
Conflict: Two Modules: Luring/Surrounded Module
Learning a threat/edible split in the full game results in decent behavior, but the highest scores are earned by exhibiting a luring behavior: one module (light blue trails) leads ghosts to the power pill so that Ms. Pac-Man can eat it and quickly catch the now vulnerable ghosts. Precise timing of power pill eating has a big impact on how many ghosts can possibly be eaten, which is why luring leads to higher scores than a threat/edible split.
Conflict: Module Mutation Duplicate: Luring/Surrounded Module
Module Mutation Duplicate, which creates copies of existing modules, is also able to discover a luring module in Ms. Pac-Man. All networks that discover this crucial module end up using it both for luring and for escaping dangerous situations when surrounded. This particular network tends to use this module (light blue trails) more for escaping when surrounded than for luring. The first use of the module occurs to escape ghosts near the top of the maze, and the two lower power pills get eaten without the module being used at all. However, this module is used to lure ghosts towards the two upper power pills, and the score achieved by the agent is good.
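As a rough sketch of what creating a copy of an existing module could look like, the function below appends a deep copy of one randomly chosen module to a list of modules. The representation of a module as a plain dictionary of weights, the function name `mm_duplicate`, and the random choice of which module to copy are all assumptions for illustration; the actual network encoding is not described here.

```python
import copy
import random
from typing import Any, Dict, List

Module = Dict[str, Any]  # hypothetical stand-in for one output module's parameters


def mm_duplicate(modules: List[Module], rng: random.Random) -> List[Module]:
    """Return a new module list with a copy of one existing module appended.

    The input list is left unchanged. The copy is independent of its source,
    so later mutations to one do not affect the other.
    """
    source_index = rng.randrange(len(modules))  # assumption: source chosen uniformly
    return modules + [copy.deepcopy(modules[source_index])]
```

Under this sketch the new module starts out identical to its source, so any specialization, such as a module devoted to luring, would have to arise from mutations that occur after the copy is made.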
Conflict: Module Mutation Random: Luring/Surrounded Module
MM(R) does not do well on average in the full Ms. Pac-Man game. However, its ability to create new output modules means that it does still sometimes discover a luring module. A luring module does not always emerge, even with the other modular networks, but networks that have a luring module always get the highest scores.
Conflict: Module Mutation Previous: Five Modules
Most MM(P) champions use only two modules, like the other modular methods, but this is the only MM(P) champion that has five modules, four of which are actually used. The red and magenta modules are both used to deal with threats, but it is not clear what determines when one is used rather than the other. However, the green module is clearly used for luring, and the cyan module is clearly used to deal with edible ghosts. The luring behavior leads to high performance, but the confusion between the red and magenta modules makes this network achieve lower scores than other luring champions.
Conflict: Module Mutation All: Threat/Edible With Some Confusion
When all three types of Module Mutation are available, a variety of outcomes are possible. The pure threat/edible split is often discovered, as is the intelligent luring behavior. Sometimes, however, no useful division is discovered and only one module is used, often resulting in lower scores. This particular network exhibits an imperfect, somewhat confused version of threat/edible behavior. The green module is mostly used to deal with threats, and the cyan module is mostly used to deal with edible ghosts, but each of these modules is occasionally used briefly at times inconsistent with such functionality. Despite this inconsistency between modules and modes, the learned behavior performs reasonably well.
Conflict: Three Modules: Threat/Edible Split
A three module network evolves to use two modules: one for dealing with edible ghosts, and one for threat ghosts. Of course, since both can be present at the same time, there are some blended situations in between. In these situations, the network will generally favor the edible module as long as the threat ghosts do not pose too much danger.
Conflict: Two Modules: Luring/Surrounded Module
This two module network learns to dedicate a module to luring/surrounded behavior (green trails). However, some power pills are eaten at inappropriate times, such as right before finishing the level, or while edible ghosts are still present. Even so, the agent achieves a good score overall and beats all four levels.
Conflict: One Module: Indecision
Networks with only one module often have trouble deciding what to do when the ghosts are edible. Because staying alive is most important, they do a good job of avoiding threat ghosts, but they still avoid the ghosts even when the ghosts are edible. Even so, one module networks succeed at eating some ghosts, generally by positioning themselves in such a way that the edible ghost is forced to come to them (first two power pills). Sometimes, the presence of threat ghosts will actually drive Ms. Pac-Man towards nearby edible ghosts that she likely wouldn't otherwise have pursued (second two power pills). So much fine tuning within a single module is difficult to maintain, however, which is perhaps what leads to death before completing the first maze.
Conflict: One Module: Luring With One Module
Although luring behavior is difficult to discover without a separate dedicated module, a few One Module networks exhibit this behavior. This particular network is the best-scoring One Module champion out of 20 runs, and although its average score is very high, the modular approaches that have dedicated luring modules attain higher scores. The behavior in this video is slightly different from the behavior of modular networks with dedicated luring modules. The luring behavior itself is quite good, but the ghost-chasing behavior that follows it could be better. This likely explains the score gap between this network and the best modular champions.
Conflict: Two Module Multitask
This Multitask Learning approach uses two modules: one whenever there are any edible ghosts present (light blue), and another at all other times. This division results in agents that are okay at eating the edible ghosts around them, but such agents cannot learn a luring module, so there is a lower ceiling on the scores they can achieve. This result is the same with both split and conflict sensors.
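The fixed two-way division described above can be sketched as a rule that picks a module, followed by a step that reads only that module's outputs. The flat output layout, the module indices, and the `outputs_per_module` parameter are assumptions for illustration only.

```python
from typing import List, Sequence

EDIBLE_MODULE = 0   # hypothetical index: used whenever any edible ghost is present
DEFAULT_MODULE = 1  # hypothetical index: used at all other times


def select_module_two_way(any_edible_ghost: bool) -> int:
    """Human-specified rule: the choice of module is fixed, not learned."""
    return EDIBLE_MODULE if any_edible_ghost else DEFAULT_MODULE


def module_outputs(all_outputs: Sequence[float], module: int,
                   outputs_per_module: int) -> List[float]:
    """Return the slice of the network's outputs that belongs to one module.

    Assumes (for this sketch) that modules are laid out back to back
    in a single flat output vector.
    """
    start = module * outputs_per_module
    return list(all_outputs[start:start + outputs_per_module])
```

Because the rule depends only on whether edible ghosts are present, no module can be active specifically in the moments before a power pill is eaten, which is consistent with the observation that such agents cannot learn a luring module.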
Split: Three Module Multitask
This Multitask Learning approach uses a human-specified task division consisting of three modules: one for when only threats are present (light blue), one for when only edible ghosts are present (green), and one for when ghosts of both types are in the maze (red). This approach does not do any better than simply having a single module. In fact, having both a red and green module seems to cause confusion. Additionally, a luring module cannot be learned because of the imposed human task division. This result is the same with both split and conflict sensors.
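A minimal sketch of the three-way task division described above follows. The module names and the input format (one edibility flag per ghost currently in the maze) are hypothetical, and the fallback used when no ghosts are in the maze is an assumption, since that case is not specified here.

```python
from typing import Sequence

THREAT_ONLY = "threat_only"  # light blue trails in the video
EDIBLE_ONLY = "edible_only"  # green trails in the video
BOTH = "both"                # red trails in the video


def multitask_module(edible_flags: Sequence[bool]) -> str:
    """Pick a module from the types of ghosts present in the maze.

    edible_flags holds one entry per ghost in the maze: True if edible,
    False if it is a threat.
    """
    any_edible = any(edible_flags)
    any_threat = any(not flag for flag in edible_flags)
    if any_edible and any_threat:
        return BOTH
    if any_edible:
        return EDIBLE_ONLY
    # Assumption: with no ghosts in the maze, fall back to the threat module.
    return THREAT_ONLY
```

The rule consults only which ghost types are present, so it has no way to give luring a module of its own.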
Split: Module Mutation Duplicate: Luring/Surrounded Module
This MM(D) network with split sensors clears all levels. Notice that eating all edible ghosts when threat ghosts appear becomes very difficult. Notice also that, especially in later levels, the luring module is more often used to escape dangerous situations.
Split: Module Mutation Random: One Module
Module Mutation Random does not do particularly well in the full Ms. Pac-Man game. It has trouble discovering useful modules; this network only possesses a single module. Therefore, it cannot learn to lure well. However, the use of split sensors does make it possible for the network to do a decent job of chasing and eating ghosts that happen to be near when power pills are eaten.
Split: Three Modules: Luring/Surrounded Module
Even in the full game, Three Module networks still favor using just two modules. This one learns a luring module, though it is not always used directly before eating a power pill. In fact, it seems to be used more often to escape tricky situations and avoid death. Use of split sensors helps the network eat ghosts whenever they are edible.
Split: Two Modules: One Module
Two Modules results in the best performance in the full game because most networks discover a luring module. However, not all of them succeed in this way. This particular network only uses one module, even though two are available. Because split sensors are used, there is not much benefit to a threat/edible division, so one module is used for everything. However, the resulting behavior is not particularly good.