Why AI Researchers Love Playing Pac-Man
AI and Games is a crowdfunded YouTube series on the research and applications of AI within video games. The following article is a more involved transcription of the topics discussed in the video linked to above. If you enjoy this work, please consider supporting my future content over on Patreon.
Artificial Intelligence research has shown a small infatuation with the Pac-Man video game series over the past 15 years. But why specifically Pac-Man? What elements of this game have proven interesting to researchers in this time? Let’s discuss why Pac-Man is so important in the world of game-AI research.
For the sake of completes — and in appreciating there is arguably a generation or two not familiar with the game — Puck-Man was an arcade game launched in 1980 by Namco in Japan and renamed Pac-Man upon being licensed by Midway for an American release. The name change was driven less by a need for brand awareness but rather because the name can easily be de-faced to say… something else.
The original game focuses on the titular character, who must consume as many pills as possible without being caught by one of four antagonists represented by ghosts.
The four ghosts: Inky, Blinky, Pinky and Clyde, all attempt to hunt down the player using slightly different tactics from one another. Each ghost has their own behaviour; a bespoke algorithm that dictates how they attack the player.
Players also have the option to consume one of several power-pills that appear in each map. Power-pills allow for the player to not just eat pills but the enemy ghosts for a short period of time.
While mechanically simple when compared to modern video games, it provides an interesting test-bed for AI algorithms learning to play games. The game world is relatively simple in nature, but complex enough that strategies can be employed for optimal navigation. Furthermore, the varied behaviours of the ghosts reinforces the need for strategy; since their unique albeit predictable behaviours necessitate different tactics. If problem solving can be achieved at this level, then there is opportunity for it to scale up to more complex games.
Research in Pac-Man
While Pac-Man research began in earnest in the early 2000’s, work by John Koza (Koza, 1992) discussed how Pac-Man provides an interesting domain for genetic programming; a form of evolutionary algorithm that learns to generate basic programs. The idea behind Koza’s work and later that of (Rosca, 1996) was to highlight how Pac-Man provides an interesting problem for task-prioritisation. This is quite relevant given that we are often trying to balance the need to consume pills, all the while avoiding ghosts or — when the opportunity presents itself — eating them.
About 10 years later, people became more interested in Pac-Man as a control problem. This research was often with the intent to explore the applications of artificial neural networks for the purposes of creating a generalised action policy: software that would know at any given tick in the game what would be the correct action to take. This policy would be built from playing the game a number of times and training the system to learn what is effective and what is not. Typically these neural networks are trained using an evolutionary algorithm, that finds optimal network configurations by breeding collections of possible solutions and using a ‘survival of the fittest’ approach to cull weak candidates.
(Kalyanpur and Simon, 2001) explored how evolutionary learning algorithms could be used to improve strategies for the ghosts. In time it was evident that the use of crossover and mutation — which are key elements of most evolutionary-based approaches — was effective in improving the overall behaviour. However it’s important to note that they themselves acknowledge their work uses a problem domain similar to Pac-Man and not the actual game.
(Gallagher and Ryan, 2003) uses a slightly more accurate representation of the original game. While the screenshot is shown here, the actual implementation only used one ghost rather than the original four. In this research the team used an incremental learning algorithm that tailored a series of rules for the player that dictate how Pac-Man is controlled using a Finite State Machine (FSM). This proved highly effective in the simplified version they were playing.
The use of artificial neural networks - a data structure that mimics the firing of synapses in the brain — was increasingly popular at the time (and once again in most recent research). Two notable publications on Pac-Man are (Lucas, 2005), which attempted to create a ‘move evaluation function’ for Pac-Man based on data scraped from the screen and processed as features (e.g. distance to closest ghost), while (Gallagher and Ledwich, 2007) attempted to learn from raw, unprocessed information.
It’s notable here that the work by Lucas was in fact done on Ms. Pac-Man rather than Pac-Man. While perhaps not that important to the casual observer, this is an important distinction for AI researchers.
Ms. Pac-Man: A Brief History and Overview
Research in the original Pac-Man game caught the interest of the larger computational and artificial intelligence community. You could argue it was due to the interesting problem that the game presents or that a game as notable as Pac-Man was now considered of interest within the AI research community. While it is now something that appears commonplace, games — more specifically video games — did not receive the same attention within AI research circles as they do today.
As high-quality research in AI applications in video games grew, it wasn’t long before those with a taste for Pac-Man research moved on to looking at Ms. Pac-Man given the challenges it presents — which we are still conducting research for in 2017.
Ms. Pac-Man is odd in that it was originally an unofficial sequel: Midway, who had released the original Pac-Man in the United States, had become frustrated at Namco’s continued failure to release a sequel. While Namco did in time release a sequel dubbed Super Pac-Man, which in many ways is a departure from the original, Midway decided to take matters into their own hands.
Ms. Pac-Man was — for lack of a better term — a mod; originally conceived by the General Computing Company based in Massachusetts. GCC had got themselves into a spot of legal trouble with Midway having previously created a mod kit for popular arcade game Missile Command. As a result, GCC were essentially banned from making further mod kits without the original game’s publisher providing consent. Despite the recent lawsuit hanging over them, they decided to show Midway their Pac-Man mod dubbed Crazy Otto, who liked it so much they bought it from GCC, patched it up to look like a true Pac-Man successor and released it in arcades without Namco’s consent (though this has been disputed).
Note: For our younger audience, mod kits in the 1980s were not simply software we could use to access and modify parts of an original game. These were actual hardware: printed circuit boards (PCBs) that could either be added next to the existing game in the arcade unit, or replace it entirely. While nowhere near as common nowadays due to the rise of home console gaming, there are many enthusiasts who still use and trade PCBs fitted for arcade gaming.
Ms. Pac-Man looks very similar to the original, albeit with the somewhat stereotypical bow on Ms. Pac-Man’s hair/head(?) and a couple of minor graphical changes. However the sequel also received some small changes to gameplay that have a significant impact.
One of the most significant changes is that the game now has four different maps. In addition the placement of fruit is more dynamic and they move around the maze. Lastly, a small change is made to the ghost behaviour such that, periodically, the ghosts will commit a random move. Otherwise, they will continue to exhibit their prescribed behaviour from the original game.
Each of these changes has a significant impact on both how humans and AI subsequently approach the problem.
The Impact of New Mazes
Changes made to the maps do not have a significant impact upon AI approaches. For many of the approaches discussed earlier, it is simply another configuration of the topography used to model the maze. Or if the agent is using more egocentric models for input (i.e. relative to the Pac-Man) then these is not really considered given the input is contextual. This is only an issue should the agent’s design require some form or pre-processing or expert rules that are based explicitly upon the configuration of the map.
With respect to a human, this is also not a huge task. The only real issue is that a human would have become accustom to playing on a given map; devising strategies that utilise parts of the map to good effect. However, all they need is practice on the new maps. In time, new strategies can be formulated.
The Impact of Altered Ghost Behaviour
The small change to ghost behaviour, which results in random moves occurring periodically, is highly significant. This is due to the fact that the deterministic model that the original game has is completely broken.
Previously, each ghost had a prescribed behaviour, you could — with some computational effort — determine the state (and indeed the location) of a ghost at frame n of the game, where n is a certain number of steps ahead of the current state. Any implementation that is reliant upon this knowledge, whether it is using it as part of a heuristic, or an expert knowledge base that gives explicit instructions based on the assumption of their behaviour, is now sub-optimal. If the ghosts can make random decisions without any real warning, then we no longer have the same level of confidence in any of our ghost-prediction strategies.
Similarly, this has an impact on human players. The deterministic behaviour of the ghosts in the original Pac-Man, while complex, can eventually be recognised by a human player. This has been recognised by the leading human players who could factor their behaviour at some level into their decision making process. However, in Ms. Pac-Man, the change to a non-deterministic domain has a similar effect to humans as it does AI: we can no longer say with complete confidence what the ghosts will do given they can make random moves.
Research in Ms. Pac-Man
Evidence that a particular type of problem or methodology has gained some traction in a research community can be found in competitions. If a competition exists that is open to the larger research community it is, in essence, a validation that this problem merits consideration. In the case of Ms. Pac-Man, there have been two competitions.
Ms. Pac-Man AI Competition
The first competition was organised by Simon Lucas — at the time a professor at the University of Essex in the UK — with the first competition held at the Conference on Evolutionary Computation (CEC) in 2007. It was subsequently held at a number of conferences — notably IEEE Conference on Computational Intelligence and Games (CIG) — until 2011.
This competition used a screen capture approach previously mentioned in (Lucas, 2005) that was reliant on an existing version of the game. While the organisers would use Microsoft’s own version from the ‘Revenge of Arcade‘ title, you could also use the likes the webpacman for testing, given it was believed to run the same ROM code. As shown in the screenshot, the code is actually taking information direct from the running game. One benefit of this approach is that it denies the AI developer from accessing the code to potentially ‘cheat’: you can’t access source code and make calls to the likes of the ghosts to determine their current move. Instead the developer is required to work with the exact same information that a human player would. A video of the winner from the IEEE CIG 2009 competition, ICE Pambush 3, can be seen in the video below:
Ms. Pac-Man vs Ghosts Competition
In 2011, Simon Lucas in conjunction with Philipp Rohlfshagen and David Robles created the Ms Pac-Man vs Ghosts competition. In this iteration, the ‘screen scraping’ approach had been replaced with a Java implementation of the original game. This provided an API to develop your own bot for competitions. This iteration ran at four conferences between 2011 and 2012.
One of the major changes to this competition is that you can now also write AI controllers for the ghosts. Competitors submissions were then pitted against one another. The ranking submission for both Ms. Pac-Man and the ghosts from the 2012 league is shown below.
During the earlier competition, there was a continued interest in the use of learning algorithms. This ranged from the of an evolutionary algorithm — which we had seen in earlier research — to evolve code that is the most effective at this problem. This ranged from evolving ‘fuzzy systems’ that use a rules driven by fuzzy logic (yes, that is a real thing) shown in (Handa, 2008), to the use of influence maps in (Wirth, 2008) and a different take that uses ant colony optimisation to create competitive players (Emilio et al, 2010).
This research also stirred interest from researchers in reinforcement learning: a different kind of learning algorithm that learns from the positive and negative impacts of actions.
Note: It has been argued that reinforcement learning algorithms are similar to that of how the human brain operates, in that feedback is sent to the brain upon committing actions. Over time we then associate certain responses with ‘good’ or ‘bad’ outcomes. Placing your hand over a naked flame is quickly associated as bad given that it hurts!
Simon Lucas and Peter Burrow took to the competition framework as means to assess whether reinforcement learning, specifically an approach called Temporal Difference Learning, would yield stronger returns than evolving neural networks (Burrow and Lucas, 2009). The results appeared to favour the use neural nets over the reinforcement learning approach.
Despite that, one of the major contributions Ms. Pac-Man has generated is research into Monte Carlo methods: an approach where repeated sampling of states and actions allow us to ascertain not only the reward that we will typically attain having made an action, but also the ‘value’ of the state. More specifically, there has been significant exploration of whether Monte-Carlo Tree Search (MCTS); an algorithm that assesses the potential outcomes at a given state by simulating the outcome, could prove successful. MCTS has already proven to be effective in games such as Go! (Chaslot et al, 2008) and Klondike Solitaire (Bjarnason et al. 2009). Naturally — given this is merely an article on the subject and not a literature review — we cannot cover this in immense detail. However, there has been a significant number of papers focussed on this approach. For those interested I would advise you read (Browne, et al. 2012) which gives an extensive overview of the method and it’s applications.
One of the reasons that this algorithm proves so useful is that it attempts to address the issue of whether your actions will prove harmful in the future. Much of the research discussed in this article is very good at dealing with immediate or ‘reflex’ responses. However, few would determine whether actions would hurt you in the long term. This is hard to determine for AI without putting some processing power behind it and even harder when working in a dynamic video game that requires quick responses. MCTS has proven useful since it can simulate whether an action taken on the current frame will be useful 5/10/100/1000 frames in the future and has led to significant improvements in AI behaviour.
While Ms. Pac-Man helped push MCTS research, many resarchers have now moved onto the Physical Travelling Salesman Problem (PTSP), which provides it’s own unique challenges due to the nature of the game environment.
Ms. Pac-Man is still to date an interesting research area given the challenge that it presents. We are still seeing research conducted within the community as we attempt to overcome the challenge that one small change to the game code presented. In addition, we have moved on from simply focussing on representing the player and started to focus on the ghosts as well, lending to the aforementioned Pac-Man vs. Ghosts competition.
While the gaming community at large has more or less forgotten about the series, it has had a significant impact on the AI research community. While the interest in Pac-Man and Ms. Pac-Man is beginning to dissipate, it has encouraged research that has provided significant contribution to artificial and computational intelligence in general.
http://www.pacman-vs-ghosts.net/ — The homepage of the competition where you can download the software kit and try it out yourself.
http://pacman.shaunew.com/ — An unofficial remake that is inspired by the aforementioned Pac-Man dossier by Jamey Pittman.
References & Related Reading
(Bjarnason, R., Fern, A., & Tadepalli, P. 2009). Lower Bounding Klondike Solitaire with Monte-Carlo Planning. In Proceedings of the International Conference on Automated Planning and Scheduling, 2009.
(Browne, C., Powley, E., Whitehouse, D., Lucas, S.M., Cowling, P., Rohlfshagen, P., Tavener, S., Perez , D., Samothrakis, S. and Colton, S., 2012) A Survey of Monte Carlo Tree Search Methods, IEEE Transactions on Computational Intelligence and AI in Games (2012), pages: 1–43.
(Burrow, P. and Lucas, S.M., 2009) Evolution versus Temporal Difference Learning for Learning to Play Ms Pac-Man, Proceedings of the 2009 IEEE Symposium on Computational Intelligence and Games.
(Emilio, M., Moises, M., Gustavo, R. and Yago, S., 2010) Pac-mAnt: Optimization Based on Ant Colonies Applied to Developing an Agent for Ms. Pac-Man. Proceedings of the 2010 IEEE Symposium on Computational Intelligence and Games.
(Gallagher, M. and Ledwich, M., 2007) Evolving Pac-Man Players: What Can We Learn From Raw Input? Proceedings of the 2007 IEEE symposium on Computational Intelligence and Games.
(Gallagher, M. and Ryan., A., 2003) Learning to Play Pac-Man: An Evolutionary, Rule-based Approach. Proceedings of the 2003 Congress on Evolutionary Computation (CEC).
(Chaslot, G. M. B., Winands, M. H., & van Den Herik, H. J. 2008). Parallel monte-carlo tree search. In Computers and Games (pp. 60–71). Springer Berlin Heidelberg.
(Handa, H.) Evolutionary Fuzzy Systems for Generating Better Ms. PacMan Players. Proceedings of the IEEE World Congress on Computational Intelligence.
(Kalyanpur, A. and Simon, M., 2001) Pacman using genetic algorithms and neural networks.
(Koza, J., 1992) Genetic Programming: On the Programming of Computers by Means of Natural Selection, MIT Press.
(Lucas, S.M.,2005) Evolving a Neural Network Location Evaluator to Play Ms. Pac-Man, Proceedings of the 2005 IEEE Symposium on Computational Intelligence and Games.
(Pittman, J., 2011) The Pac-Man Dossier. Retrieved from: http://home.comcast.net/~jpittman2/pacman/pacmandossier.html
(Rosca, J., 1996) Generality Versus Size in Genetic Programming. Proceedings of the Genetic Programming Conference 1996 (GP’96).
(Wirth, N., 2008) An influence map model for playing Ms. Pac-Man. Proceedings of the 2008 Computational Intelligence and Games Symposium
Originally published at aiandgames.com on February 10, 2014 — updated to include more contemporary Pac-Man research references.