Video Games to Tackle with AI

There is a lot of interesting work on tackling games with AI, from IBM’s early successes with TD-Gammon and Deep Thought/Blue via DeepMind’s Alpha systems to OpenAI’s work on Learning to Play Minecraft with Video PreTraining (VPT) and more (many of which I touch upon in my article on Deep RL). The following article is a fun side-track excursus into video games that would be interesting to tackle with AI, some of which are vastly beyond what is currently feasible.

Turing Test

While Portal 2 is more focused on navigating 3D spaces and Q.U.B.E. is more focused on dexterity, this game puts strong emphasis on actual reasoning in relatively confined, well-discretized spaces with very few dexterity challenges. For instance, here is a video of the last puzzle.

In the last puzzle you have a switch that toggles a door lock (i.e. one door is always open, one closed). You also have a magnet and an energy beam powering it – when you interrupt the beam, the magnet turns off. Furthermore, there is a conveyor belt going in a circle passing this energy beam and a metal box. So you notice that if you throw the box onto the conveyor belt, it will interrupt the energy beam at regular intervals. Now the problem is that the switch for the goal door you want to exit through ultimately is behind the door lock and can only be operated by your robot companion you can switch into – all you need to do is flip this switch and exit through the door.

However, in order to do that you have to combine all these ideas and have one a-ha moment: If you place the magnet over the switch and place a metal object (i.e. box or robot) on it and the other one on the conveyor belt, the magnet will pull up the object on the switch most of the time, but drop it and thus trigger the door lock whenever the energy beam is interrupted by the other object on the conveyor belt. The problem is: If your robot is on the conveyor belt or pulled by the magnet, it cannot simultaneously pass the door lock. So the a-ha moment is that you yourself have to use your body as a tool. You’re not magnetic, so you cannot be the object on the switch. Hence, you throw the metal box on the switch, jump onto the conveyor belt, switch into the robot, pass the door lock when your human body passes the energy beam (and thus triggers the magnet-door-button-cascade) and can flip the switch to then transform back into yourself and exit. I just loved that puzzle in the context of what we’re doing – the reasoning process I went through doing this would be tremendously cool to reproduce in a software system.
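To make the timing in this cascade concrete, here is a minimal sketch of it as a discrete simulation. Everything in it is an assumption for illustration: the belt length, the beam position and the function names are made up, and the door lock is simplified to "released whenever the box rests on the switch".

```python
# Toy model of the final puzzle; all names and numbers are illustrative assumptions.
BELT_LENGTH = 8      # number of discrete positions on the circular conveyor belt
BEAM_POSITION = 0    # belt position at which a rider blocks the energy beam


def door_lock_open(t: int, rider_on_belt: bool) -> bool:
    """Return True if the door lock is released at time step t."""
    rider_position = t % BELT_LENGTH
    beam_blocked = rider_on_belt and rider_position == BEAM_POSITION
    magnet_on = not beam_blocked      # the beam powers the magnet
    box_on_switch = not magnet_on     # the magnet lifts the metal box off the switch
    return box_on_switch              # the dropped box presses the switch


def open_steps(rider_on_belt: bool, horizon: int = 24) -> list[int]:
    """Time steps within the horizon at which the robot could pass the lock."""
    return [t for t in range(horizon) if door_lock_open(t, rider_on_belt)]
```

With nobody on the belt, `open_steps(False)` is empty, so the robot never gets through. With the human body riding the belt, the lock is released once per lap, which is exactly the window the robot has to use.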

Another aspect I like about the game is that in many corridors there are optional side puzzles. When you solve them, you usually gain entrance to an extra room with additional reading about Turing, Searle and (iirc) Dennett. It is also an interesting setting: The puzzles exist to protect humans from AIs – according to the setting, only human creativity can solve this collection of puzzles. Here is a full walkthrough in case someone wants to see everything:

Scanner Sombre

Essentially, this game is on the list because of the Johansson point light figure and its inspiration for cognitive science. The game is primarily about navigating a game world composed of point clouds, but the ending is interesting in the point light figure context.

Gabriel Knight 3

Point & click adventures are not only one of my favorite game genres, but due to their strong puzzle background they are also interesting in terms of AI. While most adventures require the combination of items in the player’s inventory as well as using them with scene objects, there are lots of intriguing puzzle variants and Gabriel Knight 3 utilizes a lot of them. The rough setting is that you are in a hotel with a whole cast of mysterious characters in the French village of Rennes-le-Château. While most of them are on a treasure hunt, it is evident that there has to be much more to it. The game stands in the long Templar conspiracy tradition from Umberto Eco’s excellent Foucault’s Pendulum via the rather grotesque book Holy Blood, Holy Grail and another very famous adventure game series called Broken Sword (which is worth a post in itself) to superficial mainstream literature like Dan Brown. But instead of going into the (historically most likely not too interesting) story around Abbé Saunière, whose sudden wealth sparked conspiracy theories around the French village the game is set in, let me first provide the video and then get into the puzzles.

As you can see from glancing over the video, you analyze maps, gather geophysical subsurface imaging data, investigate crime scenes, sneak up on characters to eavesdrop on conversations, listen with drinking glasses on doors, disguise yourself, illegally enter hotel rooms to investigate them, explore a largely open world, analyze paintings, decipher a cryptic poem (Le Serpent Rouge), play with multiple characters and interpret text, conversations and the actions of other agents in the environment. You also learn about the rich history of the setting, which includes the extensive use of a wiki to explore new topics, among many other elements. While this game is still clearly out of reach for contemporary AI agents, it does provide an interesting example of which kinds of challenges I would ultimately like to solve. While playing arcade games – e.g. with DQNs like the early Atari 2600 work by Mnih et al. – is intriguing [later DeepMind’s Agent57 even exceeded the human baseline on all 57 Atari 2600 benchmark games and Gato, A Generalist Agent which Scott Reed et al. released just yesterday (fun fact: I sat at Scott Reed’s prior desk at UMich for almost 2 years) looks marvelous as well], it does not bridge the gap to higher levels of reasoning, whereas point and click adventures require a fluidly intelligent solution that dynamically combines various forms of skills and spectrums of reasoning, which to me feels like, pun intended, a holy grail of AI. In the AI Painter experiment I gave a brief example of what a very simple first step on tackling adventures might look like against the much simpler Maniac Mansion, but Gabriel Knight 3 might be the ultimate challenge for adventure playing AIs. If I ever find some free time, I would love to play with it – after all, that’s what games are for.


Hyperbolica

I am a huge fan of Chris Franklin’s outstanding channel Errant Signal for his insightful analyses of video games as an art form and his amazing curation leading to the selection of quite a few unusual games. One such unusual game is Hyperbolica, which takes place in non-Euclidean / hyperbolic space, see Poincaré disk model. While slightly disorienting and counterintuitive, humans can quickly adapt to such “distortions” and I would expect a robust AI to be able to do the same trick.
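To give a feeling for the “distortion” an agent would have to cope with, here is a small sketch of the standard distance formula of the Poincaré disk model. It says nothing about how the game itself is implemented; the function name is my own.

```python
import math


def poincare_distance(u: tuple[float, float], v: tuple[float, float]) -> float:
    """Hyperbolic distance between two points strictly inside the unit disk."""
    norm_u_sq = u[0] ** 2 + u[1] ** 2
    norm_v_sq = v[0] ** 2 + v[1] ** 2
    if norm_u_sq >= 1.0 or norm_v_sq >= 1.0:
        raise ValueError("points must lie strictly inside the unit disk")
    diff_sq = (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2
    # d(u, v) = arcosh(1 + 2 |u - v|^2 / ((1 - |u|^2) (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * diff_sq / ((1.0 - norm_u_sq) * (1.0 - norm_v_sq)))
```

The same Euclidean step of 0.1 on screen covers more hyperbolic distance near the rim than near the center, e.g. `poincare_distance((0.8, 0.0), (0.9, 0.0))` is larger than `poincare_distance((0.0, 0.0), (0.1, 0.0))`. An agent navigating such a space cannot rely on pixel distances as a proxy for travel distances.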

Prey (Original)

Speaking of interesting perspectives, the original Prey game from 2006 was a treasure trove of innovative gameplay. It starts with its inversion of gravity, which made you walk on ceilings and walls both during puzzle sections and during combat, continues with its portal mechanics a year before Portal and extends to an interesting section where you find a medium-sized sphere (smaller than the protagonist), walk through a portal and are subsequently shrunk down and teleported onto said sphere seeing the room around you [see video below]. There are also paths which go up the walls and over the ceilings that you can simply keep walking on. In addition, there is a spirit mode that allows you to leave your body and traverse areas otherwise inaccessible. A human exhibits the fluid intelligence necessary to seamlessly adapt to such crazy mechanics and it would be fascinating to see an AI that can do the same. Other games have had partially similar mechanics – Super Mario Galaxy, for instance, has spherical planets, the grotesquely immature Duke Nukem Forever also shrinks down the player and Alien vs Predator allowed the player to walk up walls and on the ceiling when playing as the alien. However, Prey sticks out in its plethora of novel elements and their successful integration into one coherent experience.

Deus Ex

While I do not get to play many games anymore, 0451 games / immersive sims are clearly the genre I align with the most. What I find so intriguing about them is the combination of shooter mechanics, RPG and adventure elements, open world hubs and the capability to approach levels in entirely different ways both in terms of violence vs stealth, but also regarding finding actual paths through the game. Playing Deus Ex for the first time – being given a choice of weapons, reminded of being a police unit thus discouraging blatant violence and being put on a huge map where it is the player’s choice whether to enter through the front door or find an alternative entrance, hack doors and security systems, pick locks, crawl through air vents, lay out traps, hijack security robots, disarm mines etc. – was truly groundbreaking at the time. Recognizing options and adapting to the situation at hand as well as the skills and inventory available would be ultra interesting to explore for an AI. I have worked on theory of mind approaches and goal inference before – inferring the intents of agents in the environment, e.g., in order to sneak around them, would be a start. I am very much looking forward to one day being able to approach such settings. Also note that there is a lot of meta discussion in this game – an AI (or technology-augmented humans) as deus ex machina, the question of what makes humans human, the risks of big data etc. Unlike later instances of the title, the first game raises many questions in generally the right way, asks whether AIs could be useful in politics to derive objective decisions beyond human flaws like a desire for power and it includes lots of semi-philosophical conversations such as this one. While it stays on a mainstream video game level and I should write a proper post on the philosophy behind AI such as the Chinese Room, John Searle, Karl Popper etc., it is a nice addition that certainly supports the setting.

Another intriguing angle is speedruns. While the game can be finished about twice as fast as in this run, this one is well-explained and thus illustrates which kinds of glitches one can leverage in order to beat the game much faster than possible by traditional means. On the one hand, AIs already have a natural tendency to exploit shortcuts, as shown by the CoastRunners RL agent which would run in circles rather than winning the race in order to maximize its points. On the other hand, a lot more consideration goes into planning human speedruns, so there should be a lot of ground left to explore.
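The CoastRunners effect is easy to reproduce on paper. The following toy calculation uses entirely made-up numbers (points per target, targets on the track, steps per loop) and is not the actual game’s scoring; it only shows why a pure score maximizer prefers circling over finishing.

```python
def episode_return(policy: str, horizon: int = 100) -> int:
    """Total score of a toy boat-race episode; all numbers are made up."""
    target_points = 10
    if policy == "finish":
        # Hits 3 targets on the way to the finish line, then the episode ends.
        return 3 * target_points
    if policy == "loop":
        # Circles a cluster of respawning targets, hitting one every 4 steps.
        return (horizon // 4) * target_points
    raise ValueError(f"unknown policy: {policy}")
```

Under these assumptions `episode_return("loop")` beats `episode_return("finish")` by a wide margin, even though only the latter does what the designers intended. A human speedrunner, in contrast, searches for glitches with the intended goal firmly in mind.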


Hitman

For similar reasons to why Deus Ex is such a fascinating game, Hitman is an incredible series, in particular Blood Money and the trilogy. In Hitman the player is, well, a hitman who has to eliminate targets, but rather than doing so with brutality the game is very much a puzzle sandbox where understanding options in the environment as well as using items and disguises to one’s advantage is key. While you could technically shoot a target, in practice you will poison food, sabotage golf balls to explode on impact, drop chandeliers, make bridges collapse etc. And you will do so while being disguised as a janitor, personal bodyguard, waiter, hotelier, business partner etc. An AI that could observe the environment that well, devise a plan and follow through with all the uncertainty of the billions of situations that can arise, would be an amazing piece of software.

Outer Wilds

I’m a bit uncertain whether to include Outer Wilds, but what I find compelling about it in this context is that in it you are thrown into a solar system that blows up and by exploring it repeatedly you build an understanding of the underlying rules and reasons of the environment. Leveraging the new knowledge in each iteration to come up with new ideas in an intrinsically motivated exploration and investigating small phenomena to solve the big mystery seems highly interesting to figure out.

Tactical Shooters

Tactical shooters could be interesting to tackle with AI [if there were no ethical concerns, more below], since they require combining a lot of information – besides the quick reactions needed for any shooter, they require reconnaissance, teamwork including good comms, task planning, navigation, adapting your strategy to your objectives and various other aspects. The video below is Ghost Recon Breakpoint, but there are various alternatives including its predecessor Wildlands as well as GROUND BRANCH, Insurgency Sandstorm, Ready or Not and Zero Hour. Probably the best example I am aware of is Arma 3 due to its realism and plethora of elements. Its developer Bohemia Interactive is also well-known for actual tactical training simulations like VBS4.

These games largely focus on infantry, usually with small commando teams, but can also touch upon vehicles including aircraft and boats. However, it should be noted that there are interesting games and simulators specializing in each of these. For example, regarding aviation, games like Digital Combat Simulator World (DCS World), Microsoft Flight Simulator (which Lockheed’s Prepar3D is based on) or X-Plane are worth a look, and regarding submarines Silent Hunter is a good option. However, since I am less interested in special purpose AIs and more in systems that exhibit AI that is as broad as possible, games like Arma are closer to my heart. That some reviewers on Steam have invested over 30,000 hours into this game shows how much depth it has. From what I hear Arma 4 is now in development, so there will be even more great options in the future. However, ethics is of course a major concern with anything war-related. While these games are interesting from a purely technical perspective, in real life one would have to conduct a very thorough ethical analysis before starting such a project and it might even be safer to categorically ignore them due to their massive inherent risk. Primum non nocere.


FEZ

FEZ is one of the most interesting games I have ever played. It is a jump and run on voxel levels and you can rotate the camera to look at it from four directions (front, back, left, right). What makes this interesting is that while your kinematics are 2D – you walk on lines like you would in Super Mario – the 2D objects like floors or climbable walls change with the view. So you can not only reveal additional ways to navigate through the world, but also use this mechanic in between jumps to access regions you would not otherwise be able to access, which leads to spatial puzzle solving that is unique and a lot of fun. Humans adapt to this surprisingly well and it would be great to see an AI do the same thing.
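The core of the mechanic can be sketched in a few lines: flatten the voxels along the current viewing direction and do all walkability checks on the resulting 2D picture. The axis conventions and both function names below are my own assumptions, not FEZ’s actual implementation.

```python
Voxel = tuple[int, int, int]  # (x, y, z) with y pointing up


def project(voxels: set[Voxel], view: str) -> set[tuple[int, int]]:
    """Flatten voxels to 2D screen cells (horizontal, height) for one of four views."""
    horizontal = {
        "front": lambda x, z: x,
        "back": lambda x, z: -x,
        "left": lambda x, z: z,
        "right": lambda x, z: -z,
    }[view]
    return {(horizontal(x, z), y) for x, y, z in voxels}


def can_step_across(voxels: set[Voxel], view: str) -> bool:
    """True if the projected floor cells at height 0 form one gap-free strip."""
    columns = sorted(u for u, y in project(voxels, view) if y == 0)
    return all(b - a <= 1 for a, b in zip(columns, columns[1:]))
```

Take two floor voxels at `(0, 0, 0)` and `(1, 0, 5)`: in 3D they are five cells apart in depth, but `can_step_across` returns `True` for the front view because depth is ignored, and `False` for the left view where the gap becomes visible. An agent has to plan over which view makes which pair of platforms adjacent.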

Crazy Machines

Crazy Machines is similar to The Incredible Machine game series where you need to complete Rube Goldberg machines in order to solve a particular given task. This combination of symbolic puzzling with understanding the dynamics of an environment w.r.t. the individual game elements (and their physical properties) is challenging to both humans and AI.

Dark Signs

Dark Signs is a rather old hacking game and when I last played it over a decade ago it was quite bug-ridden. However, overall it was a great experience and it is one of the most unique games I have played up to the point that I implemented my own simple version as a kid. What set it apart from most other hacking simulator games for at least a decade is that it has a semi-serious scripting language [maybe Bitburner is now beginning to catch up to it, but I haven’t had time to look into it, yet] and this scripting language can be used in any way you like – it is literally a sandbox or even an immersive sim for hacking. For instance, you can write your own port scanner with it and since it has delays, scanning an entire IP range will take a while. The same goes for password crackers. And just like in real life, it might be a good idea to try a dictionary attack. Some puzzles require more social engineering and you actually try via active or passive reconnaissance to find password hints etc., but I fondly remember one of the more challenging puzzles where you would find a compromised large chunk of code in Dark Signs’ scripting language and had to reverse engineer it to solve your task which required reading the code and drawing your flow diagrams until you had the solution. An AI that could start to perform this kind of reasoning would be quite fascinating.
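For readers who have never written such a tool, this is roughly what a dictionary attack boils down to. The sketch is in Python rather than Dark Signs’ own scripting language, it runs purely against a locally computed hash, and the function names and the choice of SHA-256 are my assumptions for illustration.

```python
import hashlib
from typing import Iterable, Optional


def sha256_hex(text: str) -> str:
    """Hex digest of the UTF-8 encoded text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def dictionary_attack(target_hash: str, wordlist: Iterable[str]) -> Optional[str]:
    """Return the first candidate whose hash matches, or None if none does."""
    for candidate in wordlist:
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None
```

Calling `dictionary_attack(sha256_hex("dragon"), ["letmein", "dragon", "qwerty"])` returns `"dragon"`, while a password that is not in the word list yields `None`. That is why the reconnaissance for password hints mentioned above matters: the attack is only as good as the word list.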

Lost Vikings

Lost Vikings is a jump and run game with three vikings who have unique skills that need to be combined to traverse a level. Again, I like that it combines understanding the dynamics of the environment and parkour, which an RL agent should be easily capable of, with the more symbolic planning required to solve puzzle elements while coordinating your agents, erm, vikings and keeping track of how they need to interact to make progress together, which seems closer to automated planning. [You can actually download this game for free from Blizzard. Not sure about part 2.]


Gunpoint

The final game in this theme and series I would like to mention is Gunpoint, which is described as “a game of creative infiltration” in its launch trailer. Besides an interesting jump and climb mechanic that enables the necessary 2D spatial navigation, you can rewire things in the environment, so motion sensors, doors, electricity etc. work differently. You can use this to your advantage by improvising traps, concealing your presence, gaining access to restricted areas etc. An AI that could observe and understand a building like in Gunpoint to then decide which elements to manipulate to accomplish its mission would be phenomenal.
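One way to think about the rewiring is as a small directed graph from triggers to devices that the player is allowed to edit. The following sketch is my own abstraction with hypothetical device names, not the game’s data model.

```python
def activated_devices(wiring: dict[str, list[str]], trigger: str) -> set[str]:
    """Follow wires from a trigger and collect every device that ends up firing."""
    fired: set[str] = set()
    frontier = [trigger]
    while frontier:
        source = frontier.pop()
        for device in wiring.get(source, []):
            if device not in fired:
                fired.add(device)
                frontier.append(device)  # a fired device may itself trigger others
    return fired
```

With `{"motion_sensor": ["alarm"]}` a guard walking past the sensor raises the alarm; after rewiring to `{"motion_sensor": ["door"]}` the same guard opens a door for you instead. Planning then becomes a search over which edits to this graph make the mission achievable.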

I understand that the AI skills required to tackle many of the games I’ve presented are currently far beyond the SOTA, but I think this shouldn’t prevent us from investigating how to combine different AI paradigms via neuro-symbolic approaches, but also via hybrid AI systems that combine automated planning, reasoning, neural elements, RL, traditional ML etc. to converge towards agents capable of the fluid intelligence required to tackle such complex tasks. First I enjoyed playing these games myself and now I am having additional fun contemplating how to solve them with AI one day. Maybe a sufficiently advanced AI needs to read books and play video games just like we do to challenge itself and achieve fluidity in thought and action.


Superliminal

Superliminal is a game out of CMU that plays with perspective a lot. For instance, it turns the visual perception cue that objects appear smaller when they are further away on its head: If you pick up an object and, while holding it, move it over a background far away, the game will actually scale that object to be as big as it would have to be in the distance in order to appear as having the same size. E.g., if you pick up a tiny chess piece, hold it in front of your eye over a wall of the room far away and let go of it, a massive chess piece will drop down. That alone is interesting, but it really becomes intriguing when you scale something like a puppet house which you can actually enter, since then you will actually shrink or grow your own size as you walk through it. But there are also other vignettes on perspective like projected cubes which are smeared across surfaces and appear misscaled and broken unless you look at them from exactly the right spatial point, at which point you can pick them up as normal 3D geometry, and quite a few additional elements.
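The scaling rule follows from similar triangles: an object keeps its apparent size if the ratio of size to viewing distance stays constant. A minimal sketch of that relation, with a function name and example numbers of my own choosing, not taken from the game:

```python
def rescaled_size(size: float, held_distance: float, drop_distance: float) -> float:
    """Size an object needs at drop_distance to look as big as it did at held_distance."""
    if held_distance <= 0 or drop_distance <= 0:
        raise ValueError("distances must be positive")
    # Constant apparent size means size / distance stays the same.
    return size * drop_distance / held_distance
```

A 0.1 m chess piece held 0.5 m from the eye and released over a wall 20 m away comes out at about 4 m, since 0.1 · 20 / 0.5 = 4. For an AI the interesting part is the inverse problem: choosing where to stand and where to look so that an object ends up with the size a puzzle requires.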

It should also be noted that the game has a predecessor called the Museum of Simulation Technology which is less polished, but equally worth a look:

Manifold Garden

Manifold Garden is an interesting puzzle platformer, since it takes place in a mathematical space with endlessly repeating geometry. This lets you jump into infinity to land on the same geometry further down, so if there is something in front of you, but too far up to jump across, you can just look down, see it repeated below you and jump on it there. Furthermore, it is another game that lets you switch gravity, allowing you to walk on walls and ceilings. Together with the mathematically abstract art style, that very much dissolves our usual notion of what is up or down, making it entirely depend on your frame of reference.
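The “fall down to get up” trick is just modular arithmetic along the gravity axis. Here is a one-dimensional sketch under the assumption that the world repeats with a fixed period; the names and numbers are illustrative only.

```python
def wrap(height: float, period: float) -> float:
    """Map a height into the base cell [0, period) of the repeating world."""
    return height % period


def fall_distance_to(start: float, target: float, period: float) -> float:
    """How far to fall from start until you reach a copy of the target height."""
    return (start - target) % period
```

Standing at height 2 in a world that repeats every 10 units, a ledge at height 9 is 7 units above you and out of jumping range, but `fall_distance_to(2, 9, 10)` is only 3: fall 3 units and `wrap(2 - 3, 10)` puts you at height 9.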

Tool Assisted Speedruns

This might be more a statement at a meta level, but I think TAS (tool-assisted speedruns) raise two interesting points regarding our topic: i) Humans cannot react as quickly as machines, but if you even the playing field by giving them tools which allow snapshotting and thus alleviate the real-time constraints, humans become much more impressive at fast, agility-driven games like jump-and-runs and ii) the exploration and exploitation of glitches we see in speedruns is another interesting aspect underexplored in AI research. While there are famous examples like the speedboat game where the agent learned to drive in circles to maximize its score rather than win the race, I still find that humans are much more strategic in searching for and cataloging glitches to then optimally align them into one coherent speedrun of a game.


Draw Puzzle

While incredibly boring for humans and something I would never play, games like Draw Puzzle deserve mentioning since i) drawing is still an underexplored modality and ii) while finding the things to fill in is trivial for humans, it is a good stepping stone for AI. For instance, filling in parts like a missing shoe or a hole in a plane or donut is reminiscent of masked language training and there are some other fluid intelligence aspects as well – for instance, when you temporarily need to switch to mathematical reasoning. There are also physics drawing games where you draw bridges or objects that fall, but my impression is that there is a lot of untapped potential in this genre and current instantiations still rely on crudely handling position and shape, since semantically interpreting the drawn object is most likely AI-hard itself.

Building Machines, Plants and Environments

I don’t want to spend too much time on this, since it is well known, but the plethora of games in which you build machines or plants or construct things in your environment lend themselves well to AI optimization. For instance, in Instruments of Destruction one can optimize the construction of vehicles based on their destructive potential.

A good example of the creative process free-form construction enables is exhibited in the following video as well as many similar ones. Just given a variable-length plank the player explores the design space it enables and finds that besides building giant constructs like turning his raft into a huge tornado, he can use it to travel over the map and under water without oxygen limitation, avoid environment triggers, become invisible to enemies, glitch through walls and ceilings etc.

In other videos he experiments with accelerating components and plant machinery to massively craft and refine materials. This fluid reuse reminds me of human tool usage. Just like a knife is not only essential for food prep and eating [or maybe self-defense in the worst case], but can be used for batoning (i.e., fire making), digging, tightening screws, carving, opening boxes, cleaning fish and all kinds of other usages easily forgotten in industrialized living, an inconspicuous plank can be much more useful in a video game than merely bridging a gap.

Finally, I should refer to all the games with programmable elements like LogicBots, which comes with lots of sensors, logic gates, functional gates, actuators etc. to enable players to build their robots. I have to admit I haven’t played it, yet, since I have access to real robots, but it is conceptually interesting despite its mediocre reviews. I should also point out that logic gates have found their way into all kinds of construction games including Minecraft, Starbound and Terraria (afair).

Time Travel

Usually, I am not a big fan of time travel, but when done right reasoning across time rather than primarily spatially is a refreshing change of scenery. One of the earliest instances of reasoning over time periods I have experienced was Day of the Tentacle, the successor of Maniac Mansion, in which you play with three different characters in the same house in the present, past and future due to an experiment gone wrong. Since it is a point & click adventure there are lots of puzzles which incorporate the element. One character is stuck in a tree in the future, so when you get someone in the past (well, not “someone” – George Washington, but that is another story) to chop down that tree, it will vanish in the future thus freeing your other character. Or when you get a bottle of wine included in a time capsule in the past, you obtain vinegar when you open that capsule in the future. And similarly when you mess with US history in major ways like changing the flag design and signing into law that every household is required to own a vacuum, you can solve puzzles in the future by having a tentacle-shaped flag and a vacuum at your disposal, just to name a few examples. You also freeze a hamster for the future, send a contract via pony express in the past and dry a pullover for a very long time until it has shrunk to fit the unfrozen hamster in the future, among others. In short: You constantly need to consider not only what you can do right now, but how you can impact the future, and since you can send items between time zones the puzzles become quite interesting.

Another notable use of time travel as a puzzle element came out of the Portal community which as part of the Workshop extensions experimented with portals through time which culminated in the introduction of time portals in Portal Reloaded as explained below:

While this might seem trivial at first glance, it certainly adds to the puzzles and means that not only can you use decay in the future to your advantage, but also that you can leverage future cubes in the present as shown below:
