Usage of the Markov Decision Process in Games

Learn how the Markov decision process (MDP) is used in games.

MDP in current commercial games

Oh et al. (2014) constructed this algorithm for developing assistants in scenarios where goals and plans may be known a priori, where the number of goals is constrained, and planned protocols describe how to act in each given situation. For instance, they posited that this could be used in emergencies with codified response protocols.

The question is how this technique fares in games where players arrive with many different goals, optimize over a larger set of rewards, and follow plans that may not be known a priori. The approach may be too simplistic for current commercial games because it assumes a small state and action space.

Current games are usually far more complex, with thousands of states and actions or more. We investigated a similar approach on a very simple game, with mixed results: strategies were not predicted with great accuracy, because players vary and behave opportunistically, and that behavior tends to be situational. Even a game with a very simple state space can be intractable. We’ll discuss such an example below.

Applying this approach to games

We used this approach to understand players’ strategies and problem-solving patterns as they played a game called WuzzitTrouble. WuzzitTrouble is a commercial game developed by BrainQuake and released in 2013. It’s designed to teach arithmetic by providing symbolic and narrative-based arithmetic puzzles.

The goal of the game is to free creatures called Wuzzits from traps by collecting all the keys on a level. Players collect keys by turning cogs so that the marker on the large wheel lands on the right positions, as shown in the screenshot. For example, the game starts with the marker at the number 0. It needs to be moved to number 20 and to number 50 to obtain both keys needed to free the trapped Wuzzit. Players accomplish this by rotating the large wheel clockwise or counterclockwise using the gears below it. The distance, or the number of units, that the large wheel moves depends on the gear used. Each small cog can be turned up to five times in a single move, producing up to five steps of the wheel and therefore up to five opportunities to collect a key (or another item). This is a critical gameplay mechanic to learn in order to free the Wuzzit with the smallest number of moves.
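To make the mechanic concrete, here is a minimal sketch of how such a puzzle could be written down as states and actions and searched for the smallest number of moves. It is an illustration under stated assumptions, not the game's implementation or the model used in the study: the wheel size of 65 positions, the cog sizes, and the function name `min_moves` are all hypothetical. Only the start position (0), the key positions (20 and 50), and the limit of five turns per move come from the description above. A state is the marker position plus the set of keys collected so far, and an action is a choice of cog, direction, and number of turns.

```python
from collections import deque


def min_moves(wheel_size, cogs, keys, start=0, max_turns=5):
    """Breadth-first search for the fewest moves that collect every key.

    State: (marker position, frozenset of keys collected so far).
    Action: pick a cog, a direction, and 1..max_turns turns. The marker
    advances by the cog size on each turn and collects any key it lands on.
    All parameter values used below are illustrative assumptions.
    """
    keys = frozenset(keys)
    start_state = (start, frozenset({start} & keys))
    queue = deque([(start_state, 0)])
    seen = {start_state}

    while queue:
        (pos, got), moves = queue.popleft()
        if got == keys:
            return moves
        for cog in cogs:
            for direction in (1, -1):  # clockwise, counterclockwise
                p, g = pos, got
                for _ in range(max_turns):
                    p = (p + direction * cog) % wheel_size
                    if p in keys:
                        g = g | {p}
                    state = (p, g)
                    # Stopping after any number of turns is one move.
                    if state not in seen:
                        seen.add(state)
                        queue.append((state, moves + 1))
    return None  # keys cannot all be reached with these cogs


# Hypothetical level: 65-position wheel, keys at 20 and 50.
print(min_moves(65, cogs=(10,), keys=(20, 50)))  # 1
print(min_moves(65, cogs=(5,), keys=(20, 50)))   # 2
```

With the hypothetical cog of size 10, five clockwise turns pass through 10, 20, 30, 40, and 50, so both keys are collected in a single move. With a cog of size 5, one move covers at most 25 units in either direction, so two moves are needed. Even in this small sketch, the number of states grows with the wheel size times the number of key subsets, and each state has up to (cogs × 2 directions × 5 turns) actions.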
