Usage of the Markov Decision Process in Games

Learn how the Markov decision process (MDP) is used in games.

We'll cover the following

MDP in current commercial games
Applying this approach to games

MDP in current commercial games

Oh et al. (2014) designed this algorithm for building assistants in scenarios where goals and plans may be known a priori, the number of goals is limited, and planned protocols describe how to act in each situation. For instance, they suggested that it could be used in emergencies with codified response protocols.

The question is: how does this technique fare in games where players come in with many different goals, optimize for a larger set of rewards, and follow plans that may not be known a priori? The approach may seem too simplistic for current commercial games because it assumes a small state and action space.

Current games are usually far too complex for that assumption, with thousands of states and actions or more. Furthermore, we’ve applied a similar approach to a very simple game, with mixed results. The approach did not predict strategies with great accuracy, because players vary and behave opportunistically in ways that tend to be situational. Even a game with a very simple state space can be intractable. We’ll discuss such an example below.
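To make the tractability concern concrete, here is a minimal sketch of how quickly states and candidate plans multiply. The function names and all numbers are hypothetical and chosen only for illustration; they don't describe any particular game.

```python
def count_states(num_positions, num_collectibles):
    """Every position combined with every subset of collectibles already picked up."""
    return num_positions * 2 ** num_collectibles


def count_plans(num_actions, plan_length):
    """Distinct action sequences of a fixed length."""
    return num_actions ** plan_length


# Hypothetical game: 100 positions, 10 collectibles, 20 actions per step.
states = count_states(100, 10)   # 100 * 2**10
plans = count_plans(20, 6)       # 20**6 six-step action sequences
print(states, plans)
```

Even with these modest assumed numbers, there are about a hundred thousand states and tens of millions of six-step action sequences, which is why matching observed play against a small set of known plans becomes difficult.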

Applying this approach to games

We used this approach to understand players’ strategies and problem-solving patterns as they played a game called Wuzzit Trouble. Wuzzit Trouble is a commercial game developed by BrainQuake and released in 2013. It’s designed to teach arithmetic through symbolic and narrative-based arithmetic puzzles.

The goal of the game is to free creatures called Wuzzits from traps by collecting all the keys on a level. Players collect keys by turning cogs to bring the marker to the right position on the large wheel, as shown in the screenshot. For example, the game starts with the marker at the number 0. It needs to be moved to number 20 and to number 50 to obtain both of the keys needed to free the trapped Wuzzit.

Players accomplish this by rotating the large wheel clockwise or counterclockwise using the gears below it. The distance, or the number of units, that the large wheel moves depends on the gears. Each small cog can be turned up to five times to generate a five-step turn of the wheel, offering up to five opportunities to collect a key (or another item) with a single move. This is a critical gameplay mechanic to learn in order to free the Wuzzit in the smallest number of moves.
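The puzzle described above can be sketched as a small deterministic decision process: a state is the marker position plus the set of keys collected so far, and an action is a cog, a number of turns, and a direction. The sketch below is an illustration under stated assumptions, not the game’s actual implementation. The wheel size of 65 and the single cog worth 5 units are assumed values, and breadth-first search stands in for whatever solver one might prefer. Only the key positions (20 and 50), the starting position (0), and the five-turn limit come from the text.

```python
from collections import deque

WHEEL_SIZE = 65              # assumption: wheel positions numbered 0..64
COGS = (5,)                  # assumption: one hypothetical cog moving 5 units per turn
MAX_TURNS = 5                # from the text: a cog can be turned up to five times
KEYS = frozenset({20, 50})   # key positions from the example in the text
START = 0                    # from the text: the marker starts at 0


def step(position, collected, cog, turns, direction):
    """Apply one move: turn `cog` `turns` times in `direction` (+1 or -1).

    The marker lands on a number after every single turn, so each turn
    is a separate opportunity to pick up a key.
    """
    for _ in range(turns):
        position = (position + direction * cog) % WHEEL_SIZE
        if position in KEYS:
            collected = collected | {position}
    return position, collected


def actions():
    """Enumerate every (cog, turns, direction) move available to the player."""
    for cog in COGS:
        for turns in range(1, MAX_TURNS + 1):
            for direction in (+1, -1):
                yield cog, turns, direction


def fewest_moves():
    """Breadth-first search for a shortest plan that collects every key."""
    start = (START, frozenset())
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (position, collected), plan = queue.popleft()
        if collected == KEYS:
            return plan
        for action in actions():
            nxt = step(position, collected, *action)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, plan + [action]))
    return None  # no plan collects all keys with the given cogs


plan = fewest_moves()
print(len(plan), plan)
```

Under these assumed values, the search finds a two-move plan: the first five-turn move passes over 20 on its way to 25, and a second five-turn move ends on 50. Real players rarely follow such an optimal plan, which is part of what makes their strategies hard to predict.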
