Challenges in game data science

One of the central challenges in game data science is dealing with the high dimensionality of behavioral telemetry, which varies from genre to genre. Consider a typical Massively Multiplayer Online Game (MMOG): thousands of potential features can be captured for each individual player, nonplayer character, and mob, as well as for the game's systems and economies. This makes it hard to find behavioral patterns that yield insights to drive design, business, or marketing decisions. Such high-dimensional datasets are a common problem in game data science, and game data scientists deal with them in several ways.

Ways to abstract data

One way is to abstract the data, as discussed in the previous chapter, using techniques such as feature extraction, feature selection, or feature engineering. However, feature engineering can be problematic if we don’t have a good idea of how the data is structured. Clustering methods can address this problem, which brings us to the focus of this chapter.

Clustering methods

Clustering methods offer a way to explore datasets and discover patterns that can reduce the overall complexity of the data. Specifically, clustering is the task of grouping the elements of a set into subsets of closely related elements, called clusters. Using clustering methods for analysis is formally known as cluster analysis, and its outcome is a cluster model. Such models are used when running exploratory analysis, as defined earlier. They can also be used for hypothesis testing. However, classification methods, which we’ll discuss in the next chapter, are more commonly used for hypothesis testing and prediction analysis.
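To make the idea of grouping closely related elements concrete, here is a minimal sketch of one common clustering technique, k-means, written in plain Python. The lesson doesn't name a specific algorithm at this point, so k-means is our choice for illustration only. The `kmeans` helper, the two player features, and the sample values are all hypothetical.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Group points into k clusters with a basic k-means loop (illustrative only)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical per-player features: (hours played per week, purchases per month)
players = [
    (2.0, 0.0), (3.0, 1.0), (2.5, 0.0),      # low playtime, low spend
    (20.0, 5.0), (22.0, 6.0), (19.0, 4.0),   # high playtime, higher spend
]

labels, centroids = kmeans(players, k=2)
for player, label in zip(players, labels):
    print(player, "-> cluster", label)
```

Each player ends up with a cluster label, and the centroids summarize what a typical member of each cluster looks like. With real telemetry, we would typically scale the features first and rely on a tested library rather than a hand-written loop.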
