Sequence pattern mining is an area of data mining concerned with extracting frequently occurring patterns from sequence data. It has been applied successfully in several domains. In marketing studies, a closely related approach, known as itemset mining, is used for market analysis. Once extracted, these frequent patterns can tell us which items a customer is likely to buy, given that they bought item x.

Example

For example, someone who buys a car is more likely to buy insurance next. For software applications such as web search, it’s beneficial for the system to account for such frequent sequences so it can optimize or dynamically adjust website content.

How can we predict players' actions?

Similarly, one can think of many different uses of such functionality in games. If we know frequent patterns and can identify what the player is doing, then we can predict the probability of subsequent actions and optimize the game design accordingly. This can help us with many important design problems, such as churn, difficulty adjustment, content tuning to maximize engagement or interest, and segmentation of players according to their in-game activity.

Previously, we explored sequences through visualizations. Sequence pattern mining is another approach: it uses frequency to derive a set of patterns of different lengths. For each pattern identified, we can compute its level of support, which is the frequency of the pattern in the data. Intuitively, this means we can output a set of patterns along with how frequently each one appears in the data.
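To make the idea of support concrete, here is a minimal sketch. It assumes one common definition: support is the fraction of sequences that contain the pattern in order, with gaps allowed between events. The function names `contains_pattern` and `support` and the toy sessions are hypothetical and are not taken from the lesson's dataset.

```python
from typing import Iterable, List, Sequence


def contains_pattern(sequence: Iterable[str], pattern: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    events = iter(sequence)
    # `item in events` advances the iterator, so each pattern item must be
    # found after the position where the previous item was matched.
    return all(item in events for item in pattern)


def support(sequences: List[Sequence[str]], pattern: Sequence[str]) -> float:
    """Fraction of sequences that contain `pattern` (assumed definition)."""
    if not sequences:
        return 0.0
    hits = sum(contains_pattern(seq, pattern) for seq in sequences)
    return hits / len(sequences)


# Hypothetical toy sessions, for illustration only.
sessions = [
    ["Solo", "Solo", "Fight", "Solo"],
    ["Fight", "Solo"],
    ["Solo", "Fight", "Fight"],
]

# "Solo" followed later by "Fight" appears in the first and third sessions.
print(support(sessions, ["Solo", "Fight"]))
```

Other definitions of support, such as raw counts or contiguous matches only, are also in use; the choice changes which patterns count as frequent.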

What is sequence pattern mining?

Several methods have been developed to find frequent patterns in a given dataset. To understand how these methods work, let’s look at an example. The table below shows three sessions from the Dota 2 dataset. Notice that the format in this table is different from the formats we discussed above. Here, we have a Sequence ID (SID), which is mapped to the player ID in our data. The Event ID (EID) denotes the order of events within a sequence (SID). Therefore, each player (SID) has a sequence of events ordered by EID. In the table, SID 1 has four events: Solo, Solo, Fight, and Solo.
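The SID/EID layout can be turned into one ordered event list per player with a few lines of code. The sketch below is only an illustration: the rows for SID 1 follow the events listed in the text, while the rows for SID 2 and the helper name `build_sequences` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Each row is (SID, EID, event). Rows are deliberately out of order
# to show that EID, not row position, defines the sequence.
rows: List[Tuple[int, int, str]] = [
    (1, 2, "Solo"),
    (1, 1, "Solo"),
    (2, 1, "Fight"),  # hypothetical row, not from the lesson's table
    (1, 4, "Solo"),
    (1, 3, "Fight"),
    (2, 2, "Solo"),   # hypothetical row, not from the lesson's table
]


def build_sequences(rows: List[Tuple[int, int, str]]) -> Dict[int, List[str]]:
    """Group rows by SID and order each group's events by EID."""
    grouped: Dict[int, List[Tuple[int, str]]] = defaultdict(list)
    for sid, eid, event in rows:
        grouped[sid].append((eid, event))
    return {
        sid: [event for _, event in sorted(pairs, key=lambda p: p[0])]
        for sid, pairs in grouped.items()
    }


sequences = build_sequences(rows)
print(sequences[1])  # events of SID 1, ordered by EID
```

The resulting dictionary maps each SID to its ordered list of events, which is the shape most sequence mining methods expect as input.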
