Motivating Clustering
Explore how clustering transforms entity resolution from pairwise matching to collective classification. Understand graph-based methods to resolve conflicts, improve prediction accuracy, and handle dependencies between record matches for practical applications.
A typical entity resolution pipeline starts with preprocessing records
Collective entity resolution goes beyond isolated pairs and draws on the combined evidence of any number of records. It has two goals: improving classification accuracy and resolving conflicts between pairwise predictions that would otherwise make the output impractical to use.
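To make the idea of a conflict concrete, here is a minimal sketch of one common kind: a transitivity violation, where record a matches b and b matches c, but a and c are predicted as a non-match. The record ids, the `predictions` dictionary, and both helper functions are hypothetical names chosen for illustration, and treating unknown pairs as non-matches is an assumption.

```python
from itertools import combinations

# Hypothetical pairwise predictions: True means "predicted match".
predictions = {
    ("a", "b"): True,
    ("b", "c"): True,
    ("a", "c"): False,
}

def is_match(x, y, preds):
    """Look up a pair in either order; unknown pairs count as non-matches (assumption)."""
    return preds.get((x, y), preds.get((y, x), False))

def transitivity_conflicts(records, preds):
    """Return triples in which exactly two of the three pairs are predicted matches."""
    conflicts = []
    for triple in combinations(sorted(records), 3):
        matches = sum(is_match(x, y, preds) for x, y in combinations(triple, 2))
        if matches == 2:
            conflicts.append(triple)
    return conflicts

conflicts = transitivity_conflicts({"a", "b", "c"}, predictions)
print(conflicts)  # [('a', 'b', 'c')]
```

A triple like this cannot be turned into a consistent answer by looking at each pair in isolation: either all three records are the same entity or at least one of the two predicted matches is wrong.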
Clusters
Let’s reformulate our resolution task as a clustering problem on graphs. Starting from our pairwise predictions, we create a graph where nodes represent records
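A minimal sketch of this reformulation follows, under two assumptions that are not stated above: an edge connects two records whenever the pairwise model predicts a match, and each connected component of the graph is read as one entity. Connected components are only the simplest clustering choice and not necessarily the method this lesson settles on. The record ids and the `connected_components` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical record ids and pairs predicted as matches.
records = ["r1", "r2", "r3", "r4", "r5"]
predicted_matches = [("r1", "r2"), ("r2", "r3"), ("r4", "r5")]

def connected_components(nodes, edges):
    """Group nodes into clusters by following edges with a depth-first search."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    clusters = []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(component))
    return clusters

clusters = connected_components(records, predicted_matches)
print(clusters)  # [['r1', 'r2', 'r3'], ['r4', 'r5']]
```

Note that r1 and r3 end up in the same cluster even though no pairwise prediction links them directly. That is the collective step: the cluster assignment follows from the evidence of all three records together. The flip side is that a single false-positive edge can merge two otherwise separate clusters.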