
Targeting Uncertainty with Active Learning

Explore how active learning techniques help improve entity resolution models by identifying uncertain record pairs for labeling. Learn to efficiently use limited labeling budgets to enhance classification, particularly on challenging edge cases, ensuring better recall and overall performance in business data deduplication scenarios.

The costs of real-world entity resolution can grow out of control for several reasons, one of which is creating labels for model training. The number of candidate record pairs available for labeling is massive. How can we make the most of our typically limited budget?

In this lesson, we start from an existing training dataset and explore how to add a small batch of labels, keeping the added cost low while maximizing the improvement in the classification model's performance.
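As a rough illustration of the idea, here is a minimal sketch of uncertainty sampling, assuming a binary classifier that outputs a match probability for each unlabeled record pair. The function name `select_uncertain_pairs` and the example scores are hypothetical and not taken from the lesson; the lesson's own approach may differ.

```python
def select_uncertain_pairs(match_probabilities, batch_size):
    """Return indices of the pairs whose match probability is closest to 0.5."""
    # Uncertainty score: near 0.0 for a confident prediction, 0.5 for a coin flip.
    uncertainty = [0.5 - abs(p - 0.5) for p in match_probabilities]
    # Rank pair indices from most to least uncertain.
    ranked = sorted(
        range(len(uncertainty)),
        key=lambda i: uncertainty[i],
        reverse=True,
    )
    # Keep only as many pairs as the labeling budget allows.
    return ranked[:batch_size]


# Hypothetical model scores for six unlabeled record pairs (assumed data).
scores = [0.02, 0.48, 0.91, 0.55, 0.99, 0.30]
to_label = select_uncertain_pairs(scores, batch_size=2)
print(to_label)
```

The selected pairs are the ones the current model is least sure about, so labeling them is likely to be more informative than labeling pairs it already classifies with confidence.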

Training with existing labels

...