Targeting Uncertainty with Active Learning

Become familiar with active learning and how it helps improve binary classification.

We'll cover the following...

Training with existing labels
Retraining with additional labels
Key takeaway

Costs in real-world entity resolution can grow out of control for several reasons, one of which is creating labels for model training. The number of candidate record pairs available for labeling is massive. How can we make the most of a typically limited labeling budget?

In this lesson, we start from an existing training dataset and explore how to add a small batch of labels, keeping the added cost low while maximizing the improvement in classification model performance.
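
To make the idea of a small, targeted batch concrete, here is a minimal sketch of uncertainty sampling, one common active learning strategy. Whether the lesson uses exactly this selection rule is an assumption. The function name `select_uncertain_pairs`, the record pair ids, and the probabilities are all hypothetical; in practice the probabilities would come from a classifier trained on the existing labels.

```python
def select_uncertain_pairs(probabilities, batch_size):
    """Return the ids of the `batch_size` record pairs whose predicted
    match probability is closest to 0.5, i.e., the least certain ones."""
    ranked = sorted(
        probabilities,
        key=lambda pair_id: abs(probabilities[pair_id] - 0.5),
    )
    return ranked[:batch_size]


# Hypothetical match probabilities for unlabeled record pairs
# (illustrative values, not results from the lesson).
predicted = {
    ("a1", "b7"): 0.97,
    ("a2", "b3"): 0.52,
    ("a4", "b9"): 0.08,
    ("a5", "b1"): 0.41,
    ("a6", "b2"): 0.75,
}

# Spend the labeling budget on the two most uncertain pairs.
to_label = select_uncertain_pairs(predicted, batch_size=2)
print(to_label)  # [('a2', 'b3'), ('a5', 'b1')]
```

Pairs the model already scores near 0 or 1 are skipped, because labeling them is unlikely to change the decision boundary much. The budget goes to the pairs where a new label is most informative.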

Training with existing labels

...
