
Exercise: Indexing with Splink

Explore how to set up indexing with Splink to efficiently deduplicate data by applying SQL-based blocking rules. Learn to reduce comparison pairs in entity resolution tasks using Python, improving data accuracy and performance.
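To make the effect of blocking concrete, here is a minimal sketch in plain Python that does not use Splink itself. The records and the helper names `all_pairs` and `blocked_pairs` are hypothetical and are not taken from the exercise dataset. The sketch compares the number of candidate pairs with and without a blocking key; in Splink, the same idea would be written as a SQL condition such as `l.city = r.city`.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample records for illustration only (not from restaurants.csv).
records = [
    {"id": 1, "name": "Luigi's Trattoria", "city": "Rome"},
    {"id": 2, "name": "Luigis Trattoria", "city": "Rome"},
    {"id": 3, "name": "Sushi Go", "city": "Tokyo"},
    {"id": 4, "name": "Sushi-Go", "city": "Tokyo"},
    {"id": 5, "name": "Le Petit Bistro", "city": "Paris"},
]


def all_pairs(rows):
    """Every unordered pair of record ids: n * (n - 1) / 2 comparisons."""
    return list(combinations((r["id"] for r in rows), 2))


def blocked_pairs(rows, key):
    """Only pairs whose records share the same value for the blocking key."""
    blocks = defaultdict(list)
    for r in rows:
        blocks[r[key]].append(r["id"])
    pairs = []
    for ids in blocks.values():
        pairs.extend(combinations(ids, 2))
    return pairs


full = all_pairs(records)
blocked = blocked_pairs(records, "city")
print(f"without blocking: {len(full)} pairs, with blocking on city: {len(blocked)} pairs")
```

The trade-off is that a blocking key that is too strict can hide true duplicates whose key values differ (for example, a misspelled city), which is why several blocking rules are often combined.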


Let’s take one extra step in our Splink deduplication setup.

Task

The solvers_kitchen/restaurants.csv dataset is available in the environment and contains duplicates. ...