Grokking the Machine Learning Interview/

Training Data Generation

Let's generate training data for the entity linking problem.

There are two approaches you can adopt to gather training data for the entity linking problem.

Open-source datasets
Manual labeling

You can use one or both, depending on the particular task for which you have to perform entity linking.
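
As a rough illustration of using both sources together, the sketch below merges open-source and manually labeled examples into one training set. The record layout (`LinkedMention`, `TrainingExample`), the helper names, and the rule that a manual label overrides an open-source label for the same text are all assumptions made for this example, not something the lesson prescribes. The knowledge-base IDs are placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LinkedMention:
    """One labeled mention inside a text (hypothetical schema)."""
    start: int        # character offset where the mention begins
    end: int          # character offset one past the mention's last character
    entity_type: str  # tag such as "PERSON" or "ORG"
    kb_id: str        # placeholder ID of the linked knowledge-base entity


@dataclass
class TrainingExample:
    text: str
    mentions: List[LinkedMention]
    source: str  # "open_source" or "manual"


def make_mention(text: str, surface: str, entity_type: str, kb_id: str) -> LinkedMention:
    """Locate the first occurrence of `surface` in `text` and label it."""
    start = text.index(surface)  # raises ValueError if the surface form is absent
    return LinkedMention(start, start + len(surface), entity_type, kb_id)


def merge_sources(open_source: List[TrainingExample],
                  manual: List[TrainingExample]) -> List[TrainingExample]:
    """Combine both sources; for identical text, the manual label wins (assumed policy)."""
    by_text = {example.text: example for example in open_source}
    by_text.update({example.text: example for example in manual})
    return list(by_text.values())
```

A task that only needs general-purpose tags could pass an empty `manual` list, while a domain-specific task could rely mostly on manually labeled examples.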

Open-source datasets

If the task is not extremely domain-specific and does not require very specific tags, you can use open-source datasets as training data. For example, if you were asked to perform entity linking for a simple chatbot, you could use the general-purpose, open-source dataset ...
