Data vs. Model Bias
Learn about the primary sources of bias in machine learning.
Introduction to bias sources
Before diving into bias mitigation methods, it’s essential to analyze the potential sources of bias. Understanding these sources helps us make informed decisions when selecting appropriate metrics and mitigation methods. The primary sources of bias in machine learning fall into two broad categories:
Data bias: When the data fed to the model is biased, the model reproduces these biases.
Model bias: When the model or modeling technique itself introduces bias.
Data bias
Let’s begin by examining data bias:
Sampling bias: This occurs when data is collected in a way that makes the sample distribution differ from the population distribution. For example, an internet ...
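One simple way to check for sampling bias is to compare each group's share of the sample with its share of the population. The sketch below is only an illustration: the helper names (group_proportions, representation_gap), the "urban"/"rural" labels, and all the counts are hypothetical and do not come from a real dataset.

```python
from collections import Counter


def group_proportions(groups):
    """Return the share of each group label in a non-empty list of labels."""
    if not groups:
        raise ValueError("groups must not be empty")
    counts = Counter(groups)
    total = len(groups)
    return {group: count / total for group, count in counts.items()}


def representation_gap(sample_groups, population_groups):
    """Return sample share minus population share for each population group.

    A positive value means the group is over-represented in the sample;
    a negative value means it is under-represented.
    """
    sample_share = group_proportions(sample_groups)
    population_share = group_proportions(population_groups)
    return {
        group: sample_share.get(group, 0.0) - share
        for group, share in population_share.items()
    }


# Hypothetical population: half "urban", half "rural".
population = ["urban"] * 500 + ["rural"] * 500

# Hypothetical non-representative sample: 80% "urban", 20% "rural".
sample = ["urban"] * 80 + ["rural"] * 20

gaps = representation_gap(sample, population)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A gap near zero for every group suggests the sample is representative with respect to that attribute. It says nothing about attributes that were not compared, and in practice the population shares usually have to come from an external source such as census data.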