
ML Bias in Models

Learn to identify and quantify bias in machine learning models through fairness metrics such as demographic parity difference and equalized odds. This lesson helps you analyze disparities across demographic groups using statistical tests and guides you in applying bias mitigation strategies to enhance model fairness and ethical compliance.

Machine learning models can unintentionally reinforce societal biases if fairness isn't measured. In this lesson, we'll quantify bias using fairness metrics and explore how they apply to important domains like finance and criminal justice. Let’s get started.

Demographic parity analysis

Suppose you have developed a machine learning classifier for loan approval with features including income, credit score, age, and race. Calculate the demographic parity difference for two protected groups (e.g., male and female applicants) if the overall approval rate is 60%, the approval rate for male applicants is 70%, and the approval rate for female applicants is 45%. Discuss how this metric reveals potential bias in the model and suggest potential mitigation strategies.

Python 3.10.4
import numpy as np
from scipy.stats import chi2_contingency

def demographic_parity_analysis(
    total_approval_rate=0.60,
    female_approval_rate=0.45,
    male_approval_rate=0.70
):
    # TODO: your implementation
    return {
        "demographic_parity_difference": ...,
        "chi2_statistic": ...,
        "p_value": ...,
        "bias_threshold": ...,
        "is_biased": ...
    }

outputs = demographic_parity_analysis()
print(outputs)

Sample answer

Demographic parity difference is a fairness metric used to evaluate whether a machine learning model's predictions are equally distributed across demographic groups. It measures the difference in positive prediction rates (e.g., approval rates, acceptance rates) between those groups; the goal is a model that does not favor one group over another. Let's look at a code example.

Python 3.10.4
import numpy as np
from scipy.stats import chi2_contingency

def demographic_parity_analysis(
    total_approval_rate=0.60,  # overall approval rate (not needed for the parity calculation)
    female_approval_rate=0.45,
    male_approval_rate=0.70
):
    # Demographic parity difference: absolute gap in approval rates between groups
    demographic_parity_diff = abs(male_approval_rate - female_approval_rate)

    # Statistical significance test: simulate approval counts at an assumed sample size
    total_sample_size = 1000
    female_sample = np.random.binomial(total_sample_size, female_approval_rate)
    male_sample = np.random.binomial(total_sample_size, male_approval_rate)

    # Chi-square test of independence on the approved/denied contingency table
    contingency_table = np.array([
        [female_sample, total_sample_size - female_sample],
        [male_sample, total_sample_size - male_sample]
    ])
    chi2, p_value = chi2_contingency(contingency_table)[:2]

    return {
        "demographic_parity_difference": demographic_parity_diff,
        "chi2_statistic": chi2,
        "p_value": p_value,
        "bias_threshold": 0.2,
        "is_biased": demographic_parity_diff > 0.2
    }

outputs = demographic_parity_analysis()
print(outputs)

The function demographic_parity_analysis analyzes potential bias in model predictions by calculating key fairness indicators: the demographic parity difference between the two groups, a chi-square test of independence to check whether the gap is statistically significant, and a boolean flag that compares the difference against a bias threshold.
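
Demographic parity is only one of the metrics mentioned at the start of this lesson. Equalized odds instead conditions on the true outcome: it compares true positive rates and false positive rates across groups. Here is a minimal sketch of how it could be computed; the synthetic labels, predictions, and group array below are hypothetical stand-ins for real model outputs, not part of the exercise above.

Python 3.10.4
import numpy as np

def equalized_odds_gaps(y_true, y_pred, group):
    # Compare TPR and FPR between two groups (coded 0 and 1)
    rates = {}
    for g in (0, 1):
        mask = group == g
        tpr = y_pred[mask & (y_true == 1)].mean()  # true positive rate
        fpr = y_pred[mask & (y_true == 0)].mean()  # false positive rate
        rates[g] = (tpr, fpr)
    return {
        "tpr_difference": abs(rates[0][0] - rates[1][0]),
        "fpr_difference": abs(rates[0][1] - rates[1][1])
    }

# Hypothetical synthetic data, for illustration only
rng = np.random.default_rng(42)
y_true = rng.integers(0, 2, 1000)
y_pred = rng.integers(0, 2, 1000)
group = rng.integers(0, 2, 1000)
print(equalized_odds_gaps(y_true, y_pred, group))

When both gaps are close to zero, the model errs at similar rates for both groups, which demographic parity alone cannot guarantee.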
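
The exercise also asks for mitigation strategies. One common post-processing approach is to choose group-specific decision thresholds so approval rates line up across groups. The sketch below assumes the model produces continuous scores; the score and group arrays are hypothetical placeholders.

Python 3.10.4
import numpy as np

def group_thresholds(scores, group, target_rate=0.6):
    # Pick a per-group score cutoff so roughly target_rate of each group is approved
    thresholds = {}
    for g in np.unique(group):
        thresholds[g] = np.quantile(scores[group == g], 1 - target_rate)
    return thresholds

# Hypothetical model scores and protected attribute, for illustration only
rng = np.random.default_rng(42)
scores = rng.uniform(size=1000)
group = rng.integers(0, 2, 1000)

thresholds = group_thresholds(scores, group)
cutoffs = np.array([thresholds[g] for g in group])
approved = scores >= cutoffs
for g in np.unique(group):
    print(f"group {g}: threshold={thresholds[g]:.3f}, approval rate={approved[group == g].mean():.2f}")

Equalizing approval rates this way drives the demographic parity difference toward zero, but it can trade off accuracy, so any threshold adjustment should be validated against business and legal constraints.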