Reweighing

Learn how to implement one of the most widely applicable bias mitigation algorithms.

What is reweighing?

Reweighing is a preprocessing bias mitigation method: it leaves the features and labels untouched and only changes how much each training example counts. The main goal is to assign greater weights to samples from underrepresented subgroups, so that the model treats them as more significant during training. Because minority samples are fewer, giving them larger weights balances their overall contribution, making every subgroup equally important.
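To make the balancing idea concrete, here is a minimal sketch with made-up numbers. The group sizes and the simple inverse-frequency weighting are illustrative assumptions, not the reweighing formula this lesson introduces later. It only shows that larger weights on a smaller group can equalize the groups' total contribution.

```python
# Hypothetical group sizes: one majority and one minority group.
group_sizes = {"majority": 90, "minority": 10}

total = sum(group_sizes.values())
n_groups = len(group_sizes)

# Illustrative inverse-frequency weights (an assumption for this sketch):
# every group should account for an equal share of the total weight.
weights = {g: total / (n_groups * n) for g, n in group_sizes.items()}

# Total contribution of a group = number of samples * weight per sample.
contribution = {g: group_sizes[g] * weights[g] for g in group_sizes}

print(weights)
print(contribution)
```

Each minority sample receives a larger weight than each majority sample, and the two groups end up with the same total contribution.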

Example

Let’s see how it works with an example. Imagine we have two possible classes (admitted, rejected) and three groups. Because the groups differ in size, many models might mostly learn the correct behavior for the majority groups, which can result in discrimination against smaller subgroups.

First, we need to group training examples by a selected sensitive attribute. Let’s focus on the Space University admission problem. Our protected attribute is species, and the target is admission status. For each group, we count the number of observations with respect to the target class. We use two classes here, but there can be more; the procedure is exactly the same. Grouping results in a table like this:

Admission Dataset

Sensitive attribute (species)    Admitted    Rejected
Human                            30          200
Xeno                             300         100
Ferro                            80          10
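The grouping step can be sketched in plain Python. The record layout (a list of `(species, status)` pairs) and all variable names are assumptions made for this illustration; the toy dataset is rebuilt from the counts in the Admission Dataset table.

```python
from collections import Counter

# Rebuild a toy dataset with the same counts as the Admission Dataset table.
# Each record is a (species, status) pair; this layout is an assumption.
cell_counts = {
    ("Human", "Admitted"): 30, ("Human", "Rejected"): 200,
    ("Xeno", "Admitted"): 300, ("Xeno", "Rejected"): 100,
    ("Ferro", "Admitted"): 80, ("Ferro", "Rejected"): 10,
}
records = [cell for cell, n in cell_counts.items() for _ in range(n)]

# Group by the sensitive attribute and the target class, then count.
counts = Counter(records)

species = ["Human", "Xeno", "Ferro"]
statuses = ["Admitted", "Rejected"]

# Print the counts as a species-by-status table.
print(f"{'Species':<10}" + "".join(f"{s:>10}" for s in statuses))
for sp in species:
    print(f"{sp:<10}" + "".join(f"{counts[(sp, st)]:>10}" for st in statuses))
```

With more than two target classes, only the `statuses` list and the input records would change; the counting itself stays the same.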

We can easily spot that the number of rejected Ferros is much lower than in the other groups, both in absolute terms (10) and as a share of the group (about 11%). This could lead the model to favor them, because a Ferro is unlikely to be rejected. On the other hand, only about 13% of humans (30 of 230) are admitted. The dataset is clearly imbalanced. Even worse, the imbalance points in different directions for different groups. Reweighing will help models consider each cell in the table as equally important.
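The imbalance can be checked with a few lines of Python. This is a minimal sketch: the dictionary layout and names are assumptions, and the numbers come from the Admission Dataset table.

```python
# Counts from the Admission Dataset table: species -> (admitted, rejected).
table = {
    "Human": (30, 200),
    "Xeno": (300, 100),
    "Ferro": (80, 10),
}

# Share of admitted applicants within each species.
admission_rate = {
    sp: admitted / (admitted + rejected)
    for sp, (admitted, rejected) in table.items()
}

for sp, rate in admission_rate.items():
    print(f"{sp}: {rate:.0%} admitted, {1 - rate:.0%} rejected")
```

Humans are admitted about 13% of the time, Xenos 75%, and Ferros about 89%, so the skew favors rejection for one group and admission for the other two.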

Our table contains six cells with numbers (as there are six combinations of species/admission status). Therefore, we need to compute six weights. All examples from a specific partition will receive the same weight. Let’s calculate the weight for the first cell: admitted humans. We use the following formula:

Where:

  • W(G,C ...