Audit Methodology: How to Check a Dataset
A data audit is a structured process to find bias, errors, and ethical issues before a model goes live. Every creator should know how.
Section 1: Not Optional
Data audits went from a nice-to-have to a legal requirement in many jurisdictions. The EU AI Act (in force from 2024), New York City's AEDT law (2023), and various sectoral rules require documented audits for high-risk systems. Beyond legal compliance, audits save companies from shipping embarrassing failures.
A six-step audit process
1. Scope the audit: what question are we trying to answer?
2. Profile the data: summary statistics per column, per group
3. Test for bias: disaggregated metrics across protected attributes
4. Probe for edge cases: long-tail inputs, adversarial tests
5. Document findings: data card, audit report, known limitations
6. Plan remediation: what to fix, what to defer, what to communicate
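Step 2 can be sketched in a few lines of pandas. The dataset and column names below are invented for illustration; the point is that "profiling" just means summary statistics, computed overall and then again per subgroup.

```python
import pandas as pd

# Hypothetical loan-application data; columns are illustrative only
df = pd.DataFrame({
    'income':   [42_000, 85_000, 31_000, 67_000, 54_000, 29_000],
    'approved': [0, 1, 0, 1, 1, 0],
    'gender':   ['F', 'M', 'F', 'M', 'F', 'M'],
})

# Step 2a: per-column summary statistics for the whole dataset
overall = df.describe(include='all')

# Step 2b: the same kind of summary, disaggregated by a protected attribute
by_group = df.groupby('gender')['approved'].agg(['mean', 'count'])
print(by_group)
```

A large gap between the per-group means here is not yet a verdict, but it tells you exactly where the bias tests in step 3 should look first.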
Tools of the trade
- pandas-profiling / ydata-profiling: automatic statistical summary
- Aequitas: bias audit tool from University of Chicago
- Fairlearn: Microsoft's fairness assessment library
- What-If Tool: Google's interactive fairness explorer
- Model Card Toolkit: structured reporting for models and data
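Under the hood, these tools all compute simple disaggregated statistics. Here is a minimal plain-Python sketch of one core bias metric they report, the selection-rate ratio behind the common "four-fifths rule" (the group names and predictions are invented for illustration):

```python
# Hypothetical model outputs: 1 = selected (e.g. loan approved), keyed by group
predictions = {
    'group_a': [1, 1, 0, 1, 0, 1, 1, 0],   # 5/8 selected
    'group_b': [1, 0, 0, 0, 1, 0, 0, 0],   # 2/8 selected
}

# Selection rate per group: fraction of that group the model selected
rates = {g: sum(p) / len(p) for g, p in predictions.items()}

# Disparate-impact ratio: lowest selection rate over highest.
# The "four-fifths rule" treats values below 0.8 as a red flag.
ratio = min(rates.values()) / max(rates.values())
print(rates, round(ratio, 2))
```

The dedicated libraries add the important parts this sketch omits: confidence intervals, many metrics at once, and intersectional groups.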
A concrete audit snippet
Disaggregated metrics with Fairlearn
import pandas as pd
from fairlearn.metrics import MetricFrame, selection_rate, false_positive_rate
from sklearn.metrics import accuracy_score

df = pd.read_csv('loan_predictions.csv')
# Required columns: y_true, y_pred, gender, race

metrics = {
    'accuracy': accuracy_score,
    'selection_rate': selection_rate,
    'false_positive_rate': false_positive_rate,
}

mf = MetricFrame(
    metrics=metrics,
    y_true=df['y_true'],
    y_pred=df['y_pred'],
    sensitive_features=df[['gender', 'race']],
)

# One row per (gender, race) subgroup, one column per metric
print(mf.by_group)

# Largest gap between any two subgroups, per metric
print('Max-min accuracy gap:', mf.difference(method='between_groups'))

What to include in an audit report
- Scope and limitations (what was NOT checked)
- Data provenance: how it was collected
- Summary statistics, including per-subgroup
- Bias test results: which tests, which metrics, which thresholds
- Identified harms: concrete scenarios that could go wrong
- Remediation status: fixed, deferred, or accepted risk
- Reviewer sign-offs
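The checklist above maps directly onto a report skeleton you can generate and fill in. A minimal sketch; the section names follow the list, and everything else is a placeholder:

```python
# Sections mirror the audit-report checklist above
SECTIONS = [
    'Scope and limitations',
    'Data provenance',
    'Summary statistics',
    'Bias test results',
    'Identified harms',
    'Remediation status',
    'Reviewer sign-offs',
]

def report_skeleton(title: str) -> str:
    """Render an empty audit report in Markdown, one heading per section."""
    lines = [f'# Audit report: {title}', '']
    for section in SECTIONS:
        lines += [f'## {section}', '', '_TODO_', '']
    return '\n'.join(lines)

print(report_skeleton('loan_predictions.csv'))
```

Starting every audit from the same skeleton makes reports comparable across projects and makes an empty section (something nobody checked) impossible to hide.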
The big idea: audits make the invisible visible. They are the hygiene step between cleverness and responsibility. Any serious ML deployment should ship with one.
