What is disparate impact in AI
Disparate impact in AI refers to situations where an algorithm produces outcomes that disproportionately disadvantage certain protected groups, even when the model does not use protected attributes explicitly. It is a key concept in AI ethics and fairness because it highlights indirect discrimination introduced through data or design.
Disparate impact is an AI fairness concept that describes an algorithm unintentionally causing unequal adverse effects on protected groups despite neutral intent.
How it works
Disparate impact occurs when an AI system's decisions disproportionately affect groups defined by sensitive attributes like race, gender, or age, even if those attributes are not explicitly used. This happens because features in the training data can act as proxy variables that correlate with group membership, causing indirect discrimination. Think of a hiring algorithm that favors candidates from certain zip codes: because zip codes correlate with socioeconomic status and race, the algorithm can unintentionally disadvantage minorities.
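One way to see how a proxy variable carries group information is to measure how group membership is distributed across the values of a feature. The following is a minimal sketch under stated assumptions: the records, the `group` labels "A" and "B", the zip codes, and the helper name `group_share_by_feature` are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical applicant records (made-up values for illustration only).
# 'group' stands in for a sensitive attribute; 'zip_code' is a candidate proxy.
applicants = [
    {"zip_code": "11111", "group": "A"},
    {"zip_code": "11111", "group": "A"},
    {"zip_code": "11111", "group": "B"},
    {"zip_code": "22222", "group": "B"},
    {"zip_code": "22222", "group": "B"},
    {"zip_code": "22222", "group": "B"},
]


def group_share_by_feature(records, feature, sensitive, target_group):
    """Return the share of `target_group` within each value of `feature`."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for record in records:
        value = record[feature]
        totals[value] += 1
        if record[sensitive] == target_group:
            hits[value] += 1
    return {value: hits[value] / totals[value] for value in totals}


shares = group_share_by_feature(applicants, "zip_code", "group", "B")
for zip_code, share in shares.items():
    print(f"Zip code {zip_code}: {share:.2f} of applicants are in group B")
```

If the share differs sharply between feature values, as it does in this toy data, a model can effectively learn group membership from the feature even though the sensitive attribute itself is never an input.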
Concrete example
Consider a loan approval AI model that uses features like income, employment history, and credit score. Suppose the model does not use race explicitly but uses zip code, which correlates with racial demographics. This can cause disparate impact if minority applicants from certain zip codes are denied loans at a higher rate.
```python
# Sample loan decisions by zip code
loan_decisions = [
    {'zip_code': '12345', 'approved': True},
    {'zip_code': '12345', 'approved': False},
    {'zip_code': '67890', 'approved': False},
    {'zip_code': '67890', 'approved': False},
    {'zip_code': '67890', 'approved': True},
]

# Count approvals and total applications per zip code
approvals_by_zip = {}
for decision in loan_decisions:
    zip_code = decision['zip_code']
    if zip_code not in approvals_by_zip:
        approvals_by_zip[zip_code] = {'approved': 0, 'total': 0}
    if decision['approved']:
        approvals_by_zip[zip_code]['approved'] += 1
    approvals_by_zip[zip_code]['total'] += 1

# Calculate and print the approval rate for each zip code
for zip_code, counts in approvals_by_zip.items():
    rate = counts['approved'] / counts['total']
    print(f"Zip code {zip_code} approval rate: {rate:.2f}")
```
Output:
```
Zip code 12345 approval rate: 0.50
Zip code 67890 approval rate: 0.33
```
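The per-zip-code approval rates can be condensed into a single disparate impact ratio: the lowest group's approval rate divided by the highest group's. The sketch below reuses the same five sample decisions. The helper names `approval_rates` and `disparate_impact_ratio` are hypothetical, and the 0.8 cutoff is an assumption borrowed from the "four-fifths" rule of thumb used in US employment selection guidelines. Applying it to lending, and treating zip code as a stand-in for a protected group, is illustrative only.

```python
# Same sample decisions as in the example above.
decisions = [
    {"zip_code": "12345", "approved": True},
    {"zip_code": "12345", "approved": False},
    {"zip_code": "67890", "approved": False},
    {"zip_code": "67890", "approved": False},
    {"zip_code": "67890", "approved": True},
]

# Assumed threshold (four-fifths rule of thumb); not a universal legal standard.
FOUR_FIFTHS_THRESHOLD = 0.8


def approval_rates(records, group_key):
    """Return the approval rate for each value of `group_key`."""
    counts = {}
    for record in records:
        group = counts.setdefault(record[group_key], {"approved": 0, "total": 0})
        group["approved"] += int(record["approved"])
        group["total"] += 1
    return {key: c["approved"] / c["total"] for key, c in counts.items()}


def disparate_impact_ratio(rates):
    """Return the lowest group rate divided by the highest group rate."""
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("No group has any approvals; the ratio is undefined.")
    return min(rates.values()) / highest


rates = approval_rates(decisions, "zip_code")
ratio = disparate_impact_ratio(rates)
print(f"Disparate impact ratio: {ratio:.2f}")
if ratio < FOUR_FIFTHS_THRESHOLD:
    print("Ratio is below the assumed 0.8 threshold; review for disparate impact.")
```

With only five records the ratio is far too noisy to support any conclusion. A real audit would need many more decisions per group, and a ratio below the threshold is a signal to investigate rather than proof of discrimination.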
When to use it
Use disparate impact analysis when deploying AI systems in high-stakes domains like hiring, lending, healthcare, or criminal justice to detect indirect discrimination. It is essential wherever fairness toward protected groups is legally or ethically required. Left unaddressed, disparate impact can harm the affected groups and create legal liability.
Key terms
| Term | Definition |
|---|---|
| Disparate impact | Disproportionate adverse effects that an AI system causes for protected groups without using protected attributes explicitly. |
| Protected groups | Groups defined by characteristics that are legally protected from discrimination, e.g., race, gender, age. |
| Proxy variable | A feature correlated with a sensitive attribute that can cause indirect bias. |
| Fairness | The principle that AI decisions should not unfairly disadvantage any group. |
Key takeaways
- Disparate impact reveals indirect discrimination in AI even without explicit bias in features.
- Analyzing approval rates or outcomes by group helps detect disparate impact.
- Address disparate impact proactively in regulated or sensitive AI applications to ensure fairness.