CERIAS 2025 Annual Security Symposium


2025 Symposium Posters


Unlearning Machine Learning Bias using Task Vector Arithmetic



Primary Investigator:
Romila Pradhan

Project Members
Omkar Pote, Romila Pradhan
Abstract
Machine learning models have become integral to decision-making in fields such as criminal justice, finance, and healthcare. However, these models often inherit biases present in their training data, leading to unfair and unethical outcomes, particularly for marginalized groups. Recent work in natural language processing has hypothesized that bias occupies a linear subspace of word embeddings. We evaluate whether this idea carries over to the weights of models trained on structured data, and introduce a novel task-arithmetic-based approach to unlearning bias in models trained on tabular datasets. Our method fine-tunes a model on high-bias data, computes a bias task vector from the resulting change in weights, and subtracts that vector from the original model's weights to mitigate the unwanted bias. Our evaluations show that this approach is competitive with state-of-the-art bias mitigation techniques, significantly improving fairness on several metrics with minimal accuracy loss.
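The three steps named in the abstract (fine-tune on high-bias data, compute a bias task vector, subtract it from the original model) can be illustrated with a minimal sketch. This is not the authors' implementation: the weight layout (a dictionary of flat lists), the function names `task_vector` and `subtract_task_vector`, the toy numbers, and the scaling coefficient `alpha` are all assumptions made for illustration. The fine-tuning step itself is not shown; the "biased" weights are simply given.

```python
from typing import Dict, List

# Hypothetical weight layout: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def task_vector(base: Weights, finetuned: Weights) -> Weights:
    """Per-parameter difference (finetuned - base)."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def subtract_task_vector(base: Weights, tau: Weights, alpha: float = 1.0) -> Weights:
    """Return base - alpha * tau; alpha is an assumed scaling knob."""
    return {
        name: [b - alpha * t for b, t in zip(base[name], tau[name])]
        for name in base
    }


# Toy weights (made up): the original model, and the same model after
# fine-tuning on a high-bias subset of the data.
theta_base = {"w": [0.5, -1.0, 2.0], "b": [0.25]}
theta_biased = {"w": [0.5, -0.5, 2.0], "b": [0.5]}

# Direction in weight space that fine-tuning on biased data moved the model.
tau_bias = task_vector(theta_base, theta_biased)

# Move the original model in the opposite direction.
theta_debiased = subtract_task_vector(theta_base, tau_bias, alpha=0.5)
```

In practice a coefficient like `alpha` would presumably be tuned on held-out data to trade fairness against accuracy; the abstract does not say how, or whether, the subtraction is scaled.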