Paper: Fairness Guarantees under Demographic Shift

Full Abstract: Recent studies found that using machine learning for social applications can lead to injustice in the form of racist, sexist, and otherwise unfair and discriminatory outcomes. To address this challenge, recent machine learning algorithms have been designed to limit the likelihood such unfair behavior occurs. However, these approaches typically assume the data used for training is representative of what will be encountered in deployment, which is often untrue. In particular, if certain subgroups of the population become more or less probable in deployment (a phenomenon we call demographic shift), prior work’s fairness assurances are often invalid. In this paper, we consider the impact of demographic shift and present a class of algorithms, called Shifty algorithms, that provide high-confidence behavioral guarantees that hold under demographic shift when data from the deployment environment is unavailable during training. Shifty, the first technique of its kind, demonstrates an effective strategy for designing algorithms to overcome demographic shift’s challenges. We evaluate Shifty using the UCI Adult Census dataset (Kohavi and Becker, 1996), as well as a real-world dataset of university entrance exams and subsequent student success. We show that the learned models avoid bias under demographic shift, unlike existing methods. Our experiments demonstrate that our algorithm’s high-confidence fairness guarantees are valid in practice and that our algorithm is an effective tool for training models that are fair when demographic shift occurs.

General Summary: Research has shown that machine learning algorithms can produce unfair (e.g., racist or sexist) predictions, spurring the creation of machine learning algorithms with various fairness guarantees. However, prior fairness-aware machine learning algorithms assume that the distribution of people the system encounters once deployed will match the distribution of people in the historical data used to train it. This assumption fails, for example, when data collected in one city is used to train a system that will make predictions about people in a different city. In this paper, we study this problem, which we call “demographic shift”, and we create the first machine learning algorithms that are provably fair in the presence of demographic shift.
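
Illustrative sketch (not from the paper): the short Python example below shows why a fairness measurement taken on training data may not carry over to deployment when subgroup proportions change. It is not the Shifty algorithm and provides no guarantee of any kind. It assumes that a model's behavior within each demographic subgroup stays fixed and that only the subgroup proportions change; the attribute names, subgroup labels, and all numbers are hypothetical.

```python
# Illustrative only: every name and number here is hypothetical.
# T   = demographic attribute whose marginal distribution shifts ("A" or "B")
# sex = attribute that the fairness metric compares ("f" or "m")

def positive_rate(sex, rates, sex_given_t, t_marginal):
    """P(prediction = 1 | sex) under a given marginal distribution over T.

    Assumes the quantities conditioned on T (rates, sex_given_t) do not
    change between training and deployment; only t_marginal does.
    """
    num = sum(rates[(sex, t)] * sex_given_t[t][sex] * p for t, p in t_marginal.items())
    den = sum(sex_given_t[t][sex] * p for t, p in t_marginal.items())
    return num / den

def parity_gap(rates, sex_given_t, t_marginal):
    """Absolute difference in positive-prediction rates between the two sexes."""
    return abs(
        positive_rate("f", rates, sex_given_t, t_marginal)
        - positive_rate("m", rates, sex_given_t, t_marginal)
    )

# Positive-prediction rate of a fixed model for each (sex, T) subgroup.
rates = {("f", "A"): 0.5, ("m", "A"): 0.5, ("f", "B"): 0.2, ("m", "B"): 0.6}
# P(sex | T), assumed identical in training and deployment.
sex_given_t = {"A": {"f": 0.5, "m": 0.5}, "B": {"f": 0.5, "m": 0.5}}

train_marginal = {"A": 0.9, "B": 0.1}    # subgroup B is rare in training data
deploy_marginal = {"A": 0.5, "B": 0.5}   # subgroup B is more common in deployment

train_gap = parity_gap(rates, sex_given_t, train_marginal)
deploy_gap = parity_gap(rates, sex_given_t, deploy_marginal)

print(f"gap on training distribution:   {train_gap:.2f}")   # 0.04
print(f"gap on deployment distribution: {deploy_gap:.2f}")  # 0.20
```

In this toy setting the same model looks nearly fair on the training distribution and markedly less fair after the shift, which is the failure mode the summary describes.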

Conference name: International Conference on Learning Representations

Conference abbreviation: ICLR

Link to Paper: https://people.cs.umass.edu/~pthomas/papers/Giguere2022.pdf