T. Galstyan and H. Khachatrian
In many applications of machine learning, it is desirable to have models that not only achieve good accuracy on the prediction task but are also “fair” with respect to some protected variable. One approach to achieving fairness is to learn a representation of the data that is invariant to that variable and then train the predictor on top of that representation. Recently, an information-theoretic approach called DSF (Discovery and Separation of Features) was introduced, which demonstrated strong results in cases where the label and the protected variable are independent. In this paper we extend the model to work in cases where the protected variable is correlated with the label. We perform experiments on a small image classification dataset and show that our model enables significantly better tradeoffs between accuracy and fairness.