Fair, Accountable and Transparent Machine Learning

In this short paper I introduce the concept of Fair, Accountable and Transparent machine learning (FAT ML). I explain what the problems surrounding FAT ML are, why these problems are important and finally, how these problems can be tackled.

What is the problem?

Fairness, Accountability and Transparency

Machine learning is often used to make important decisions. However, these algorithms sometimes make incorrect decisions, and without the ability to understand them it's difficult to hold anyone responsible. A lot of machine learning acts as a black box: we're unable to see what's going on inside. This is dangerous, considering how much impact these algorithms have on society. Someone should be accountable for these algorithms.

An important part of this process is transparency. The people accountable for an algorithm should be able to explain why it made a particular decision. Therefore, we need to know what data the algorithm was trained on and how it used that data to reach its decision.

Furthermore, there are increasing reports of machine learning models making unfair or, in some cases, discriminatory predictions. At first glance one might think that a machine learning algorithm would be completely unbiased, non-discriminatory and fair in its judgement. However, as described by Karen Hao in an article published by the MIT Technology Review [1], bias can enter during the training phase of the algorithm in at least two ways. Firstly, if the data is unrepresentative of the reality it is trying to predict, the model will inevitably be biased. As an example, imagine that we train a neural network to do facial recognition and we predominantly train it on a data set containing light-skinned people. As a result, the model will be worse at recognizing dark-skinned people. This is one way that bias can creep into a seemingly innocent AI model. Secondly, a dataset can contain underlying prejudice, which is what happened when Amazon tried to create a model to assess candidates in their recruitment process [2]. In this case, a model was trained on recruitment data that historically favored men over women, and the model learned to reflect this discriminatory reality and started to dismiss female candidates.
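
To make the first failure mode concrete, here is a minimal sketch (entirely synthetic data and scikit-learn, not taken from the articles above): a single classifier is trained on data dominated by one group, and its accuracy on the under-represented group suffers.

```python
# Minimal sketch of bias from an unrepresentative training set.
# All groups, numbers and features are made up for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_group(n, shift):
    """Two-feature classification data; `shift` moves this group's class boundary."""
    X = rng.normal(loc=shift, scale=1.0, size=(n, 2))
    y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=n) > 2 * shift).astype(int)
    return X, y

# Group A dominates the training set; group B is barely represented.
Xa_train, ya_train = make_group(5000, shift=0.0)
Xb_train, yb_train = make_group(100, shift=1.5)
X_train = np.vstack([Xa_train, Xb_train])
y_train = np.concatenate([ya_train, yb_train])

model = LogisticRegression().fit(X_train, y_train)

# Evaluate on balanced, held-out data for each group:
# the model does well on group A but much worse on group B.
Xa_test, ya_test = make_group(2000, shift=0.0)
Xb_test, yb_test = make_group(2000, shift=1.5)
print("accuracy on group A:", accuracy_score(ya_test, model.predict(Xa_test)))
print("accuracy on group B:", accuracy_score(yb_test, model.predict(Xb_test)))
```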

Why is it a problem?

Without accountability there will be no consequences for decisions made by machine learning algorithms. This can lead to developers disregarding the effects of their algorithms. For example, one of the proposed consequences of social media, smartphones and websites like YouTube is that they negatively affect users' attention spans, which have been declining over the past decade. Companies should be held responsible for the consequences brought about by their machine learning algorithms.

Fundamentally, the problem of transparency arises because the input and output spaces of machine learning algorithms are so large. Since we can't test every possible input, we can't fully understand how the results depend on the inputs. We also need to know what data the algorithm was trained on. Customers want to know how their data is used and how much their privacy is infringed upon. For example, many people don't know that when they type something into Google, even without pressing search, Google will save what they typed.
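
Since exhaustive testing is impossible, practitioners usually fall back on probing a model instead. The sketch below shows one generic probe, permutation importance, which estimates how strongly the output depends on each input by shuffling one feature at a time; it uses synthetic data and a standard technique, not anything specific to this post.

```python
# Sketch: probing how a model's output depends on its inputs without
# enumerating every possible input. Synthetic, illustrative data only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=2000, n_features=5, n_informative=3, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Shuffle one feature at a time and measure how much accuracy drops:
# a large drop means the predictions depend heavily on that feature.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i, imp in enumerate(result.importances_mean):
    print(f"feature {i}: importance {imp:.3f}")
```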

As ML models are used more and more in large companies, and as their decisions have more and more influence on our lives, it becomes very important that these models are fair in their predictions. If unfair ML models are used in recruitment, for assessing loan eligibility, or in any other process that affects people on an individual level, they could increase disparities that already exist in society. For example, if an ML model used by a bank favors white loan applicants over dark-skinned applicants, using this model would further widen the gap between these demographic groups. So if our goal is to make society more fair, we have to ensure that the models used by companies with a large impact are fair.
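
One simple way to check for this kind of unfairness is to compare decision rates across demographic groups, often called demographic parity. The sketch below uses made-up placeholder decisions just to show the calculation.

```python
# Minimal demographic-parity check: compare the rate of positive decisions
# (e.g. loan approvals) across groups. The arrays are made-up placeholders.
import numpy as np

# 1 = approved, 0 = rejected; group labels identify the demographic group.
decisions = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
groups = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

rates = {g: decisions[groups == g].mean() for g in np.unique(groups)}
print("approval rates:", rates)

# Demographic parity difference: 0 means both groups are approved equally often.
print("parity gap:", abs(rates["A"] - rates["B"]))
```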

Solutions

New regulations and laws are required to hold companies accountable for the consequences of their use of machine learning algorithms. One important step is to allow public auditing of the systems used; they shouldn't be kept secret when they have such a large impact on society.

In order to make ML algorithms more transparent, companies have to be able to explain their algorithms. Essential information, such as who the stakeholders and end-users are and how the algorithm works, needs to be available. The source of the data and what it contains also has to be disclosed. New methods of model extraction [3] have made it easier to understand machine learning algorithms: the approach promises predictive performance comparable to the original neural network while being far more interpretable.
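
To give a rough flavour of the model-extraction idea, the sketch below trains a "black box" neural network and then fits a small decision tree to its predictions, measuring how faithfully the tree mimics it. This is a simplified illustration in scikit-learn, not the exact algorithm from [3].

```python
# Simplified model-extraction sketch: approximate a black-box model
# with a small, inspectable decision tree. Synthetic data only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=3000, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The "black box" we want to explain.
black_box = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=500, random_state=0)
black_box.fit(X_train, y_train)

# Train a surrogate tree on the black box's *predictions*, not the true labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, black_box.predict(X_train))

# Fidelity: how often the tree agrees with the network on unseen data.
fidelity = (surrogate.predict(X_test) == black_box.predict(X_test)).mean()
print(f"surrogate fidelity: {fidelity:.2f}")
print(export_text(surrogate))
```

The printed tree can then be read directly as a set of if-then rules, which is the kind of explanation a stakeholder or auditor can actually follow.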

The issue of biased models mentioned above becomes a problem when there are no incentives to build an unbiased model. For most organizations the underlying goal is to make a profit, so if a new machine learning model leads to increased profits, it is rather unlikely that a decision maker in the organization will care whether the model is biased or not. Therefore, there has to be some incentive to ensure that these models are fair in their predictions. Probably the best way of creating such incentives is through new legislation that regulates and puts demands on the models that companies use.

Conclusions

Laws need to adapt to recent developments in technology in order to handle the consequences brought on by machine learning, and there need to be incentives that motivate organizations to develop ML technology with FAT in mind. While it's possible to make machine learning more explainable, some parts will always remain a mystery; it is, however, both possible and necessary to disclose how data is used in order to increase transparency. Moreover, organizations that build ML models have to ensure that their data sets are representative of the whole population and not biased.

References

  1. Karen Hao, This is how AI bias really happens—and why it’s so hard to fix, MIT Technology Review, 2019.

  2. Jeffrey Dastin, Amazon scraps secret AI recruiting tool that showed bias against women, Reuters, 2018.

  3. Osbert Bastani, Carolyn Kim, Hamsa Bastani, Interpretability via Model Extraction, FAT/ML 2017.

Samuel Tober
