Surfacing Differences in Practices When Building Fair Machine Learning Systems with Fairness Toolkits: An Empirical Study

Bachelor Thesis (2022)
Author(s)

E. Noritsyna (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

J. Yang – Mentor (TU Delft - Web Information Systems)

Ujwal Gadiraju – Mentor (TU Delft - Web Information Systems)

Agathe Balayn – Mentor (TU Delft - Web Information Systems)

F. Broz – Graduation committee member (TU Delft - Interactive Intelligence)

Publication Year
2022
Language
English
Copyright
© 2022 Eva Noritsyna
Graduation Date
24-06-2022
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Faculty
Electrical Engineering, Mathematics and Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The ability to identify and mitigate the various risks and harms of using Machine Learning models in industry is an essential task, specifically because these models may produce harmful outcomes for stakeholders, including unfair or discriminatory results. Consequently, there has been substantial research into the concepts of fairness and its metrics, bias and its mitigation, and algorithmic harms and their sources. Various toolkits have been created to guide practitioners in reflecting on these topics and to suggest algorithmic solutions for mitigating these risks. However, it is not yet known how widely these toolkits are used and how useful they are perceived to be. In this project, practitioners were interviewed to determine to what extent the envisioned practices of practitioners without experience with fairness toolkits differ from those of practitioners with such experience. The two toolkits considered were IBM AI Fairness 360 and Microsoft Fairlearn. The data collected from the interviews suggests that there may be fewer differences between the practices of practitioners with and without toolkit experience than between those with and without training or work roles in ethics and fairness in ML. This suggests that experience with a toolkit itself is not indicative of a more thorough approach to identifying and mitigating harms when building fair Machine Learning systems.
