On developers’ practices for hazard diagnosis in machine learning systems

Abstract

Machine learning (ML) is an artificial intelligence technology with great potential for adoption across many sectors of activity. Yet, it is now also increasingly recognized as a hazardous technology. Failures in the outputs of an ML system can cause physical or social harm. Moreover, the development and deployment of an ML system are themselves argued to be harmful in certain contexts.

Surprisingly, these hazards persist in applications where ML technology has been deployed, despite the growing body of research produced by the ML research community. In this thesis, we take on the challenges of understanding why hazardous failures in a system’s outputs, and hazardous development and deployment processes, persist in practice, and of developing solutions to better diagnose these hazardous failures (especially in the system’s outputs). To that end, we investigate the nature of the potential gap between research and the practices of the developers who build and deploy these systems. We survey major related ML research directions, surface developers’ practices and challenges, and search for types of (mis)alignment between theory and practice. Among other findings, we identify a lack of technical support for ML developers to identify the potential failures of their systems. We therefore tackle the development and evaluation of a human-in-the-loop, explainability-based failure diagnosis method and user interface for computer vision systems...