Faulty or Ready? Handling Failures in Deep-Learning Computer Vision Models until Deployment

A Study of Practices, Challenges, and Needs

Conference Paper (2023)
Author(s)

A.M.A. Balayn (TU Delft - Web Information Systems)

N. Rikalo (TU Delft - Human-Centred Artificial Intelligence)

J. Yang (TU Delft - Web Information Systems)

Alessandro Bozzon (TU Delft - Human-Centred Artificial Intelligence)

Research Group
Web Information Systems
Copyright
© 2023 A.M.A. Balayn, N. Rikalo, J. Yang, A. Bozzon
DOI related publication
https://doi.org/10.1145/3544548.3581555
Publication Year
2023
Language
English
ISBN (print)
978-1-4503-9421-5
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Handling failures in computer vision systems that rely on deep learning models remains a challenge. While an increasing number of methods for bug identification and correction have been proposed, little is known about how practitioners actually search for failures in these models. We perform an empirical study to understand the goals and needs of practitioners, the workflows and artifacts they use, and the challenges and limitations in their process. We interview 18 practitioners by probing them with a carefully crafted failure-handling scenario. We observe that there is a great diversity of failure-handling workflows in which collaboration is often necessary, that practitioners overlook certain types of failures and bugs, and that they generally do not rely on potentially relevant approaches and tools originally stemming from research. These insights allow us to draw up a list of research opportunities, such as creating a library of best practices and more representative formalisations of practitioners' goals, developing interfaces to exploit failure-handling artifacts, and providing specialized training.