Faulty or Ready? Handling Failures in Deep-Learning Computer Vision Models until Deployment

A Study of Practices, Challenges, and Needs

Conference Paper (2023)

Author(s)

A.M.A. Balayn (TU Delft - Web Information Systems)

N. Rikalo (TU Delft - Human-Centred Artificial Intelligence)

J. Yang (TU Delft - Web Information Systems)

A. Bozzon (TU Delft - Human-Centred Artificial Intelligence)

Research Group

Web Information Systems

Copyright

DOI related publication

https://doi.org/10.1145/3544548.3581555

Debugging, Practices, Explainability, Machine learning testing

To reference this document use:

https://resolver.tudelft.nl/uuid:2a445dad-39b5-4fa1-b0b6-e2541e68aa70

Publication Year

2023

Language

English

ISBN (print)

978-1-4503-9421-5

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Handling failures in computer vision systems that rely on deep learning models remains a challenge. While an increasing number of methods for bug identification and correction are being proposed, little is known about how practitioners actually search for failures in these models. We perform an empirical study to understand the goals and needs of practitioners, the workflows and artifacts they use, and the challenges and limitations in their process. We interview 18 practitioners by probing them with a carefully crafted failure-handling scenario. We observe that there is a great diversity of failure-handling workflows in which cooperation is often necessary, that practitioners overlook certain types of failures and bugs, and that they generally do not rely on potentially relevant approaches and tools originally stemming from research. These insights allow us to draw up a list of research opportunities, such as creating a library of best practices and more representative formalisations of practitioners' goals, developing interfaces to exploit failure-handling artifacts, and providing specialized training.
