A Human-In-the-Loop Framework to Assess Multimodal Machine Learning Models
D. Chen (TU Delft - Electrical Engineering, Mathematics and Computer Science)
J. Yang – Mentor (TU Delft - Web Information Systems)
Geert Jan Houben – Graduation committee member (TU Delft - Web Information Systems)
G. Guohao – Graduation committee member (TU Delft - Embedded Systems)
Andrea Tocchetti – Mentor
Abstract
Recent works explain DNN models for image classification by following the "attribution, human-in-the-loop, extraction" workflow. However, little work has investigated such an approach for explaining DNN models on language or multimodal tasks. To address this gap, we propose a framework that explains and assesses models which use both categorical/numerical features and text, while optimizing the "attribution, human-in-the-loop, extraction" workflow. In particular, our framework accounts for limited human resources, especially when domain experts are required for the human-in-the-loop tasks: it provides insight into which subset of the data the human-in-the-loop tasks should target. We share the results of applying this framework to a multimodal transformer that performs a text classification task for compliance detection in the financial domain.
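
To make the kind of workflow described above concrete, the sketch below (plain PyTorch, not the thesis implementation) shows how gradient-times-input attributions can be computed over both text tokens and tabular features of a toy multimodal classifier, and how such attributions could be used to decide which examples are sent to scarce domain experts for human-in-the-loop review. The model, feature shapes, and the ambiguity heuristic are hypothetical and serve only as an illustration of the pipeline the framework targets.

# Minimal, illustrative sketch (not the thesis implementation) of the
# "attribution, human-in-the-loop, extraction" idea for a model that mixes
# text with categorical/numerical features. The model and the selection
# heuristic below are hypothetical.
import torch
import torch.nn as nn

class ToyMultimodalClassifier(nn.Module):
    """Tiny classifier over averaged token embeddings plus tabular features."""
    def __init__(self, vocab_size=1000, emb_dim=32, n_tabular=4, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.head = nn.Linear(emb_dim + n_tabular, n_classes)

    def forward(self, token_ids, tabular):
        text_repr = self.embed(token_ids).mean(dim=1)          # (B, emb_dim)
        return self.head(torch.cat([text_repr, tabular], dim=1))

def gradient_x_input_attribution(model, token_ids, tabular, target_class):
    """Gradient-times-input attributions for text tokens and tabular features."""
    token_emb = model.embed(token_ids).detach().requires_grad_(True)
    tabular = tabular.detach().requires_grad_(True)
    text_repr = token_emb.mean(dim=1)
    logits = model.head(torch.cat([text_repr, tabular], dim=1))
    logits[:, target_class].sum().backward()
    token_attr = (token_emb.grad * token_emb).sum(dim=-1)       # per-token score
    tabular_attr = tabular.grad * tabular                       # per-feature score
    return token_attr, tabular_attr

# Usage: rank examples by how spread out their token attributions are and send
# the most ambiguous ones to domain experts for human-in-the-loop review.
if __name__ == "__main__":
    torch.manual_seed(0)
    model = ToyMultimodalClassifier()
    token_ids = torch.randint(0, 1000, (8, 16))   # 8 documents, 16 tokens each
    tabular = torch.randn(8, 4)                   # 4 categorical/numerical features
    token_attr, _ = gradient_x_input_attribution(model, token_ids, tabular, target_class=1)
    ambiguity = token_attr.abs().std(dim=1)       # hypothetical selection score
    review_order = torch.argsort(ambiguity, descending=True)
    print("Send to expert review in this order:", review_order.tolist())

In this sketch the attribution step stands in for whichever attribution method the framework pairs with the model, and the ranking step stands in for the framework's guidance on where human-in-the-loop effort is best spent.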