A Human-In-the-Loop Framework to Assess Multimodal Machine Learning Models
D. Chen (TU Delft - Electrical Engineering, Mathematics and Computer Science)
J. Yang – Mentor (TU Delft - Web Information Systems)
Geert Jan Houben – Graduation committee member (TU Delft - Web Information Systems)
G. Guohao – Graduation committee member (TU Delft - Embedded Systems)
Andrea Tocchetti – Mentor
Abstract
Recent works explain DNN models for image classification by following the "attribution, human-in-the-loop, extraction" workflow. However, little work has investigated such an approach for explaining DNN models on language or multimodal tasks. To address this gap, we propose a framework that explains and assesses models which use both categorical/numerical features and text, while optimizing the "attribution, human-in-the-loop, extraction" workflow. In particular, our framework accounts for limited human resources, especially when domain experts are required for the human-in-the-loop tasks: it provides insight into which subset of the data the human-in-the-loop tasks should target. We share the results of applying this framework to a multimodal transformer that performs a text classification task for compliance detection in the financial domain.
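
To make the kind of workflow described above concrete, the sketch below (plain PyTorch, not the thesis implementation) shows how gradient-times-input attributions can be computed over both text tokens and tabular features of a toy multimodal classifier, and how such attributions could be used to decide which examples are sent to scarce domain experts for human-in-the-loop review. The model, feature shapes, and the ambiguity heuristic are hypothetical and serve only as an illustration of the pipeline the framework targets.

# Minimal, illustrative sketch (not the thesis implementation) of the
# "attribution, human-in-the-loop, extraction" idea for a model that mixes
# text with categorical/numerical features. The model and the selection
# heuristic below are hypothetical.
import torch
import torch.nn as nn

class ToyMultimodalClassifier(nn.Module):
    """Tiny classifier over averaged token embeddings plus tabular features."""
    def __init__(self, vocab_size=1000, emb_dim=32, n_tabular=4, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.head = nn.Linear(emb_dim + n_tabular, n_classes)

    def forward(self, token_ids, tabular):
        text_repr = self.embed(token_ids).mean(dim=1)          # (B, emb_dim)
        return self.head(torch.cat([text_repr, tabular], dim=1))

def gradient_x_input_attribution(model, token_ids, tabular, target_class):
    """Gradient-times-input attributions for text tokens and tabular features."""
    token_emb = model.embed(token_ids).detach().requires_grad_(True)
    tabular = tabular.detach().requires_grad_(True)
    text_repr = token_emb.mean(dim=1)
    logits = model.head(torch.cat([text_repr, tabular], dim=1))
    logits[:, target_class].sum().backward()
    token_attr = (token_emb.grad * token_emb).sum(dim=-1)       # per-token score
    tabular_attr = tabular.grad * tabular                       # per-feature score
    return token_attr, tabular_attr

# Usage: rank examples by how spread out their token attributions are and send
# the most ambiguous ones to domain experts for human-in-the-loop review.
if __name__ == "__main__":
    torch.manual_seed(0)
    model = ToyMultimodalClassifier()
    token_ids = torch.randint(0, 1000, (8, 16))   # 8 documents, 16 tokens each
    tabular = torch.randn(8, 4)                   # 4 categorical/numerical features
    token_attr, _ = gradient_x_input_attribution(model, token_ids, tabular, target_class=1)
    ambiguity = token_attr.abs().std(dim=1)       # hypothetical selection score
    review_order = torch.argsort(ambiguity, descending=True)
    print("Send to expert review in this order:", review_order.tolist())

In this sketch the attribution step stands in for whichever attribution method the framework pairs with the model, and the ranking step stands in for the framework's guidance on where human-in-the-loop effort is best spent.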