D. Chen

Please Note

<p>This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged to a profile.</p>

2 records found

A Human-In-the-Loop Framework to Assess Multimodal Machine Learning Models

Master thesis (2022) - D. Chen, J. Yang, G.J.P.M. Houben, G. Lan, A. Tocchetti

Recent work explains DNN models for image classification by following the "attribution, human-in-the-loop, extraction" workflow. However, little work has examined this approach for explaining DNN models for language or multimodal tasks. To address this gap, we propose a framework that explains and assesses the model using both the categorical/numerical features and the text, while optimizing the "attribution, human-in-the-loop, extraction" workflow. In particular, our framework deals with limited human resources, especially when domain experts are required for human-in-the-loop tasks. It provides insight into which subset of the data the human-in-the-loop tasks should be applied to. We share the results of applying this framework to a multimodal transformer that performs text classification for compliance detection in a financial context.

...

Using Skip-Gram Model to Predict from which Show a Given Line is

Bachelor thesis (2020) - Dina Chen, T.J. Viering, A. Naseri Jahfari, S. Makrodimitris

Text classification has a wide range of uses, such as extracting the sentiment of a product review, identifying the topic of a document, and detecting spam. In this research, the text classification task is to predict which TV show a given line comes from. The skip-gram model, originally used to train the Word2Vec word embeddings [Mikolov et al., 2013], is adapted to determine the likelihood that a sentence occurs in a TV show. Based on this feature, a classifier is built to perform the task. Cross-validation shows that it reaches an accuracy of 58% on the transcript data of 3 shows and 43% on 4 shows, whereas random guessing would be expected to yield 33% and 25%, respectively. The difference between the neural networks and the skip-gram model becomes smaller as more shows are added to the evaluation. Across the 5-fold cross-validations of the two models, the best results appear in the middle iterations. ...