Annotation Practices in Societally Impactful Machine Learning Applications

What are these automated systems actually trained on?

Bachelor Thesis (2025)
Author(s)

D. Košutić (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Andrew Demetriou – Mentor (TU Delft - Multimedia Computing)

Cynthia C.S. Liem – Mentor (TU Delft - Multimedia Computing)

J. Yang – Graduation committee member (TU Delft - Web Information Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
25-06-2025
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The output of machine learning (ML) models can only be as good as the data fed into them. When building datasets for ML models, it is therefore important to ensure the quality of the data. This is especially true of human-labeled data, whose quality can be hard to standardize and assess. To assess annotation practices for human-labeled data in the field of machine learning, this paper investigates the datasets used in the most highly cited papers of the AAAI Conference on Artificial Intelligence, an influential machine learning conference. Datasets were extracted from 75 papers in three overlapping publication periods, and the top 20 datasets from each period were evaluated. The results show that the majority of datasets either do not use or underreport significant annotation practices, specifically those concerning the annotators and the annotation process. This raises concern for the conference and for the field more broadly, as the most influential papers build their machine learning algorithms on data that is quite possibly of low quality. However, there is some hope for the field, as the more recent papers use datasets with better annotation practices.

Files

RP_Report-7.pdf
(pdf | 0.966 Mb)
License info not available