Annotation Practices in Societally Impactful Machine Learning Applications
What are these automated systems actually trained on?
S. Lupșa (TU Delft - Electrical Engineering, Mathematics and Computer Science)
A.M. Demetriou – Mentor (TU Delft - Multimedia Computing)
Cynthia C. S. Liem – Mentor (TU Delft - Multimedia Computing)
J. Yang – Graduation committee member (TU Delft - Web Information Systems)
Abstract
This study examines dataset annotation practices in influential NeurIPS research. Datasets employed in highly cited NeurIPS papers were assessed against criteria concerning their item population, labelling schema, and annotation process. While high-level information, such as the involvement of human labellers and the item population, is reported in most cases, procedural details of the annotation process are poorly documented. Notably, 48% of datasets lack details on annotator training, 43% omit inter-rater reliability, and 28% are not publicly accessible. Temporal comparisons show minor improvements, but no substantial progress in reporting annotation methodology. A complementary analysis of 49 NeurIPS papers published since 2020 shows that researchers often discuss the broader impact of their work, yet do not include datasets or their annotations in these assessments. These findings highlight a lack of standardisation in annotation reporting and call for more robust practices that ensure transparency, auditability, and reproducibility in machine learning research.