A.M. Demetriou | TU Delft Repository

Annotation Practices in Societally Impactful Machine Learning Applications

What are these automated systems actually trained on?

Bachelor thesis (2025) - S. Lupșa (author), A.M. Demetriou (mentor), CCS Liem (mentor), Jie Yang (graduation committee member)

This study examines dataset annotation practices in influential NeurIPS research. Datasets employed in highly cited NeurIPS papers were assessed based on criteria concerning their item population, labelling schema, and annotation process. While high-level information, such as the ...

Annotation Practices in Societally Impactful Machine Learning Applications

What are these automated systems actually trained on?

Bachelor thesis (2025) - D. Košutić (author), A.M. Demetriou (mentor), CCS Liem (mentor), Jie Yang (graduation committee member)

The output of machine learning (ML) models can be only as good as the data that is fed into them. Because of this, when making datasets for creating ML models, it is important to ensure the quality of the data. This is especially true of human-labeled data, which can be hard to s ...

Dataset quality within a societally impactful machine learning domain

An overview of data collection and annotation practices of the datasets used by papers published by the ACL

Bachelor thesis (2025) - A. Fazakas (author), CCS Liem (mentor), Jie Yang (graduation committee member), A.M. Demetriou (mentor)

This study gives an overview of the data collection and annotation practices of the datasets used by the most impactful papers published by the Association for Computational Linguistics (ACL). This was achieved by selecting the most highly cited papers published within the ACL ant ...

Benchmark Blindspots: A systematic audit of documentation decay in TPAMI’s datasets

Bachelor thesis (2025) - A. Despan (author), A.M. Demetriou (mentor), CCS Liem (mentor), Jie Yang (graduation committee member)

High-impact vision research still rests on datasets whose labels arrive via opaque, rarely documented pipelines. To understand how serious the problem is inside a large venue, we audited 75 TPAMI papers (2009-2024) that rely on or introduce datasets. Each datase ...

Behind the Labels: Transparency Pitfalls in Annotation Practices for Societally Impactful ML

A deep dive into annotation transparency and consistency in the CVPR corpus

Bachelor thesis (2025) - C. Scorţia (author), A.M. Demetriou (mentor), CCS Liem (mentor), Jie Yang (graduation committee member)

This study investigates annotation and reporting practices in machine learning (ML) research, focusing on societally impactful applications presented at the IEEE/CVF Computer Vision and Pattern Recognition (CVPR) conferences. By structurally analyzing the 75 most-cited CVPR paper ...

Investigating Data Collection and Reporting Practices of Human Annotations in Societally Impactful Machine Learning Applications

A Systematic Review of Top-Cited IEEE Access Papers

Bachelor thesis (2023) - A. Ibrahim (author), CCS Liem (mentor), A.M. Demetriou (mentor), F. Broz (graduation committee member)

This systematic review investigates the practices and implications of human annotations in machine learning (ML) research. Analyzing a selection of 100 papers from the IEEE Access Journal, the study explores the data collection and reporting methods employed. The findings reveal ...

Annotation practices in affective computing

What are these algorithms actually trained on?

Bachelor thesis (2023) - S.J.M. Backer (author), CCS Liem (mentor), A.M. Demetriou (mentor), F. Broz (graduation committee member)

In the machine learning research community, significant importance is given to the optimization of techniques which are employed once a benchmark dataset is given. However, less importance is assigned to the quality of these datasets and to how these datasets are obtained. In thi ...

Annotation Practices in Societally Impactful Machine Learning Applications

What are the recommender systems models actually trained on?

Bachelor thesis (2023) - A.G. Sav (author), CCS Liem (mentor), A.M. Demetriou (mentor), F. Broz (graduation committee member)

Machine learning models are nowadays infused into all aspects of our lives. Perhaps one of their most common applications is recommender systems, as they facilitate users' decision-making processes in various scenarios (e.g., e-commerce, social media, news, online learning, et ...

Annotation Practices in Machine Learning Research On Depression

Bachelor thesis (2023) - A. Andrasz (author), CCS Liem (mentor), A.M. Demetriou (mentor), F. Broz (graduation committee member)

Depression diagnosis and treatment remain difficult tasks that could be improved with machine learning models. However, such automatic systems must be reliable before they can be applied in clinical psychology settings. Performing predictions in this field is most commonly done using supervised lear ...

A Quest through Interconnected Datasets: Research on Annotation Practices in Highly Cited Audio Machine Learning Work and Their Utilized Datasets

Annotation Practices in Datasets Utilized by The International Conference on Acoustics, Speech, and Signal Processing (ICASSP) Conferences: A Transparency Analysis

Bachelor thesis (2023) - D. Taşcılar (author), CCS Liem (mentor), A.M. Demetriou (mentor), F. Broz (graduation committee member)

This research examines transparency between ICASSP conference papers and the dataset documentation regarding the datasets' annotation practices. The 5 top-cited papers and 51 unique resources in total were considered. All of the selected papers utilized at least one dataset. For ev ...