A.M.A. Balayn | TU Delft Repository

Towards Effective Human Intervention in Algorithmic Decision-Making

Understanding the Effect of Decision-Makers' Configuration on Decision-Subjects' Fairness Perceptions

Conference paper (2025) - M. Yurrita Semperena (author), Himanshu Verma (author), A.M.A. Balayn (author), Ujwal Gadiraju (author), S. C. Pont (author), A Bozzon (author)

Human intervention is claimed to safeguard decision-subjects’ rights in algorithmic decision-making and contribute to their fairness perceptions. However, how decision-subjects perceive hybrid decision-maker configurations (i.e., combining humans and algorithms) is unclear. We ad ...

A.I. Robustness

A Human-Centered Perspective on Technological Challenges and Opportunities

Journal article (2025) - Andrea Tocchetti (author), Lorenzo Corti (author), A.M.A. Balayn (author), M. Yurrita Semperena (author), Philip Lippmann (author), Marco Brambilla (author), Jie Yang (author)

Despite the impressive performance of Artificial Intelligence (AI) systems, their robustness remains elusive and constitutes a key issue that impedes large-scale adoption. Besides, robustness is interpreted differently across domains and contexts of AI. In this work, we systemati ...

“It Is a Moving Process”

Understanding the Evolution of Explainability Needs of Clinicians in Pulmonary Medicine

Conference paper (2024) - Lorenzo Corti (author), Rembrandt Oltmans (author), Jiwon Jung (author), A.M.A. Balayn (author), Marlies S. Wijsenbeek (author), J Yang (author)

Clinicians increasingly pay attention to Artificial Intelligence (AI) to improve the quality and timeliness of their services. There are converging opinions on the need for Explainable AI (XAI) in healthcare. However, prior work considers explanations as stationary entities with ...

Explainability in AI Policies

A Critical Review of Communications, Reports, Regulations, and Standards in the EU, US, and UK

Conference paper (2023) - Luca Nannini (author), A.M.A. Balayn (author), Adam Leon Smith (author)

Public attention towards explainability of artificial intelligence (AI) systems has been rising in recent years to offer methodologies for human oversight. This has translated into the proliferation of research outputs, such as from Explainable AI, to enhance transparency and con ...

On developers’ practices for hazard diagnosis in machine learning systems

Doctoral thesis (2023) - A.M.A. Balayn (author), Geert Jan Houben (promotor), A Bozzon (promotor)

Machine learning (ML) is an artificial intelligence technology that has a great potential for being adopted in various sectors of activities. Yet, it is now also increasingly recognized as a hazardous technology. Failures in the outputs of an ML system might cause physical or soc ...

“☑ Fairness Toolkits, A Checkbox Culture?” On the Factors that Fragment Developer Practices in Handling Algorithmic Harms

Conference paper (2023) - A.M.A. Balayn (author), M. Yurrita Semperena (author), J. Yang (author), Ujwal Gadiraju (author)

Fairness toolkits are developed to support machine learning (ML) practitioners in using algorithmic fairness metrics and mitigation methods. Past studies have investigated practical challenges for toolkit usage, which are crucial to understanding how to support practitioners. How ...

Hear Me Out

A Study on the Use of the Voice Modality for Crowdsourced Relevance Assessments

Conference paper (2023) - N. Roy (author), A.M.A. Balayn (author), D.M. Maxwell (author), C Hauff (author)

The creation of relevance assessments by human assessors (often nowadays crowdworkers) is a vital step when building IR test collections. Prior works have investigated assessor quality & behaviour, and tooling to support assessors in their task. We have few insights though in ...

Perspective

Leveraging Human Understanding for Identifying and Characterizing Image Atypicality

Conference paper (2023) - S. Sharifi Noorian (author), Sihang Qui (author), Burcu Sayin (author), A.M.A. Balayn (author), Ujwal Gadiraju (author), J Yang (author), Alessandro Bozzon (author)

High-quality data plays a vital role in developing reliable image classification models. Despite that, what makes an image difficult to classify remains an unstudied topic. This paper provides a first-of-its-kind, model-agnostic characterization of image atypicality based on huma ...

Faulty or Ready? Handling Failures in Deep-Learning Computer Vision Models until Deployment

A Study of Practices, Challenges, and Needs

Conference paper (2023) - A.M.A. Balayn (author), N. Rikalo (author), J. Yang (author), Alessandro Bozzon (author)

Handling failures in computer vision systems that rely on deep learning models remains a challenge. While an increasing number of methods for bug identification and correction are proposed, little is known about how practitioners actually search for failures in these models. We p ...

Disentangling Fairness Perceptions in Algorithmic Decision-Making

The Effects of Explanations, Human Oversight, and Contestability

Conference paper (2023) - M. Yurrita Semperena (author), Tim Draws (author), A.M.A. Balayn (author), D.S. Murray-Rust (author), Nava Tintarev (author), A Bozzon (author)

Recent research claims that information cues and system attributes of algorithmic decision-making processes affect decision subjects' fairness perceptions. However, little is still known about how these factors interact. This paper presents a user study (N = 267) investigating th ...

Towards a multi-stakeholder value-based assessment framework for algorithmic systems

Conference paper (2022) - M. Yurrita Semperena (author), D.S. Murray-Rust (author), A.M.A. Balayn (author), Alessandro Bozzon (author)

In an effort to regulate Machine Learning-driven (ML) systems, current auditing processes mostly focus on detecting harmful algorithmic biases. While these strategies have proven to be impactful, some values outlined in documents dealing with ethics in ML-driven systems are still ...

It Is Like Finding a Polar Bear in the Savannah! Concept-level AI Explanations with Analogical Inference from Commonsense Knowledge

Conference paper (2022) - G. He (author), A.M.A. Balayn (author), S.N.R. Buijsman (author), Jie Yang (author), Ujwal Gadiraju (author)

With recent advances in explainable artificial intelligence (XAI), researchers have started to pay attention to concept-level explanations, which explain model predictions with a high level of abstraction. However, such explanations may be difficult to digest for laypeople due to ...

Ready Player One!

Eliciting Diverse Knowledge Using A Configurable Game

Conference paper (2022) - A.M.A. Balayn (author), G. He (author), Andrea Hu (author), J. Yang (author), Ujwal Gadiraju (author)

Access to commonsense knowledge is receiving renewed interest for developing neuro-symbolic AI systems, or debugging deep learning models. Little is currently understood about the types of knowledge that can be gathered using existing knowledge elicitation methods. Moreover, thes ...

How can Explainability Methods be Used to Support Bug Identification in Computer Vision Models?

Conference paper (2022) - A.M.A. Balayn (author), N. Rikalo (author), Christoph Lofi (author), Jie Yang (author), A Bozzon (author)

Deep learning models for image classification suffer from dangerous issues often discovered after deployment. The process of identifying bugs that cause these issues remains limited and understudied. Especially, explainability methods are often presented as obvious tools for bug ...

Automatic Identification of Harmful, Aggressive, Abusive, and Offensive Language on the Web

A Survey of Technical Biases Informed by Psychology Literature

Journal article (2021) - A.M.A. Balayn (author), J. Yang (author), Zoltán Szlávik (author), Alessandro Bozzon (author)

The automatic detection of conflictual languages (harmful, aggressive, abusive, and offensive languages) is essential to provide a healthy conversation environment on the Web. To design and develop detection systems that are capable of achieving satisfactory performance, a thorou ...

Managing bias and unfairness in data for decision support: a survey of machine learning and data engineering approaches to identify and mitigate bias and unfairness within data management and analytics systems

Journal article (2021) - A.M.A. Balayn (author), Christoph Lofi (author), Geert Jan Houben (author)

The increasing use of data-driven decision support systems in industry and governments is accompanied by the discovery of a plethora of bias and unfairness issues in the outputs of these systems. Multiple computer science communities, and especially machine learning, have started ...

What do You Mean? Interpreting Image Classification with Crowdsourced Concept Extraction and Analysis

Conference paper (2021) - A.M.A. Balayn (author), Panagiotis Soilis (author), Christoph Lofi (author), J. Yang (author), A Bozzon (author)

Global interpretability is a vital requirement for image classification applications. Existing interpretability methods mainly explain a model's behavior by identifying salient image patches, which require manual efforts from users to make sense of, and also do not typically suppor ...

Characterising and Mitigating Aggregation-Bias in Crowdsourced Toxicity Annotations

Conference paper (2018) - A.M.A. Balayn (author), P. Mavridis (author), A Bozzon (author), Benjamin Timmermans (author), Zoltán Szlávik (author)

Training machine learning (ML) models for natural language processing usually requires a large amount of data, often acquired through crowdsourcing. The way this data is collected and aggregated can have an effect on the outputs of the trained model such as ignoring the labels whic ...