"Are Compliments Bad Now?" Comparing LLMs and Human Interpretations of Gender Microaggressions in the Workplace

Conference Paper (2026)

Author(s)

Catalina Lagos Rojas (TU Delft - Industrial Design Engineering)

Hüseyin Ugur Genç (TU Delft - Industrial Design Engineering)

Alessandro Bozzon (TU Delft - Industrial Design Engineering)

Sara Colombo (TU Delft - Industrial Design Engineering)

Research Group

Knowledge and Intelligence Design

Gender, Large Language Model, Human-AI alignment, Microaggressions

DOI related publication

https://doi.org/10.1145/3772318.3790436 Final published version

To reference this document use

https://resolver.tudelft.nl/uuid:9c6472c6-0403-4fad-b658-cc5b2b8256d0

Publication Year

2026

Language

English

Article number

1519

Publisher

ACM

ISBN (electronic)

9798400722783

Event

2026 CHI Conference on Human Factors in Computing Systems, CHI 2026 (2026-04-13 - 2026-04-17), Barcelona, Spain

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Gender microaggressions are subtle yet persistent forms of discrimination in workplace interactions. While LLMs can detect them in written texts, it remains poorly understood how their interpretations align with or diverge from human perspectives and experiences. We present a mixed-method study comparing how LLMs and humans who differ in gender identity and lived experience interpret gender microaggressions in the workplace. Using short dialogues adapted from real-world accounts, we asked 141 participants to rate the likelihood that a scenario contains a microaggression and to provide a rationale for their answers. The same tasks were completed by 7 different LLMs. Our analysis reveals significant differences in how humans and LLMs interpret microaggressions, captured in both ratings and rationales, and, more interestingly, the effect of gender and lived experience on human interpretations. These findings highlight the need for systems that detect microaggressions to embrace interpretive plurality and to support reflection and awareness while accounting for ambiguity.

Files

3772318.3790436.pdf

(pdf | 1.28 Mb)