J. Yang | TU Delft Repository

Routing distilled knowledge via mixture of LoRA experts for large language model based bundle generation

Journal article (2026) - Kaidong Feng, Zhu Sun, Hui Fang, Jie Yang, Wenyuan Liu

Large Language Models (LLMs) have shown potential in automatic bundle generation but suffer from prohibitive computational costs. Although knowledge distillation offers a pathway to more efficient student models, our preliminary study reveals that naively integrating diverse types of distilled knowledge from teacher LLMs into student LLMs leads to knowledge conflict, negatively impacting the performance of bundle generation. To address this, we propose RouteDK, a framework for routing distilled knowledge through a mixture of Low-Rank Adaptation (LoRA) expert architecture. Specifically, we first distill two complementary types of knowledge from the teacher LLM: high-level knowledge (generalizable rules) and fine-grained knowledge (session-specific reasoning). We then train knowledge-specific LoRA experts for each type of knowledge together with a base LoRA expert. For effective integration, we propose a dynamic fusion module, featuring an input-aware router, where the router balances expert contributions by dynamically determining optimal weights based on input, thereby effectively mitigating knowledge conflicts. To further improve inference reliability, we design an inference-time enhancement module to reduce variance and mitigate suboptimal reasoning. Experiments on three public datasets show that our RouteDK achieves accuracy comparable to or even better than the teacher LLM, while maintaining strong computational efficiency. In addition, it outperforms state-of-the-art approaches for bundle generation. ...
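
To make the input-aware routing idea in the abstract above more concrete, here is a minimal sketch of mixing several LoRA experts with weights computed from the input. It is not the RouteDK implementation: the linear router, the softmax weighting, the helper names (`softmax`, `matvec`, `lora_delta`, `fuse_experts`), and the toy dimensions are all assumptions chosen for illustration.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vec):
    # Plain matrix-vector product; matrix is a list of rows.
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def lora_delta(A, B, x):
    # Low-rank update B @ (A @ x); A is (r x d_in), B is (d_out x r).
    return matvec(B, matvec(A, x))

def fuse_experts(x, experts, router):
    # router is a hypothetical (n_experts x d_in) linear layer (an assumption):
    # its softmax output gives one input-dependent weight per expert.
    weights = softmax(matvec(router, x))
    deltas = [lora_delta(A, B, x) for A, B in experts]
    fused = [
        sum(w * d[i] for w, d in zip(weights, deltas))
        for i in range(len(deltas[0]))
    ]
    return weights, fused
```

With a router of all zeros, every expert receives the same weight; a trained router would instead shift weight toward whichever expert suits the input.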

"label from Somewhere"

Reflexive Annotating for Situated AI Alignment

Conference paper (2026) - Anne Arzberger, Céline Offerman, Ujwal Gadiraju, Alessandro Bozzon, Jie Yang

AI alignment relies on annotator judgments, yet annotation pipelines often treat annotators as interchangeable, obscuring how their social position shapes annotation. We introduce reflexive annotating as a probe that invites crowd workers to reflect on how their positionality informs subjective annotation judgments in a language model alignment context. Through a qualitative study with crowd workers (N = 30), including follow-up interviews (N = 5), we examine how our probe shapes annotators' behaviour, experience, and the situated metadata it elicits. We find that reflexive annotating captures epistemic metadata beyond static demographics by eliciting intersectional reasoning, surfacing positional humility, and nudging viewpoint change. Crucially, we also note tensions between reflexive engagement and affective demands such as emotional exposure. We discuss the implications of our work for richer value elicitation and alignment practices that treat annotator judgments as situated and selectively integrate positional metadata. ...

Does Knowledge Distillation Matter for Large Language Model-Based Bundle Generation?

Journal article (2026) - Kaidong Feng, Zhu Sun, Jie Yang, Hui Fang, Xinghua Qu, Wenyuan Liu

Large Language Models (LLMs) have been extensively applied in various recommendation scenarios, including bundle generation, thanks to their exceptional reasoning capabilities and comprehensive knowledge. However, exploiting large-scale LLMs for bundle generation introduces significant efficiency challenges—primarily high computational costs during fine-tuning and inference due to their massive parameterization. Knowledge Distillation (KD) offers a promising solution by transferring expertise from large teacher models to more compact student models. This study systematically investigates KD approaches for bundle generation with the goal of minimizing computational demands while preserving performance. Specifically, we explore three critical research questions: (1) how does the format of distilled knowledge impact bundle generation performance? (2) to what extent does the quantity of distilled knowledge influence the performance? and (3) how do different ways of utilizing the distilled knowledge affect the performance? To support this investigation, we propose a comprehensive KD framework that (i) progressively extracts knowledge from raw data in increasingly complex forms, i.e., frequent patterns → formalized rules → deep thoughts; (ii) captures varying quantities of distilled knowledge through different sampling strategies, multi-domain accumulation, and multi-format aggregation; and (iii) exploits complementary LLM adaptation techniques—in-context learning, supervised fine-tuning, and their combination—to leverage the distilled knowledge for domain-specific adaptation and enhanced efficiency in small student models. Through extensive experiments on multiple real-world datasets, we provide valuable insights into how knowledge format, quantity, and utilization methods collectively shape the performance of LLM-based bundle generation, which exhibits the significant potential of KD for more efficient yet effective LLM-based bundle generation. ...

Co-Data

Cultivating Effective Human-LLM Collaboration for Collaborative Data Processing

Conference paper (2026) - Amedeo Pachera, Andrea Mauri, Kashif Imteyaz, Jie Yang, Eric Umuhoza, Angela Bonifati, Michal Lahav, Nitesh Goyal

Data work is increasingly collaborative and multidisciplinary, yet teams struggle with mismatched semantics, uneven data literacy, and variable trust in automation. Large Language Models (LLMs) now assist with cleaning, integration, annotation, and querying, but their role in mediating collaboration (aligning goals, translating vocabularies, coordinating decisions) remains underexplored. This workshop examines LLMs as collaborators in data-intensive workflows. We take Interdependence Theory as a starting lens to reason about dependence, mutual responsiveness, and shared outcomes in human-LLM interaction, while explicitly reasoning on its fit and considering alternative or complementary frameworks. Through interactive discussions and case-driven activities, we will surface core principles, design considerations, and evaluation criteria (e.g., trust, coordination, equity, transparency) for human-LLM collaboration. Expected outcomes include an initial conceptual framework, high-level guidance for practice, and a forward-looking research agenda to inform the design and assessment of collaborative, responsible LLM-enabled data workflows. ...

Preface

Foreword postscript (2026) - Himanshu Verma, Alessandro Bozzon, Andrea Mauri, Jie Yang

Co-Constructing Alignment

A Participatory Approach to Situating AI Values

Conference paper (2026) - Anne Arzberger, Enrico Liscio, Maria Luce Lupetti, Íñigo De Troya, Jie Yang

As AI systems become embedded in everyday practice, value misalignment has emerged as a pressing concern. Yet, dominant alignment approaches remain model-centric, treating users as passive recipients of pre-specified values rather than as epistemic agents who encounter and respond to misalignment during interactions. Drawing on situated perspectives, we frame alignment as an interactional practice co-constructed during human-AI interaction. We investigate how users understand and wish to contribute to this process through a participatory workshop that combines misalignment diaries with generative design activities. We surface how misalignments materialise in practice and how users envision acting on them, grounded in the context of researchers using Large Language Models as research assistants. Our findings show that misalignments are experienced less as abstract ethical violations than as unexpected responses, and task or social breakdowns. Participants articulated roles ranging from adjusting and interpreting model behaviour to deliberate non-engagement as an alignment strategy. We conclude with implications for system design that supports alignment as ongoing, situated, and shared practice. ...

Rethinking and recomputing the value of machine learning models

Journal article (2025) - Burcu Sayin, Jie Yang, Xinyue Chen, Andrea Passerini, Fabio Casati

In this paper, we argue that the prevailing approach to training and evaluating machine learning models often fails to consider their real-world application within organizational or societal contexts, where they are intended to create beneficial value for people. We propose a shift in perspective, redefining model assessment and selection to emphasize integration into workflows that combine machine predictions with human expertise, particularly in scenarios requiring human intervention for low-confidence predictions. Traditional metrics like accuracy and f-score fail to capture the beneficial value of models in such hybrid settings. To address this, we introduce a simple yet theoretically sound “value” metric that incorporates task-specific costs for correct predictions, errors, and rejections, offering a practical framework for real-world evaluation. Through extensive experiments, we show that existing metrics fail to capture real-world needs, often leading to suboptimal choices in terms of value when used to rank classifiers. Furthermore, we emphasize the critical role of calibration in determining model value, showing that simple, well-calibrated models can often outperform more complex models that are challenging to calibrate. ...
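
The abstract above describes a "value" metric built from task-specific costs for correct predictions, errors, and rejections. The sketch below shows one simple way such a metric could be computed. The exact formula from the paper is not given here, so the per-outcome values, the confidence-threshold rejection rule, and the function name `model_value` are all assumptions for illustration.

```python
def model_value(predictions, threshold, v_correct=1.0, v_error=-5.0, v_reject=-0.5):
    """Average value per item for a classifier with a reject option.

    predictions: list of (confidence, is_correct) pairs.
    Items with confidence below `threshold` are rejected (deferred to a human).
    The default values are hypothetical and would be set per task.
    """
    total = 0.0
    for confidence, is_correct in predictions:
        if confidence < threshold:
            total += v_reject      # cost of handing the item to a human
        elif is_correct:
            total += v_correct     # benefit of an accepted correct prediction
        else:
            total += v_error       # cost of an accepted wrong prediction
    return total / len(predictions)
```

When errors are costly relative to rejections, raising the threshold can increase value even though fewer items are handled automatically, which is also why calibrated confidences matter for this kind of evaluation.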

Context-Informed Machine Translation of Manga using Multimodal Large Language Models

Conference paper (2025) - Philip Lippmann, Konrad Skublicki, Joshua Tanner, Shonosuke Ishiwatari, Jie Yang

Due to the significant time and effort required for handcrafting translations, most manga never leave the domestic Japanese market. Automatic manga translation is a promising potential solution. However, it is a budding and underdeveloped field and presents complexities even greater than those found in standard translation due to the need to effectively incorporate visual elements into the translation process to resolve ambiguities. In this work, we investigate to what extent multimodal large language models (LLMs) can provide effective manga translation, thereby assisting manga authors and publishers in reaching wider audiences. Specifically, we propose a methodology that leverages the vision component of multimodal LLMs to improve translation quality, evaluate the impact of translation unit size and context length, and propose a token-efficient approach for manga translation. Moreover, we introduce a new evaluation dataset – the first parallel Japanese-Polish manga translation dataset – as part of a benchmark to be used in future research. Finally, we contribute an open-source software suite, enabling others to benchmark LLMs for manga translation. Our findings demonstrate that our proposed methods achieve state-of-the-art results for Japanese-English translation and set a new standard for Japanese-Polish. ...

A.I. Robustness

A Human-Centered Perspective on Technological Challenges and Opportunities

Journal article (2025) - Andrea Tocchetti, Lorenzo Corti, Agathe Balayn, Mireia Yurrita, Philip Lippmann, Marco Brambilla, Jie Yang

Despite the impressive performance of Artificial Intelligence (AI) systems, their robustness remains elusive and constitutes a key issue that impedes large-scale adoption. Besides, robustness is interpreted differently across domains and contexts of AI. In this work, we systematically survey recent progress to provide a reconciled terminology of concepts around AI robustness. We introduce three taxonomies to organize and describe the literature both from a fundamental and applied point of view: (1) methods and approaches that address robustness in different phases of the machine learning pipeline; (2) methods improving robustness in specific model architectures, tasks, and systems; and in addition, (3) methodologies and insights around evaluating the robustness of AI systems, particularly the tradeoffs with other trustworthiness properties. Finally, we identify and discuss research gaps and opportunities and give an outlook on the field. We highlight the central role of humans in evaluating and enhancing AI robustness, considering the necessary knowledge they can provide, and discuss the need for better understanding practices and developing supportive tools in the future. ...

Robust Link Prediction over Noisy Hyper-Relational Knowledge Graphs via Active Learning

Conference paper (2024) - Weijian Yu, Jie Yang, Dingqi Yang

Modern Knowledge Graphs (KGs) are inevitably noisy due to the nature of their construction process. Existing robust learning techniques for noisy KGs mostly focus on triple facts, where the fact-wise confidence is straightforward to evaluate. However, hyper-relational facts, where an arbitrary number of key-value pairs are associated with a base triplet, have become increasingly popular in modern KGs, but significantly complicate the confidence assessment of the fact. Against this background, we study the problem of robust link prediction over noisy hyper-relational KGs, and propose NYLON, a Noise-resistant hYper-reLatiONal link prediction technique via active crowd learning. Specifically, beyond the traditional fact-wise confidence, we first introduce element-wise confidence measuring the fine-grained confidence of each entity or relation of a hyper-relational fact. We connect the element- and fact-wise confidences via a “least confidence” principle to allow efficient crowd labeling. NYLON is then designed to systematically integrate three key components, where a hyper-relational link predictor uses the fact-wise confidence for robust prediction, a cross-grained confidence evaluator predicts both element- and fact-wise confidences, and an effort-efficient active labeler selects informative facts for crowd annotators to label using an efficient labeling mechanism guided by the element-wise confidence under the “least confidence” principle and further followed by data augmentation. We evaluate NYLON on three real-world KG datasets against a sizeable collection of baselines. Results show that NYLON achieves superior and robust performance in both link prediction and error detection tasks on noisy KGs, and outperforms best baselines by 2.42-10.93% and 3.46-10.65% in the two tasks, respectively. ...
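
One plausible reading of the "least confidence" principle mentioned above is that a fact is only as trustworthy as its least trusted element. The sketch below follows that reading; it is not NYLON's actual mechanism. The use of `min`, the lowest-confidence-first selection rule, the dictionary layout, and the function names are all assumptions for illustration.

```python
def fact_confidence(element_confidences):
    # Assumed reading of "least confidence": the fact-wise confidence is
    # the minimum over the confidences of its entities and relations.
    return min(element_confidences.values())

def select_for_labeling(facts, budget):
    """Pick up to `budget` facts to send to crowd annotators.

    facts: list of dicts with an "id" and an "elements" mapping from
    element name to confidence (a hypothetical layout).
    Returns (fact id, weakest element) pairs, least confident fact first.
    """
    ranked = sorted(facts, key=lambda f: fact_confidence(f["elements"]))
    picks = []
    for fact in ranked[:budget]:
        weakest = min(fact["elements"], key=fact["elements"].get)
        picks.append((fact["id"], weakest))
    return picks
```

Pointing annotators at the single weakest element of a fact, rather than asking them to verify every element, is one way an element-wise score could reduce labeling effort.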

“It Is a Moving Process”

Understanding the Evolution of Explainability Needs of Clinicians in Pulmonary Medicine

Conference paper (2024) - Lorenzo Corti, Rembrandt Oltmans, Jiwon Jung, Agathe Balayn, Marlies Wijsenbeek, Jie Yang

Clinicians increasingly pay attention to Artificial Intelligence (AI) to improve the quality and timeliness of their services. There are converging opinions on the need for Explainable AI (XAI) in healthcare. However, prior work considers explanations as stationary entities with no account for the temporal dynamics of patient care. In this work, we involve 16 Idiopathic Pulmonary Fibrosis (IPF) clinicians from a European university medical centre and investigate their evolving uses and purposes for explainability throughout patient care. By applying a patient journey map for IPF, we elucidate clinicians' informational needs, how human agency and patient-specific conditions can influence the interaction with XAI systems, and the content, delivery, and relevance of explanations over time. We discuss implications for integrating XAI in clinical contexts and more broadly how explainability is defined and evaluated. Furthermore, we reflect on the role of medical education in addressing epistemic challenges related to AI literacy. ...

MRHF

Multi-stage Retrieval and Hierarchical Fusion for Textbook Question Answering

Conference paper (2024) - Peide Zhu, Zhen Wang, Manabu Okumura, Jie Yang

Textbook question answering is challenging as it aims to automatically answer various questions on textbook lessons with long text and complex diagrams, requiring reasoning across modalities. In this work, we propose MRHF, a novel framework that incorporates dense passage re-ranking and the mixture-of-experts architecture for TQA. MRHF proposes a novel query augmentation method for diagram questions and then adopts multi-stage dense passage re-ranking with large pretrained retrievers for retrieving paragraph-level contexts. Then it employs a unified question solver to process different types of text questions. Considering the rich blobs and relation knowledge contained in diagrams, we propose to perform multimodal feature fusion over the retrieved context and the heterogeneous diagram features. Furthermore, we introduce the mixture-of-experts architecture to solve the diagram questions to learn from both the rich text context and the complex diagrams and mitigate the possible negative effects between features of the two modalities. We test the framework on the CK12-TQA benchmark dataset, and the results show that MRHF outperforms the state-of-the-art results in all types of questions. The ablation and case studies also demonstrate the effectiveness of each component of the framework. ...

Editorial

Special Issue on Human in the Loop Data Curation

Journal article (2024) - Gianluca Demartini, Shazia Sadiq, Jie Yang

This Special Issue of the Journal of Data and Information Quality (JDIQ) contains novel theoretical and methodological contributions on data curation involving humans in the loop. In this editorial, we summarize the scope of the issue and briefly describe its content. ...

FedTrans

Client-transparent utility estimation for robust federated learning

Poster (2024) - Mingkun Yang, Ran Zhu, Qing Wang, Jie Yang

Federated Learning (FL) is an important privacy-preserving learning paradigm that plays an important role in the Intelligent Internet of Things. Training a global model in FL, however, is vulnerable to the data noise across the clients. In this paper, we introduce FedTrans, a novel client-transparent client utility estimation method designed to guide client selection for noisy scenarios, mitigating performance degradation problems. To estimate the client utility, we propose a Bayesian framework that models client utility and its relationships with the weight parameters and the performance of local models. We then introduce a variational inference algorithm to effectively infer client utility at the FL server, given only a small amount of auxiliary data. Our evaluation results demonstrate that leveraging FedTrans to select the clients can improve the accuracy performance (up to 7.8%), ensuring the robustness of FL in noisy scenarios. ...

Adaptive In-Context Learning with Large Language Models for Bundle Generation

Conference paper (2024) - Zhu Sun, Kaidong Feng, Jie Yang, Xinghua Qu, Hui Fang, Yew-Soon Ong, Wenyuan Liu

Most existing bundle generation approaches fall short in generating fixed-size bundles. Furthermore, they often neglect the underlying user intents reflected by the bundles in the generation process, resulting in less intelligible bundles. This paper addresses these limitations through the exploration of two interrelated tasks, i.e., personalized bundle generation and the underlying intent inference, based on different user sessions. Inspired by the reasoning capabilities of large language models (LLMs), we propose an adaptive in-context learning paradigm, which allows LLMs to draw tailored lessons from related sessions as demonstrations, enhancing the performance on target sessions. Specifically, we first employ retrieval augmented generation to identify nearest neighbor sessions, and then carefully design prompts to guide LLMs in executing both tasks on these neighbor sessions. To tackle reliability and hallucination challenges, we further introduce (1) a self-correction strategy promoting mutual improvements of the two tasks without supervision signals and (2) an auto-feedback mechanism for adaptive supervision based on the distinct mistakes made by LLMs on different neighbor sessions. Thereby, the target session can gain customized lessons for improved performance by observing the demonstrations of its neighbor sessions. Experiments on three real-world datasets demonstrate the effectiveness of our proposed method. ...

Challenges and Future Directions for Integration of Large Language Models into Socio-technical Systems

Journal article (2024) - H. Torkamaan, S. Steinert, Martijn Warnier, F.M. Brazier, O. Oviedo-Trespalacios, M.S. Pera, O. Kudina, S. Kernan Freire, H. Verma, Sage Kelly, M.T. Sekwenz, J. Yang, K.L.L. van Nunen

Large Language Models (LLMs) are expected to significantly impact various socio-technical systems, offering transformative possibilities for improved interaction between humans and technology. However, their integration poses complex challenges due to the intricate interplay between societal structures, human behaviour, and technological innovation. This research explores these multifaceted challenges, emphasising the need for a human-centered approach in integrating LLMs to ensure that technological advancements are aligned with ethical standards and societal needs. Utilizing a structured methodology comprising a workshop, literature analysis, and expert collaborations, the study uses a multi-dimensional human-centered AI framework to guide the responsible integration of LLMs. Key insights include the importance of inclusive data, considering unintended consequences, maintaining privacy, and respecting intellectual property rights. The paper identifies and advocates for principles like human-in-the-loop, continuous longitudinal studies, proactive awareness campaigns, and regular audits to develop LLMs that are ethically sound, adaptable, and effectively integrated into various socio-technical systems, thus addressing user needs and broader societal impacts. The paper also underlines the importance of collaboration among academia, industry, and policymakers to develop LLMs that are ethically aligned, socially beneficial, and adaptable to future societal needs. The findings offer valuable insights into the strategic integration of LLMs, advocating for a broader research perspective beyond industrial motivations to fully understand and leverage LLMs in socio-technical landscapes. ...

XCrowd

Combining Explainability and Crowdsourcing to Diagnose Models in Relation Extraction

Conference paper (2024) - Alisa Smirnova, Jie Yang, Philippe Cudre-Mauroux

Relation extraction methods are currently dominated by deep neural models, which capture complex statistical patterns while being brittle and vulnerable to perturbations in data and distribution. Explainability techniques offer a means for understanding such vulnerabilities, and thus represent an opportunity to mitigate future errors; yet, existing methods are limited to describing what the model 'knows', while totally failing at explaining what the model does not know. This paper presents a new method for diagnosing model predictions and detecting potential inaccuracies. Our approach involves breaking down the problem into two components: (i) determining the necessary knowledge the model should possess for accurate prediction, through human annotations, and (ii) assessing the actual knowledge possessed by the model, using explainable AI methods (XAI). We apply our method to several relation extraction tasks and conduct an empirical study leveraging human specifications of what a model should know and does not know. Results show that human workers are capable of accurately specifying the model should-knows, despite variations in the specification, that the alignment between what a model really knows and what it should know is indeed indicative of model accuracy, and that the unknowns identified through our methods allow to foresee future errors that may or may not have been observed otherwise. ...

Opening the Analogical Portal to Explainability

Can Analogies Help Laypeople in AI-assisted Decision Making?

Journal article (2024) - Gaole He, Agathe Balayn, Stefan Buijsman, Jie Yang, Ujwal Gadiraju

Concepts are an important construct in semantics, based on which humans understand the world with various levels of abstraction. With the recent advances in explainable artificial intelligence (XAI), concept-level explanations are receiving an increasing amount of attention from the broad research community. However, laypeople may find such explanations difficult to digest due to the potential knowledge gap and the concomitant cognitive load. Inspired by prior work that has explored analogies and sensemaking, we argue that augmenting concept-level explanations with analogical inference information from commonsense knowledge can be a potential solution to tackle this issue. To investigate the validity of our proposition, we first designed an effective analogy-based explanation generation method and collected 600 analogy-based explanations from 100 crowd workers. Next, we proposed a set of structured dimensions for the qualitative assessment of such explanations, and conducted an empirical evaluation of the generated analogies with experts. Our findings revealed significant positive correlations between the qualitative dimensions of analogies and the perceived helpfulness of analogy-based explanations, suggesting the effectiveness of the dimensions. To understand the practical utility and the effectiveness of analogy-based explanations in assisting human decision-making, we conducted a follow-up empirical study (N = 280) on a skin cancer detection task with non-expert humans and an imperfect AI system. Thus, we designed a between-subjects study spanning five different experimental conditions with varying types of explanations. The results of our study confirmed that a knowledge gap can prevent participants from understanding concept-level explanations. Consequently, when only the target domain of our designed analogy-based explanation was provided (in a specific experimental condition), participants demonstrated relatively more appropriate reliance on the AI system.
In contrast to our expectations, we found that analogies were not effective in fostering appropriate reliance. We carried out a qualitative analysis of the open-ended responses from participants in the study regarding their perceived usefulness of explanations and analogies. Our findings suggest that human intuition and the perceived plausibility of analogies may have played a role in affecting user reliance on the AI system. We also found that the understanding of commonsense explanations varied with the varying experience of the recipient user, which points out the need for further work on personalization when leveraging commonsense explanations. In summary, although we did not find quantitative support for our hypotheses around the benefits of using analogies, we found considerable qualitative evidence suggesting the potential of high-quality analogies in aiding non-expert users in their decision making with AI-assistance. These insights can inform the design of future methods for the generation and use of effective analogy-based explanations. ...

Centimeter-Level Indoor Visible Light Positioning

Journal article (2024) - Ran Zhu, Maxim Van Den Abeele, Jona Beysens, Jie Yang, Qing Wang

Visible light positioning (VLP) based on the received signal strength (RSS) can leverage a dense deployment of LEDs in future lighting infrastructure to provide accurate and energy-efficient indoor positioning. However, its positioning accuracy heavily depends on the density of collected fingerprints, which is labor-intensive. In this work, we propose a data pre-processing method, including data cleaning and data augmentation, to construct reliable and dense fingerprint samples, thereby alleviating the impact of noisy samples as well as reducing labor intensity. Extensive experiments demonstrate that our proposed method achieves an average positioning error of 1.7 cm, utilizing a sparse dataset that reduces the fingerprint collection effort by 98 percent. Running a tinyML-based model for VLP on the Arduino Nano microcontroller, we also show the possibilities for deploying RSS fingerprint-based VLP systems on resource-constrained embedded devices for real-world applications. ...

How do you feel?

Measuring User-Perceived Value for Rejecting Machine Decisions in Hate Speech Detection

Conference paper (2023) - Philippe Lammerts, Philip Lippmann, Yen Chia Hsu, Fabio Casati, Jie Yang

Hate speech moderation remains a challenging task for social media platforms. Human-AI collaborative systems offer the potential to combine the strengths of humans' reliability and the scalability of machine learning to tackle this issue effectively. While methods for task handover in human-AI collaboration exist that consider the costs of incorrect predictions, insufficient attention has been paid to accurately estimating these costs. In this work, we propose a value-sensitive rejection mechanism that automatically rejects machine decisions for human moderation based on users' value perceptions regarding machine decisions. We conduct a crowdsourced survey study with 160 participants to evaluate their perception of correct and incorrect machine decisions in the domain of hate speech detection, as well as occurrences where the system rejects making a prediction. Here, we introduce Magnitude Estimation, an unbounded scale, as the preferred method for measuring user (dis)agreement with machine decisions. Our results show that Magnitude Estimation can provide a reliable measurement of participants' perception of machine decisions. By integrating user-perceived value into human-AI collaboration, we further show that it can guide us in 1) determining when to accept or reject machine decisions to obtain the optimal total value a model can deliver and 2) selecting better classification models as compared to the more widely used target of model accuracy. ...