C.M. Jonker
208 records found
From human teams to hybrid intelligence teams
Identifying, characterizing, and evaluating foundational quality attributes
Hybrid Intelligence (HI) is an emerging paradigm in which artificial intelligence (AI) augments human intelligence. The current literature lacks systematic models that guide the design and evaluation of HI systems. Further, discussions around HI primarily focus on technology, neglecting the holistic human-AI ensemble. In this paper, we take the initial steps toward the development of a quality model for characterizing and evaluating HI systems from a human-AI teams perspective. We first conducted a study investigating the adequacy of properties commonly associated with effective human teams to describe HI. The study features the insights of 50 HI researchers, and shows that various human team properties, including boundedness, interdependence, competency, purposefulness, initiative, normativity, and effectiveness, are important for HI systems. Based on these results, we developed a quality model for HI teams composed of seven high-level quality attributes, further refined into 16 specific ones. To evaluate the relevance and understanding of the proposed attributes, we conducted a second empirical investigation by staging competitions in which participants used the quality model to develop and analyze HI usage scenarios. Our analysis of 48 collected scenarios, which we openly release, confirms the proposed attributes’ relevance and highlights insights that emerge when designers consider the quality model in HI system design.
"even explanations will not help in trusting [this] fundamentally biased system"
A Predictive Policing Case-Study
In today's society, where Artificial Intelligence (AI) has gained a vital role, concerns regarding users' trust have garnered significant attention. The use of AI systems in high-risk domains has often led users either to under-trust them, potentially causing inadequate reliance, or to over-trust them, resulting in over-compliance. Therefore, users must maintain an appropriate level of trust. Past research has indicated that explanations provided by AI systems can enhance user understanding of when to trust or not trust the system. However, the utility of presenting different forms of explanation remains to be explored, especially in high-risk domains. Therefore, this study explores the impact of different explanation types (text, visual, and hybrid) and user expertise (retired police officers and lay users) on establishing appropriate trust in AI-based predictive policing. While we observed that the hybrid form of explanations increased the subjective trust in AI for expert users, it did not lead to better decision-making. Furthermore, no form of explanations helped build appropriate trust. The findings of our study emphasize the importance of re-evaluating the use of explanations to build [appropriate] trust in AI-based systems, especially when the system's use is questionable. Finally, we synthesize potential challenges and policy recommendations based on our results to design for appropriate trust in high-risk AI-based systems.
Knowing Me, Knowing AU
How Should We Design Agent-Mediated Mimicry?
A lack of self-awareness of communicative behaviours can lead to disadvantages in important interactions. Video recordings as a tool for self-observation have been widely adopted to initiate behaviour change and reflection. Seeing oneself in a recording can lead to negative affect. Forcing an external perspective can lead to cognitive dissonance. Avatars and virtual agents have the advantage that they can copy a human's behaviour while potentially avoiding this dissonance. To explore the design space of mimicking agents, we set up a user study where a video baseline is compared to agent-mediated conditions ranging from idle non-verbal behaviour to complete mimicry of the voice and face. We show that participants gain increased self-awareness from seeing themselves mediated through the virtual agent. We further discuss qualitative observations for the future design of systems that aid in self-reflection, and particularly note that partial mimicry seems to be less appreciated than full mimicry.
Commissioning for integration
Exploring the dynamics of the “subsidy tables” approach in Dutch social care delivery
Purpose: The objective of this paper is to develop a redesigned commissioning process for social care services that fosters integrated care, encourages collaboration and balances professional expertise with client engagement. Design/methodology/approach: This study employs a two-pronged approach: a case study of a municipality’s use of subsidy tables and a literature scoping review on integrated care research. Findings: The paper introduces a new framework for the study of the new “subsidy tables.” A well-defined and extensive consultation process involving both social care providers (suppliers), the Service Triad, and client representation adds to the existing research on supplier consultation, and on how to define the outcomes for clients via client engagement. Research limitations/implications: While aspects are clearly relevant to the Netherlands, the design of the commissioning process of social care has international relevance as well: finding definitions, formulating outcomes and incentives, designing a more collaborative instead of competitive process, stakeholder engagement and consultation. Practical implications: Several Dutch municipalities started using the “subsidy tables” method for commissioning integrated social care. This paper offers clear improvements that benefit the commissioners, the social care providers and their clients. Social implications: Improving the commissioning process of integrated social care will lead to better fitting care for people who need social care. Originality/value: This paper is one of the first to do a thorough analysis of the “subsidy tables” method for commissioning integrated social care.
Appropriate trust is an important component of the interaction between people and AI systems, in that "inappropriate" trust can cause disuse, misuse, or abuse of AI. To foster appropriate trust in AI, we need to understand how AI systems can elicit appropriate levels of trust from their users. Out of the aspects that influence trust, this article focuses on the effect of showing integrity. In particular, this article presents a study of how different integrity-based explanations made by an AI agent affect the appropriateness of trust of a human in that agent. To explore this, we (1) provide a formal definition to measure appropriate trust and (2) present a between-subject user study with 160 participants who collaborated with an AI agent on a task. In the study, the AI agent assisted its human partner in estimating calories on a food plate by expressing its integrity through explanations focusing on either honesty, transparency, or fairness. Our results show that (a) an agent that displays its integrity by being explicit about potential biases in data or algorithms achieved appropriate trust more often than one that is honest about its capability or transparent about its decision-making process, and (b) subjective trust builds up and recovers better with honesty-like integrity explanations. Our results contribute to the design of agent-based AI systems that guide humans to appropriately trust them, a formal method to measure appropriate trust, and how to support humans in calibrating their trust in AI.
We adopt an emerging and prominent vision of human-centred Artificial Intelligence that requires building trustworthy intelligent systems. Such systems should be capable of dealing with the challenges of an interconnected, globalised world by handling plurality and by abiding by human values. Within this vision, pluralistic value alignment is a core problem for AI: that is, the challenge of creating AI systems that align with a set of diverse individual value systems. So far, most literature on value alignment has considered alignment to a single value system. To address this research gap, we propose a novel method for estimating and aggregating multiple individual value systems. We rely on recent results in the social choice literature and formalise the value system aggregation problem as an optimisation problem. We then cast this problem as an ℓp-regression problem. Doing so provides a principled and general theoretical framework to model and solve the aggregation problem. Our aggregation method allows us to consider a range of ethical principles, from utilitarian (maximum utility) to egalitarian (maximum fairness). We illustrate the aggregation of value systems by considering real-world data from two case studies: the Participatory Value Evaluation process and the European Values Study. Our experimental evaluation shows how different consensus value systems can be obtained depending on the ethical principle of choice, leading to practical insights for a decision-maker on how to perform value system aggregation.
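As a rough illustration of how the choice of p changes an ℓp-style consensus, the sketch below aggregates a handful of scalar scores by minimising the sum of |v - c|^p over a single consensus value c. This is a one-dimensional toy, not the method of the paper: the function names `lp_cost` and `aggregate`, the sample scores, and the use of ternary search are all assumptions made for the example.

```python
def lp_cost(c, values, p):
    """Sum of |v - c|**p over all individual values (hypothetical helper)."""
    return sum(abs(v - c) ** p for v in values)


def aggregate(values, p, iters=200):
    """Return a scalar consensus that minimises lp_cost for a given p >= 1.

    The cost is convex in c when p >= 1, so a ternary search over
    [min(values), max(values)] converges to a minimiser.
    """
    lo, hi = min(values), max(values)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if lp_cost(m1, values, p) <= lp_cost(m2, values, p):
            hi = m2  # a minimiser lies in [lo, m2]
        else:
            lo = m1  # a minimiser lies in [m1, hi]
    return (lo + hi) / 2


if __name__ == "__main__":
    scores = [0.1, 0.2, 0.9]  # made-up individual scores for one value
    for p in (1, 2, 50):
        print(p, round(aggregate(scores, p), 4))
```

In this toy setting p = 1 recovers the median, p = 2 the mean, and a large p approaches the midpoint between the two most extreme individuals. Reading small p as the utilitarian end and large p as the egalitarian end of the range named in the abstract is an interpretation assumed here, not a statement taken from the paper.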
Epistemic logic can be used to reason about statements such as ‘I know that you know that I know that φ’. In this logic, and its extensions, it is commonly assumed that agents can reason about epistemic statements of arbitrary nesting depth. In contrast, empirical findings on Theory of Mind, the ability to (recursively) reason about mental states of others, show that human recursive reasoning capability has an upper bound. In the present paper we work towards resolving this disparity by proposing some elements of a logic of bounded Theory of Mind, built on Public Announcement Logic. Using this logic, and a statistical method called Random-Effects Bayesian Model Selection, we estimate the distribution of Theory of Mind levels in the participant population of a previous behavioral experiment. Despite not modeling stochastic behavior, we find that approximately three-quarters of participants’ decisions can be described using Theory of Mind. In contrast to previous empirical research, our models estimate the majority of participants to be second-order Theory of Mind users.
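The notion of nesting depth can be made concrete with a small sketch. The code below represents epistemic formulas as nested "knows" operators over an atomic proposition and checks whether a formula stays within a depth bound. It is only an illustration of nesting depth: the class names `Atom` and `Knows` and the functions `depth` and `within_bound` are hypothetical, and nothing here reproduces the bounded logic or the model selection procedure of the paper.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    """An atomic proposition such as phi (hypothetical representation)."""
    name: str


@dataclass(frozen=True)
class Knows:
    """'agent knows that body' (hypothetical representation)."""
    agent: str
    body: "Formula"


Formula = Union[Atom, Knows]


def depth(formula):
    """Count how many knowledge operators are nested in the formula."""
    if isinstance(formula, Atom):
        return 0
    return 1 + depth(formula.body)


def within_bound(formula, k):
    """True if the formula nests at most k knowledge operators."""
    return depth(formula) <= k


if __name__ == "__main__":
    # 'I know that you know that I know that phi'
    statement = Knows("I", Knows("you", Knows("I", Atom("phi"))))
    print(depth(statement), within_bound(statement, 2))
```

The example statement nests three knowledge operators, so a reasoner limited to two levels of nesting could not represent it under this simplified reading.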
NegoLog
An Integrated Python-based Automated Negotiation Framework with Enhanced Assessment Components
The complexity of automated negotiation research calls for dedicated, user-friendly research frameworks that facilitate advanced analytics, comprehensive loggers, visualization tools, and auto-generated domains and preference profiles. This paper introduces NegoLog, a platform that provides advanced and customizable analysis modules to agent developers for exhaustive performance evaluation. NegoLog introduces an automated scenario and tournament generation tool in its Web-based user interface so that agent developers can adjust the competitiveness and complexity of the negotiations. One of the key novelties of NegoLog is an individual assessment of preference estimation models independent of the strategies.
Interdependence and trust analysis (ITA)
A framework for human–machine team design
As machines' autonomy increases, the possibilities for collaboration between a human and a machine also increase. In particular, tasks may be performed with varying levels of interdependence, i.e. from independent to joint actions. The feasibility of each type of interdependence depends on factors that contribute to contextual trustworthiness, such as team members' competence, willingness and external factors. In this paper, we present the Interdependence and Trust Analysis (ITA) framework, which is an extension of Coactive Design's Interdependence Analysis framework (Johnson, M., J. M. Bradshaw, P. J. Feltovich, C. M. Jonker, M. Birna Van Riemsdijk, M. Sierhuis. 2014. Coactive Design: Designing Support for Interdependence in Joint Activity. Journal of Human-Robot Interaction 3 (1): 43–69. https://doi.org/10.5898/JHRI.3.1.Johnson). By including information on contextual trustworthiness, ITA can better support the design of human–machine teams, as well as task allocation and selection. Evaluated through expert interviews and a focus group involving a search and rescue scenario, ITA shows potential as a decision-making tool and a communication bridge among human and machine teammates. Our findings emphasise the need to define tasks and roles based on agent characteristics, and imply that decision-making models should align with human-centred objectives. ITA also highlights the trade-off between utility and effort when designing trustworthy systems, suggesting that guided conversations could improve the team design process. Finally, the ITA framework may improve transparency, justification, and interpretability in decision-making, contributing to appropriate trust among teammates.
Large-scale survey tools enable the collection of citizen feedback in opinion corpora. Extracting the key arguments from a large and noisy set of opinions helps in understanding the opinions quickly and accurately. Fully automated methods can extract arguments but (1) require large labeled datasets that induce large annotation costs and (2) work well for known viewpoints, but not for novel points of view. We propose HyEnA, a hybrid (human + AI) method for extracting arguments from opinionated texts, combining the speed of automated processing with the understanding and reasoning capabilities of humans. We evaluate HyEnA on three citizen feedback corpora. We find that, on the one hand, HyEnA achieves higher coverage and precision than a state-of-the-art automated method when compared to a common set of diverse opinions, justifying the need for human insight. On the other hand, HyEnA requires less human effort and does not compromise quality compared to (fully manual) expert analysis, demonstrating the benefit of combining human and artificial intelligence.