M. Bruijnes
Validating claims and replicating findings on the impact of artificial social agents (ASA), such as virtual agents, conversational agents, and social robots, requires a standardised measurement instrument that researchers can employ in different settings and for various agents. Such an instrument would allow researchers to evaluate their agents and establish insights beyond their specific study context. Therefore, we present the long and short versions of the ASA questionnaire (ASAQ) for evaluating human-ASA interaction on 19 constructs, such as the agent's believability, sociability, and coherence. It has been developed over multiple years by an international workgroup of more than 100 ASA researchers, who identified community-relevant constructs and associated questionnaire items and examined the questionnaire's reliability, validity, and interpretability. The result is a questionnaire that can capture more than 80% of the constructs that studies in the intelligent virtual agent community investigate, with acceptable levels of reliability, content validity, construct validity, and cross-validity. We suggest that ASA researchers use the ASAQ short version to report their agent's psychographic information and the ASAQ long version to analyse in depth any constructs that are specifically relevant to their agent or study. Finally, this paper gives instructions for practical use, such as sample size estimations, and how to interpret and present results.
Corrigendum
Mandarin Chinese translation of the Artificial-Social-Agent questionnaire instrument for evaluating human-agent interaction (Frontiers in Computer Science, (2023), 5, (1149305), 10.3389/fcomp.2023.1149305)
In the published article, there was an error in Table 5. For every second construct/dimension, the means were swapped between the Chinese and English data because of an error in the underlying R script. Consequently, the plus and minus signs for the delta and CI values were also wrong. The corrected Table 5 and its caption appear below. Construct/dimension rating difference between mixed-international English-speaking and Chinese mother-tongue groups. Δ Scores are pairwise differences between the Chinese mother-tongue cultural background and the mixed-international cultural background, taken from the posterior distribution. M, mean; SD, standard deviation; CI, credible interval. The authors apologize for this error and state that this does not change the scientific conclusions of the article in any way. The original article has been updated.
Reducing social diabetes distress with a conversational agent support system
A three-week technology feasibility evaluation
Objective: The objective of this work is to determine the feasibility and preliminary efficacy of an automated conversational agent to deliver, to people with diabetes, personalised psycho-education on dealing with (psycho-)social distress related to their chronic illness.
Methods: In a double-blinded between-subject study, 156 crowd-workers with diabetes received a social help program intervention in three sessions over three weeks. They were randomly assigned to receive support from either an interactive conversational support agent (n=79) or a self-help text from the book “Diabetes burnout” as a control condition (n=77). Participants completed the Diabetes Distress Scale (DDS) before and after the intervention, and after the intervention, the Client Satisfaction Questionnaire (CSQ-8), Feeling of Being Heard (FBH), and System Usability Scale (SUS).
Results: Results indicate that people using the conversational agent have a larger reduction in diabetes distress (M=−0.305, SD=0.865) than the control group (M=0.002, SD=0.743) and this difference is statistically significant (t(154)=2.377, p=0.019). A hypothesised mediation effect of “attitude to the social help program” was not observed.
Conclusions: An automated conversational agent can deliver personalised psycho-education on dealing with (psycho-)social distress to people with diabetes and reduce diabetes distress more than a self-help book.
Ethics, Study Registration and Open Science: This study has been preregistered with the Open Science Foundation (osf.io/yb6vg) and has been accepted by the Human Research Ethics Committee - Delft University of Technology under application number 1130. The data and analysis script are available: https://surfdrive.surf.nl/files/index.php/s/4xSEHCrAu0HsJ4P.
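The reported test statistic can be checked from the summary statistics in the Results paragraph. The following is a minimal sketch, assuming an equal-variance (Student's) two-sample t-test, which is consistent with the reported 154 degrees of freedom but is not confirmed by the abstract; the function name `pooled_t_from_summary` is hypothetical and is not taken from the authors' analysis script.

```python
import math


def pooled_t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Student's t-test (equal variances assumed) from group summary statistics.

    Returns the t statistic for (m1 - m2) and the degrees of freedom.
    """
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Change in diabetes distress as reported in the abstract:
# control group (n=77) versus conversational agent group (n=79).
t, df = pooled_t_from_summary(0.002, 0.743, 77, -0.305, 0.865, 79)
print(f"t({df}) = {t:.3f}")
```

Under these assumptions the recomputed value agrees with the reported t(154)=2.377 up to rounding of the published means and standard deviations.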
The multimodal EchoBorg
Not as smart as it looks
In this paper we present a Multimodal Echoborg interface to explore the effect of different embodiments of an Embodied Conversational Agent (ECA) in an interaction. We compared an interaction where the ECA was embodied as a virtual human (VH) with one where it was embodied as an Echoborg, i.e., a person whose actions are covertly controlled by a dialogue system. The Echoborg in our study shadowed not only the speech output of the dialogue system but also its non-verbal actions. The interactions were structured as a debate between three participants on an ethical dilemma. First, we collected a corpus of debate sessions with three human debaters. We used this corpus as a baseline to design and implement our ECAs. For the experiment, we designed two debate conditions. In one, the participant interacted with two ECAs, both embodied by virtual humans. In the other, the participant interacted with one ECA embodied by a VH and another embodied by an Echoborg. Our results show that a human embodiment of the ECA scores better overall on perceived social attributes of the ECA. In many other respects, except copresence, the Echoborg scores as poorly as the VH.
The artificial-social-agent questionnaire
Establishing the long and short questionnaire versions
We present the ASA Questionnaire, an instrument for evaluating human interaction with an artificial social agent (ASA), resulting from multi-year efforts involving more than 100 Intelligent Virtual Agent (IVA) researchers worldwide. It has 19 measurement constructs constituted by 90 items, which capture more than 80% of the constructs identified in empirical studies published at the IVA conference from 2013 to 2018. This paper reports on a construct validity analysis, specifically the convergent and discriminant validity of the initial 131 instrument items, which involved 532 crowd-workers who were asked to rate human interaction with 14 different ASAs. The analysis included several factor analysis models and resulted in the selection of 90 items for inclusion in the long version of the ASA questionnaire. In addition, a representative item of each construct or dimension was selected to create a 24-item short version of the ASA questionnaire. Whereas the long version is suitable for a comprehensive evaluation of human-ASA interaction, the short version allows quick analysis and description of the interaction with the ASA. To support reporting ASA questionnaire results, we also put forward an ASA chart. The chart provides a quick overview of the agent profile.
People seem to hold the human driver primarily responsible when their partially automated vehicle crashes, yet is this reasonable? While the driver is often required to immediately take over from the automation when it fails, placing such high expectations on the driver to remain vigilant in partially automated driving is unreasonable. Drivers show difficulties in taking over control when needed immediately, potentially resulting in dangerous situations. From a normative perspective, it would be reasonable to consider the impact of automation on the driver’s ability to take over control when attributing responsibility for a crash. We, therefore, analyzed whether the public indeed considers driver ability when attributing responsibility to the driver, the vehicle, and its manufacturer. Participants blamed the driver primarily, even though they recognized the driver’s decreased ability to avoid the crash. These results portend undesirable situations in which users of partial driving automation are the ones held responsible, which may be unreasonable due to the detrimental impact of driving automation on human drivers. Lastly, the outcome signals that public awareness of such human-factors issues with automated driving should be improved.
Agents United
An Open Platform for Multi-Agent Conversational Systems
The development of applications with intelligent virtual agents (IVA) often requires the integration of multiple complex components. In this article we present the Agents United Platform: an open source platform that researchers and developers can use as a starting point to set up their own multi-IVA applications. The new platform provides developers with a set of integrated components in a sense-remember-think-act architecture. The integrated components are a sensor framework, a memory component, a Topic Selection Engine, an interaction manager (Flipper), two dialogue execution engines, and two behaviour realisers (ASAP and GRETA), whose agents can seamlessly interact with each other. This article discusses the platform and its individual components. It also highlights some of the novelties that arise from the integration of components and elaborates on directions for future work.
In this paper, we report on the multi-year Intelligent Virtual Agents (IVA) community effort, involving more than 90 researchers worldwide, researching the IVA community's interests and practice in evaluating human interaction with an artificial social agent (ASA). The joint efforts have previously generated a unified set of 19 constructs that capture more than 80% of the constructs used in empirical studies published at the IVA conference between 2013 and 2018. In this paper, we present 131 expert-content-validated questionnaire items for the constructs and their dimensions, and investigate their level of reliability. We establish this in three phases. First, eight experts generated 431 potential construct items. Second, 20 experts rated whether items measure (only) their intended construct, resulting in 207 content-validated items. Third, a reliability analysis was conducted, involving 192 crowd-workers who were asked to rate a human interaction with an ASA, which resulted in 131 items (about 5 items per measurement, with Cronbach's alpha ranging from .60 to .87). These are the starting points for the questionnaire instrument of human-ASA interaction.
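The reliability figures above are Cronbach's alpha values. As a minimal sketch of how alpha is computed for one construct, the snippet below applies the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the ratings are hypothetical, invented purely for illustration; they are not data from the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 5 respondents x 3 items of a single construct.
ratings = [
    [2, 3, 2],
    [1, 1, 2],
    [3, 3, 3],
    [0, 1, 0],
    [2, 2, 3],
]
print(f"alpha = {cronbach_alpha(ratings):.2f}")
```

Items that move together across respondents push alpha towards 1, whereas unrelated items push it towards 0.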
The 19 Unifying Questionnaire Constructs of Artificial Social Agents
An IVA Community Analysis
In this paper, we report on the multi-year Intelligent Virtual Agents (IVA) community effort, involving more than 80 researchers worldwide, researching the IVA community's interests and practices in evaluating human interaction with an artificial social agent (ASA). The effort is driven by previous IVA workshops and plenary IVA discussions related to the methodological crisis on the evaluation of ASAs. A previous literature review showed a continuous practice of creating new questionnaires instead of reusing validated questionnaires. We address this issue by examining questionnaire measurement constructs used in empirical studies published at the IVA conference between 2013 and 2018. We identified 189 constructs used in 89 questionnaires that are reported across 81 studies. Although these constructs have different names, they often measure the same thing. In this paper, we therefore present a unifying set of 19 constructs that captures more than 80% of the 189 constructs initially identified. We established this set in two steps. First, 49 researchers classified the constructs into broad, theoretically based categories. Next, 23 researchers grouped the constructs in each category by their similarity. The resulting 19 groups form a unifying set of constructs, which will be the basis for the future questionnaire instrument of human-ASA interaction.
Commensality is defined as "a social group that eats together", and eating in a commensality setting has a number of positive effects on humans. The purpose of this paper is to investigate the effects of technology on commensality by presenting an experiment in which a toy robot showing non-verbal social behaviours tries to influence a participant's food choice and food taste perception. We conducted both a qualitative and a quantitative study with 10 participants. Results show that the robot made a favourable impression on participants. It also emerged that the robot may be able to influence food choices using its non-verbal behaviours only. However, these results are not statistically significant, perhaps due to the small sample size. In the future, we plan to collect more data using the same experimental protocol to verify these preliminary results.
Research into artificial social agents aims at constructing these agents and at establishing an empirically grounded understanding of them, their interaction with humans, and how they can ultimately deliver certain outcomes in areas such as health, entertainment, and education. Key for establishing such understanding is the community’s ability to describe and replicate their observations on how users perceive and interact with their agents. In this paper, we address this ability by examining questionnaires and their constructs used in empirical studies reported in the intelligent virtual agent conference proceedings from 2013 to 2018. The literature survey shows the identification of 189 constructs used in 89 questionnaires that were reported across 81 papers. We found unexpectedly little repeated use of questionnaires as the vast majority of questionnaires (more than 76%) were only reported in a single paper. We expect that this finding will motivate joint effort by the IVA community towards creating a unified measurement instrument and in the broader AI community a renewed interest in replicability of our (user) studies.
What are we measuring anyway?
A literature survey of questionnaires used in studies reported in the intelligent virtual agent conferences
Research into artificial social agents aims at constructing these agents and at establishing an empirically grounded understanding of them, their interaction with humans, and how they can ultimately deliver certain outcomes in areas such as health, entertainment, and education. Key for establishing such understanding is the community's ability to describe and replicate their observations on how users perceive and interact with their agents. In this paper, we address this ability by examining questionnaires and their constructs used in empirical studies reported in the intelligent virtual agent conference proceedings from 2013 to 2018. The literature survey shows the identification of 189 constructs used in 89 questionnaires that were reported across 81 papers. We found unexpectedly little repeated use of questionnaires as the vast majority of questionnaires (more than 76%) were only reported in a single paper. We expect that this finding will motivate joint effort by the IVA community towards creating a unified measurement instrument.