B.J.W. Dudzik
From human teams to hybrid intelligence teams
Identifying, characterizing, and evaluating foundational quality attributes
Hybrid Intelligence (HI) is an emerging paradigm in which artificial intelligence (AI) augments human intelligence. The current literature lacks systematic models that guide the design and evaluation of HI systems. Further, discussions around HI primarily focus on technology, neglecting the holistic human-AI ensemble. In this paper, we take the initial steps toward the development of a quality model for characterizing and evaluating HI systems from a human-AI teams perspective. We first conducted a study investigating the adequacy of properties commonly associated with effective human teams to describe HI. The study features the insights of 50 HI researchers, and shows that various human team properties, including boundedness, interdependence, competency, purposefulness, initiative, normativity, and effectiveness, are important for HI systems. Based on these results, we developed a quality model for HI teams composed of seven high-level quality attributes, further refined into 16 specific ones. To evaluate the relevance and understanding of the proposed attributes, we conducted a second empirical investigation by staging competitions in which participants used the quality model to develop and analyze HI usage scenarios. Our analysis of 48 collected scenarios, which we openly release, confirms the proposed attributes’ relevance and highlights insights that emerge when designers consider the quality model in HI system design.
Partners among Strangers
A Social Relations perspective on personality and collaborative partner preferences in first encounters
Collaborative partnerships are often formed following a first encounter. For example, unacquainted individuals may collaborate to complete a project, develop a product, or solve a problem. Using the Social Relations Model, this study examined the extent to which first-encounter trait perceptions predicted collaborative partner preferences. Previously-unacquainted participants (N = 297, 55 groups, 55.9% female) interacted dyadically and provided round-robin ratings of extraversion, honesty-humility, competence, and partner preference. At the target level, individuals who were consistently viewed as extraverted and competent were consistently preferred more as partners. At the relationship level, individuals who were uniquely viewed as honest-humble and competent were uniquely preferred more as partners. Findings underscore the relevance of target- and relationship-specific perceptions in predicting first-encounter collaborative partner preferences.
PARSEL
A Multimodal Dataset for Modeling Decision-Making Processes Involved in Selecting Partners for Joint Tasks
How people evaluate, select, and engage with others in cooperative settings significantly impacts their well-being, happiness, and success. However, navigating these processes is complex. Equipping systems with the ability to recognize, interpret, and even engage during such socio-cognitive processes can increase their potential to support humans in them and to adjust more successfully to the social environment they are embedded in (e.g., by understanding human preferences and attitudes), leading to better quality interactions and decision-making for future partners. Yet the development of such systems depends on available datasets, and to our knowledge, no dataset exists that can be used to model partner selection for joint tasks. To support research focused on creating such intelligent systems, we introduce the PARSEL dataset, a comprehensive corpus of dyadic interactions designed for computational modeling of PARtner SELection processes and collaborative behavior. In total, 297 participants took part in the data collection. The dataset contains measurements of partner selection decisions over three different stages, as well as factors that may influence partner selection in the context of (online) social interactions. It includes audiovisual recordings that offer fine-grained behavioral cues used during these interactions, self-reported traits, and reported perceptions of person-, situation- and team-specific phenomena. By providing this resource, we aim to foster advancements in computational methods that can effectively model and augment socio-cognitive processes, contributing to socially aware intelligent systems and enhanced human-system interactions.
Technologies Supporting Self-Reflection on Social Interactions
A Systematic Review
As intelligent technology and applications have become an integral part of nearly all aspects of people's daily lives, many intelligent systems have been designed to help people navigate the complex space of social interactions. One prominent strategy for such intelligent support is providing meaningful Ad Hoc Interventions (ADI), e.g., through timely notifications. An alternative is Technology-Supported Reflection (TSR), e.g., by offering information about activities in one's past for personal insights. In contrast to direct interventions, the aim of the latter strategy is not to directly augment human skills but instead to support learning and personal growth over time. However, while TSR has seen widespread interest in applications in some areas, such as physical fitness and mental health, its use for improving human social interactions has not yet been systematically explored. Concretely, it is currently unclear 1) what forms of self-reflection systems intend to support, 2) how their different technological components (e.g., data collection, information integration) are involved in providing support, and 3) what common limitations and design challenges they face. In this article, we present the results of a systematic literature review focusing on these questions to provide a structured foundation for targeted research. Concretely, we identified and analysed a collection of 23 relevant papers, each describing a system deploying TSR to support humans with elements of social interactions. We constructed a framework with a set of features to comprehensively describe and analyze the systems that support self-reflection, including their application domains, how they fit into the existing design framework, how they facilitate learning through reflection, how adaptive they are to individual users, and how they were evaluated. Finally, we propose a direction for designing systems that support individuals' social interactions through self-reflection in an adaptive manner.
The OpenVIMO Platform
A Tutorial on Building and Managing Large-scale Online Experiments involving Videoconferencing
Online experiments leveraging video conferencing offer significant advantages for studying human social interactions, including enhanced participant diversity and scalability. However, challenges include complex adjustments, privacy risks, software requirements, limited customization, and remote participant management. openVIMO is a software framework for creating video-based online interaction experiments built to address these challenges. It operates on open-source technologies, allowing deployment on researchers' servers without relying on third-party services. openVIMO facilitates the creation of web-based experiments with video calls, live monitoring of participants' progress or interactions, and comprehensive data collection, including audio and video. Moreover, it supports highly customizable experimental protocols and dynamic expansion by the research community. This tutorial will describe the functionality and design rationale underlying openVIMO. Moreover, it will outline possible application scenarios and provide examples for developing and managing studies using the platform. Finally, attendees will gain hands-on experience implementing a small-scale study of their design.
The ability to automatically infer relevant aspects of human users' thoughts and feelings is crucial for technologies to intelligently adapt their behaviors in complex interactions. Research on multimodal analysis has demonstrated the potential of technology to provide such estimates for a broad range of internal states and processes. However, constructing robust approaches for deployment in real-world applications remains an open problem. The MSECP-Wild workshop series is a multidisciplinary forum to present and discuss research addressing this challenge. Submissions to this 5th iteration span efforts relevant to multimodal data collection, modeling, and applications. In addition, our workshop program builds on discussions emerging in previous iterations, highlighting ethical considerations when building and deploying technology modeling internal states in the wild. For this purpose, we host a range of relevant keynote speakers and interactive activities.
Collecting Mementos
A Multimodal Dataset for Context-Sensitive Modeling of Affect and Memory Processing in Responses to Videos
In this article, we introduce Mementos: the first multimodal corpus for computational modeling of affect and memory processing in response to video content. It was collected online via crowdsourcing and captures 1995 individual responses collected from 297 unique viewers responding to 42 different segments of music videos. Apart from webcam recordings of their upper-body behavior (totaling 2012 minutes) and self-reports of their emotional experience, it contains detailed descriptions of the occurrence and content of 989 personal memories triggered by the video content. Finally, the dataset includes self-report measures related to individual differences in participants' background and situation (Demographics, Personality, and Mood), thereby facilitating the exploration of important contextual factors in research using the dataset. We 1) describe the construction and contents of the corpus itself, 2) analyse the validity of its content by investigating biases and consistency with existing research on affect and memory processing, 3) review previously published work that demonstrates the usefulness of the multimodal data in the corpus for research on automated detection and prediction tasks, and 4) provide suggestions for how the dataset can be used in future research on modeling Video-Induced Emotions, Memory-Associated Affect, and Memory Evocation.
Intelligent systems might benefit from automatically detecting when a stimulus has triggered a user's recollection of personal memories, e.g., to identify that a piece of media content holds personal significance for them. While computational research has demonstrated the potential to identify related states based on facial behavior (e.g., mind-wandering), the automatic detection of spontaneous recollections specifically has not been investigated thus far. Motivated by this, we present machine learning experiments exploring the feasibility of detecting whether a video clip has triggered personal memories in a viewer based on the analysis of their Head Rotation, Head Position, Eye Gaze, and Facial Expressions. Concretely, we introduce an approach for automatic detection and evaluate its potential for predictions using in-the-wild webcam recordings. Overall, our findings demonstrate the capacity for above-chance detection in both settings, with substantially better performance for the video-independent variant. Beyond this, we investigate the role of person-specific recollection biases for predictions of our video-independent models and the importance of specific modalities of facial behavior. Finally, we discuss the implications of our findings for detecting recollections and user-modeling in adaptive systems.
In competitive multiplayer online video games, teamwork is of utmost importance, implying high levels of interdependence between the joint outcomes of players. When engaging in such interdependent interactions, humans rely on trust to facilitate coordination of their individual behaviours. However, online games often take place between teams of strangers, with individual members having little to no information about each other beyond what they observe throughout the interaction itself. A better understanding of the social behaviours that players use to form trust could not only facilitate richer gaming experiences, but could also lead to insights about team interactions. As such, this paper presents a first step towards understanding how and which types of in-game behaviour relate to trust formation. In particular, we investigate a) which in-game behaviours are relevant for trust formation (the first part of the study) and b) how they relate to players' reported trust in their teammates (the second part of the study). The first part consisted of interviews with League of Legends players in order to create a taxonomy of in-game behaviours relevant for trust formation. For the second part, we ran a small-scale pilot study in which participants played the game and then answered a questionnaire measuring their trust in their teammates. Our preliminary results present a taxonomy of in-game behaviours which can be used to annotate games with respect to trust behaviours. Based on the pilot study, the list of behaviours could be extended so as to improve the results. These findings can be used to research the role of trust formation in teamwork.
The ability to automatically infer relevant aspects of human users' thoughts and feelings is crucial for technologies to adapt their behaviors in complex interactions intelligently (e.g., social robots or tutoring systems). Research on multimodal analysis has demonstrated the potential of technology to provide such estimates for a broad range of internal states and processes. However, constructing sufficiently robust approaches for deployment in real-world applications remains an open problem. The MSECP-Wild workshop series serves as a multidisciplinary forum to present and discuss research addressing this challenge. This 4th iteration focuses on addressing varying contextual conditions (e.g., throughout an interaction or across different situations and environments) in intelligent systems as a crucial barrier for more valid real-world predictions and actions. Submissions to the workshop span efforts relevant to multimodal data collection and context-sensitive modeling. These works provide important impulses for discussions of the state-of-the-art and opportunities for future research on these subjects.
Enabling computer-based applications to display intelligent behavior in complex social settings requires them to relate to important aspects of how humans experience and understand such situations. One crucial driver of people's social behavior during an interaction is the interdependence they perceive, i.e., how the outcome of an interaction is determined by their own and others' actions. According to psychological studies, both the nonverbal behavior displayed by [...] Motivated by this, we present a series of experiments to automatically recognize interdependence perceptions in dyadic face-to-face negotiations using these sources. Concretely, our approach draws on a combination of features describing individuals' Facial, Upper Body, and Vocal Behavior with state-of-the-art algorithms for multivariate time series classification. Our findings demonstrate that differences in some types of interdependence perceptions can be detected through the automatic analysis of nonverbal behaviors. We discuss implications for developing socially intelligent systems and opportunities for future research.
Towards Artificial Empathic Memory
Accounting for the Influence of Personal Memories in Automatic Predictions of Affect
Empirical evidence suggests that the emotional meaning of facial behavior in isolation is often ambiguous in real-world conditions. While humans complement interpretations of others' faces with additional reasoning about context, automated approaches rarely display such context-sensitivity. Empirical findings indicate that the personal memories triggered by videos are crucial for predicting viewers' emotional response to such videos, in some cases even more so than the video's audiovisual content. In this article, we explore the benefits of personal memories as context for facial behavior analysis. We conduct a series of multimodal machine learning experiments combining the automatic analysis of video-viewers' faces with that of two types of context information for affective predictions: (1) self-reported free-text descriptions of triggered memories and (2) a video's audiovisual content. Our results demonstrate that both sources of context provide models with information about variation in viewers' affective responses that complement facial analysis and each other.
Context in Human Emotion Perception for Automatic Affect Detection
A Survey of Audiovisual Databases
An important aspect of human emotion perception is the use of contextual information to understand others' feelings even in situations where their behavior is not very expressive or has an emotionally ambiguous meaning. For technology to successfully detect affect, it must mimic this human ability when analyzing audiovisual input. Databases upon which machine learning algorithms are trained should capture the context of social interactions as well as the behavior expressed in them. However, there is a lack of consensus about what constitutes relevant context in such databases. In this article, we make two contributions towards overcoming this challenge: (a) we identify two principal sources of context for emotion perceptions based on psychological theory, and (b) we provide an overview of how each of these has been considered in published databases covering social interactions. Our results show that a similar set of contextual features is present across the reviewed databases. Across the different databases, researchers seem to have taken into account a set of contextual features reflecting the sources of context seen in psychological theory. However, within individual databases, these features are not yet systematically varied. This is problematic because it prevents them from being used directly as resources for the modeling of context-sensitive affect detection. Based on our findings, we suggest improvements for the future development of affective databases.
Artificial Empathic Memory
Enabling Media Technologies to Better Understand Subjective User Experience
includes interactions with media technologies. However, current approaches for personalizing interactions with these technologies are neither aware of what episodic memories are triggered in users, nor of their emotional interpretations of those memories. We argue that this is a serious limitation, because it prevents applications from correctly estimating users’ experiences. In short, such technologies lack empathy. In this position paper, we argue that media technologies need an Artificial Empathic Memory (AEM) of their users to address this issue. We propose a psychologically inspired architecture, examine the challenges to be solved, and highlight how existing research can become a starting point for overcoming them.
challenge to retrace this subjective organization in sensor data referencing objective time. Lifelogging is a specific approach to the ubiquitous monitoring of individuals that can contribute to overcoming this recollection gap. It strives to create a comprehensive timeline of semantic annotations that reflect the impressions of the monitored person from his or her own subjective point of view. In this paper, we describe a novel approach for processing such lifelogs to situate remembered experiences in an objective timeline. It involves the computational modeling of individuals’ memory processes to estimate segments within a lifelog acting as plausible digital representations for their recollections. We report on an empirical investigation in which we use our approach to discover plausible representations for remembered social interactions between participants in a longitudinal study. In particular, we describe an exploration of the behavior displayed by our model for memory processes in this setting. Finally, we explore the representations discovered for this study and discuss insights that might be gained from them.