S. Qiu | TU Delft Repository

HealthInsights

An Online Conversational Survey for Understanding Worker Health in Crowdsourcing Platforms

Conference paper (2025) - Sihang Qiu, Ujwal Gadiraju, Xiaolong Zheng

Crowdsourcing marketplaces have gradually flourished over the last decade. With the growing landscape of online work in general, and the rise of paid microtask crowdsourcing in particular, the health and wellbeing of crowd workers have become an important concern. In this paper, we present an online conversational survey, named HealthInsights, for understanding the status quo of workers’ health-related background, physical health, mental health, and their needs. We carried out a study on two popular platforms: Mechanical Turk and Prolific. Results show that the survey has acceptable reliability and validity. We found that workers across these platforms reported similar health-related issues, but also exhibited certain differences. Based on our findings, we argue that crowdsourcing platforms, task requesters, and academic researchers need to take collective responsibility for creating better work environments. Our work has important implications for task and workflow design centered around worker health on crowdsourcing platforms. ...

Perspective

Leveraging Human Understanding for Identifying and Characterizing Image Atypicality

Conference paper (2023) - Shahin Sharifi Noorian, Sihang Qiu, Burcu Sayin, Agathe Balayn, Ujwal Gadiraju, Jie Yang, Alessandro Bozzon

High-quality data plays a vital role in developing reliable image classification models. Despite that, what makes an image difficult to classify remains an unstudied topic. This paper provides a first-of-its-kind, model-agnostic characterization of image atypicality based on human understanding. We consider the setting of image classification "in the wild", where a large number of unlabeled images are accessible, and introduce a scalable and effective human computation approach for proactive identification and characterization of atypical images. Our approach consists of i) an image atypicality identification and characterization task that presents to the human worker both a local view of visually similar images and a global view of images from the class of interest and ii) an automatic image sampling method that selects a diverse set of atypical images based on both visual and semantic features. We demonstrate the effectiveness and cost-efficiency of our approach through controlled crowdsourcing experiments and provide a characterization of image atypicality based on human annotations of 10K images. We showcase the utility of the identified atypical images by testing state-of-the-art image classification services against such images and provide an in-depth comparative analysis of the alignment between human- and machine-perceived image atypicality. Our findings have important implications for developing and deploying reliable image classification systems. ...

Great Chain of Agents

The Role of Metaphorical Representation of Agents in Conversational Crowdsourcing

Conference paper (2022) - Ji Youn Jung, Sihang Qiu, Alessandro Bozzon, Ujwal Gadiraju

Conversational agents are being widely adopted across several domains to serve a variety of purposes ranging from providing intelligent assistance to companionship. Recent literature has shown that users develop intuitive folk theories and a metaphorical understanding of conversational agents (CAs) due to the lack of a mental model of the agents. However, investigation of metaphorical agent representation in the HCI community has mainly focused on the human level, despite non-human metaphors for agents being prevalent in the real world. We adopted Lakoff and Turner's 'Great Chain of Being' framework to systematically investigate the impact of using non-human metaphors to represent conversational agents on worker engagement in crowdsourcing marketplaces. We designed a text-based conversational agent that assists crowd workers in task execution. Through a between-subjects experimental study (N = 341), we explored how different human and non-human metaphors affect worker engagement, the perceived cognitive load of workers, intrinsic motivation, and their trust in the agents. Our findings bridge the gap of how users experience CAs with non-human metaphors in the context of conversational crowdsourcing. ...

To Trust or Not To Trust

How a Conversational Interface Affects Trust in a Decision Support System

Conference paper (2022) - Akshit Gupta, Debadeep Basu, Ramya Ghantasala, Sihang Qiu, Ujwal Gadiraju

Trust is an important component of human-AI relationships and plays a major role in shaping the reliance of users on online algorithmic decision support systems. With recent advances in natural language processing, text and voice-based conversational interfaces have provided users with new ways of interacting with such systems. Despite the growing applications of conversational user interfaces (CUIs), little is currently understood about the suitability of such interfaces for decision support and how CUIs inspire trust among humans engaging with decision support systems. In this work, we aim to address this gap and answer the following question: to what extent can a conversational interface build user trust in decision support systems in comparison to a conventional graphical user interface? To this end, we built a text-based conversational interface, and a conventional web-based graphical user interface. These served as the means for users to interact with an online decision support system to help them find housing, given a fixed set of constraints. To understand how the accuracy of the decision support system moderates user behavior and trust across the two interfaces, we considered an accurate and inaccurate system. We carried out a 2 × 2 between-subjects study (N = 240) on the Prolific crowdsourcing platform. Our findings show that the conversational interface was significantly more effective in building user trust and satisfaction in the online housing recommendation system when compared to the conventional web interface. Our results highlight the potential impact of conversational interfaces for trust development in decision support systems. ...

An Analysis of Music Perception Skills on Crowdsourcing Platforms

Journal article (2022) - Ioannis Petros Samiotis, Sihang Qiu, Christoph Lofi, Jie Yang, Ujwal Gadiraju, Alessandro Bozzon

Music content annotation campaigns are common on paid crowdsourcing platforms. Crowd workers are expected to annotate complex music artifacts, a task often demanding specialized skills and expertise; selecting the right participants is therefore crucial for campaign success. However, there is a general lack of deeper understanding of the distribution of musical skills, and especially auditory perception skills, in the worker population. To address this knowledge gap, we conducted a user study (N = 200) on Prolific and Amazon Mechanical Turk. We asked crowd workers to indicate their musical sophistication through a questionnaire and assessed their music perception skills through an audio-based skill test. The goal of this work is to better understand the extent to which crowd workers possess higher perception skills, beyond their own musical education level and self-reported abilities. Our study shows that untrained crowd workers can possess high perception skills on the music elements of melody, tuning, accent, and tempo; skills that can be useful in a plethora of annotation tasks in the music domain. ...

What Should You Know? A Human-In-the-Loop Approach to Unknown Unknowns Characterization in Image Recognition

Conference paper (2022) - Shahin Sharifi Noorian, Sihang Qiu, Ujwal Gadiraju, Jie Yang, Alessandro Bozzon

Unknown unknowns represent a major challenge in reliable image recognition. Existing methods mainly focus on unknown unknowns identification, leveraging human intelligence to gather images that are potentially difficult for the machine. To drive a deeper understanding of unknown unknowns and more effective identification and treatment, this paper focuses on unknown unknowns characterization. We introduce a human-in-the-loop, semantic analysis framework for characterizing unknown unknowns at scale. We engage humans in two tasks that specify what a machine should know and describe what it really knows, respectively, both at the conceptual level, supported by information extraction and machine learning interpretability methods. Data partitioning and sampling techniques are employed to scale out human contributions in handling large data. Through extensive experimentation on scene recognition tasks, we show that our approach provides a rich, descriptive characterization of unknown unknowns and allows for more effective and cost-efficient detection than the state of the art. ...

Improving Reactions to Rejection in Crowdsourcing through Self-Reflection

Conference paper (2021) - Tom Edixhoven, Sihang Qiu, Lucie Kuiper, Olivier Dikken, Gwennan Smitskamp, Ujwal Gadiraju

In popular crowdsourcing marketplaces like Amazon Mechanical Turk, crowd workers complete tasks posted by requesters in return for monetary rewards. Task requesters are solely responsible for deciding whether to accept or reject submitted work. Rejecting work can directly affect the monetary reward of corresponding workers, and indirectly influence worker qualifications and their future work opportunities in the marketplace. Unexpected or unwarranted rejections therefore result in negative emotions and reactions among workers. Despite the high prevalence of rejections in crowdsourcing marketplaces, little research has explored ways to mitigate the negative emotional repercussions of rejections on crowd workers. Addressing this important research gap, we investigate whether introducing self-reflection at different stages after task execution can alleviate the emotional toll of rejection decisions on crowd workers. Our work is inspired by prior studies in psychology that have shown that self-reflection on negative personal experiences can positively affect one's emotions. To this end, we carried out an experimental study investigating the impact of explicit self-reflection on the emotions of rejected crowd workers. Results show that allowing workers to self-reflect on their delivered work, especially before receiving a rejection, has a significantly positive impact on their self-reported emotions in terms of valence and dominance. Our findings reveal that introducing a self-reflection stage before workers receive acceptance or rejection decisions on submitted work can help in positively influencing the emotions of a worker. These findings have important design implications towards fostering a healthier requester-worker relationship and contributing towards the sustainability of the crowdsourcing marketplace. ...

Using Worker Avatars to Improve Microtask Crowdsourcing

Journal article (2021) - Sihang Qiu, Ujwal Gadiraju, Max V. Birk, Alessandro Bozzon

The future of crowd work has been identified to depend on worker satisfaction, but we lack a thorough understanding of how worker satisfaction can be increased in microtask crowdsourcing. Prior work has shown that one solution is to build tasks that are engaging. To facilitate engagement, two methods that have received attention in recent HCI literature are the use of video games and conversational interfaces. While these are largely different techniques, they aim for the same goal of reducing worker burden and increasing engagement in a task. On one hand, video games have huge motivation potential and translating game design elements for motivational purposes has shown positive effects. Recent work in games research has shown that the use of player avatars is effective in fostering interest, enjoyment, and other aspects pertaining to intrinsic motivation. On the other hand, conversational interfaces have been argued to have advantages over traditional GUIs due to facilitating a more human-like interaction. Conversational microtasking has recently been proposed to improve worker engagement in microtask marketplaces. The contexts of games and crowd work are underlined by the need to motivate and engage participants, yet the potential of using worker avatars to promote self-identification and improve worker satisfaction in microtask crowdsourcing has remained unexplored. Addressing this knowledge gap, we carried out a between-subject study involving 360 crowd workers. We investigated how worker avatars influence quality related outcomes of workers and their perceived experience, in conventional web and novel conversational interfaces. We equipped workers with the functionality of customizing their avatars, and selecting characterizations for their avatars, to understand whether identifying with an avatar can increase the motivation of workers. 
We found that using worker avatars with conversational interfaces can effectively reduce cognitive workload and increase worker retention. Our results indicate the occurrence of similarity and wishful avatar identification in crowdsourcing. Our findings have important implications in alleviating workers' perceived workload and on the design of crowdsourcing microtasks. ...

Conversational Crowdsourcing

Doctoral thesis (2021) - S. Qiu, G.J.P.M. Houben, A. Bozzon, U.K. Gadiraju

Crowdsourcing has become a standard approach for the collection of the human input required by scientists and practitioners alike to execute their experiments, or to train, control, and verify the behavior of their intelligent systems. Despite years of successful research and industrial application, how to improve the engagement and satisfaction of crowd workers with crowdsourcing tasks is still an open research question. In this thesis, we introduce conversational crowdsourcing – a novel crowdsourcing interaction paradigm based on conversational interfaces. We study conversational crowdsourcing, and experimentally evaluate its ability to foster workers’ engagement and satisfaction from four perspectives: conversational crowdsourcing design, improving worker engagement and satisfaction, analyzing the roles of worker mood and self-identification, and applying conversational crowdsourcing for conducting online studies. We describe the design of conversational crowdsourcing and show that conversational crowdsourcing can achieve similar output quality and execution time compared to the traditional web-based crowdsourcing. To facilitate our research, we designed and developed TickTalkTurk, a web application that facilitates the design and development of conversational crowdsourcing tasks on popular crowdsourcing platforms. We demonstrate the feasibility of improving worker engagement and satisfaction and show that conversational crowdsourcing can improve worker retention and perceived engagement that are significantly connected to satisfaction. We present a reliable conversational style estimation method and illustrate that style estimation can be a useful tool for facilitating outcome prediction and task assignment. ...

Exploring the Music Perception Skills of Crowd Workers

Journal article (2021) - I.P. Samiotis, S. Qiu, C. Lofi, J. Yang, Ujwal Gadiraju, Alessandro Bozzon

Music content annotation campaigns are common on paid crowdsourcing platforms. Crowd workers are expected to annotate complicated music artefacts, which can demand certain skills and expertise. Traditional methods of participant selection are not designed to capture these kinds of domain-specific skills and expertise, and often domain-specific questions fall under the general demographics category. Despite the popularity of such tasks, there is a general lack of deeper understanding of the distribution of musical properties, especially auditory perception skills, among workers. To address this knowledge gap, we conducted a user study (N = 100) on Prolific. We asked workers to indicate their musical sophistication through a questionnaire and assessed their music perception skills through an audio-based skill test. The goal of this work is to better understand the extent to which crowd workers possess higher perception skills, beyond their own musical education level and self-reported abilities. Our study shows that untrained crowd workers can possess high perception skills on the music elements of melody, tuning, accent and tempo; skills that can be useful in a plethora of annotation tasks in the music domain. ...

Towards Memorable Information Retrieval

Conference paper (2020) - S. Qiu, Ujwal Gadiraju, A. Bozzon

Information overload is a problem many of us can relate to nowadays. The deluge of user-generated content on the Internet and the easy accessibility of a vast amount of data compound the problem of remembering and retaining information that is consumed. To make information consumed more memorable, strategies such as note-taking have been found to be effective by augmenting human memory under specific conditions. This is based on the rationale that humans tend to recall information better if they have produced the information themselves. Previous works in online education have shown that conversational systems can improve learning effects. Although memorization is an important part of learning, the effect of conversation on human memorability remains unexplored. We aim to address this knowledge gap through an experimental study, by investigating human memorability in a classical information retrieval setup. We explore the impact of note-taking affordances and conversational interfaces on the memorability of information consumed by users. Our results show that traditional web search and note-taking have positive effects on knowledge gain, while the search engine with a conversational interface has the potential to augment long-term memorability. This work highlights the benefits of using note-taking and conversational interfaces to aid human memorability. Our findings have important implications for building information retrieval systems that cater to optimizing memorability of information consumed. ...

Microtask crowdsourcing for music score Transcriptions: an experiment with error detection

Conference paper (2020) - I.P. Samiotis, S. Qiu, A. Mauri, C.C.S. Liem, C. Lofi, A. Bozzon

Human annotation is still an essential part of modern transcription workflows for digitizing music scores, either as a standalone approach where a single expert annotator transcribes a complete score, or for supporting an automated Optical Music Recognition (OMR) system. Research on human computation has shown the effectiveness of crowdsourcing for scaling out human work by defining a large number of microtasks which can easily be distributed and executed. However, microtask design for music transcription is a research area that remains unaddressed. This paper focuses on the design of a crowdsourcing task to detect errors in a score transcription which can be deployed in either automated or human-driven transcription workflows. We conduct an experiment where we study two design parameters: 1) the size of the score to be annotated and 2) the modality in which it is presented in the user interface. We analyze the performance and reliability of non-specialised crowdworkers on Amazon Mechanical Turk with respect to these design parameters, differentiated by worker experience and types of transcription errors. Results are encouraging, and pave the way for scalable and efficient crowd-assisted music transcription systems. ...

Detecting, classifying, and mapping retail storefronts using street-level imagery

Conference paper (2020) - Shahin Sharifi Noorian, Sihang Qiu, Achilleas Psyllidis, Alessandro Bozzon, Geert Jan Houben

Up-to-date listings of retail stores and related building functions are challenging and costly to maintain. We introduce a novel method for automatically detecting, geo-locating, and classifying retail stores and related commercial functions, on the basis of storefronts extracted from street-level imagery. Specifically, we present a deep learning approach that takes storefronts from street-level imagery as input, and directly provides the geo-location and type of commercial function as output. Our method showed a recall of 89.05% and a precision of 88.22% on a real-world dataset of street-level images, which experimentally demonstrated that our approach achieves human-level accuracy while having a remarkable run-time efficiency compared to methods such as Faster Region-Convolutional Neural Networks (Faster R-CNN) and Single Shot Detector (SSD). ...

TickTalkTurk

Conversational crowdsourcing made easy

Conference paper (2020) - Sihang Qiu, Ujwal Gadiraju, Alessandro Bozzon

This demo presents TickTalkTurk, a tool that can assist task requesters in quickly deploying crowdsourcing tasks in a customizable conversational worker interface. The conversational worker interface can convey task instructions, deploy microtasks, and gather worker input in a dialogue-based workflow. The interface is implemented as a Web-based application, which makes it compatible with popular crowdsourcing platforms. The tool we developed is demonstrated through two microtask crowdsourcing examples with different task types. Results reveal that our conversational worker interface is capable of better engaging workers and analyzing workers' performance. ...

Analyzing Workers Performance in Online Mapping Tasks Across Web, Mobile, and Virtual Reality Platforms

Conference paper (2020) - G.A. van Alphen, S. Qiu, A. Bozzon, G.J.P.M. Houben

In online crowd mapping, crowd workers recruited through crowdsourcing marketplaces collect geographic data. Compared to traditional mapping methods, where workers physically explore the area, the benefit of using online crowd mapping is the potential to be cost-effective and time-efficient. Previous studies have focused on mapping urban objects using street-level imagery. However, they are specifically aimed at a single type of object, and only through web platforms. To the best of our knowledge, there is still a lack of understanding of how workers perform mapping tasks across different platforms. Aiming to fill this knowledge gap, we investigate worker performance across web, mobile, and virtual reality platforms by designing a multi-platform system for mapping urban objects using street-level imagery with novel methods for geo-location estimation. We design a preliminary study to show the feasibility of executing online mapping tasks on three platforms. The results demonstrate that the type of task and execution platform can affect worker performance in terms of worker accuracy, execution time, user engagement, and cognitive load. ...

Improving Worker Engagement Through Conversational Microtask Crowdsourcing

Conference paper (2020) - Sihang Qiu, U.K. Gadiraju, Alessandro Bozzon

The rise in popularity of conversational agents has enabled humans to interact with machines more naturally. Recent work has shown that crowd workers in microtask marketplaces can complete a variety of human intelligence tasks (HITs) using conversational interfaces with similar output quality compared to the traditional Web interfaces. In this paper, we investigate the effectiveness of using conversational interfaces to improve worker engagement in microtask crowdsourcing. We designed a text-based conversational agent that assists workers in task execution, and tested the performance of workers when interacting with agents having different conversational styles. We conducted a rigorous experimental study on Amazon Mechanical Turk with 800 unique workers, to explore whether the output quality, worker engagement and the perceived cognitive load of workers can be affected by the conversational agent and its conversational styles. Our results show that conversational interfaces can be effective in engaging workers, and a suitable conversational style has potential to improve worker engagement. ...

VirtualCrowd

A Simulation Platform for Microtask Crowdsourcing Campaigns

Conference paper (2020) - Sihang Qiu, Alessandro Bozzon, Geert Jan Houben

This demo presents VirtualCrowd, a simulation platform for crowdsourcing campaigns. The platform allows the design, configuration, step-by-step execution, and analysis of customized tasks, worker profiles, and crowdsourcing strategies. The platform will be demonstrated through a crowd-mapping example in two cities, which will highlight the utility of VirtualCrowd for complex crowdsourcing tasks in real world settings. ...

Estimating Conversational Styles in Conversational Microtask Crowdsourcing

Journal article (2020) - Sihang Qiu, Ujwal Gadiraju, Alessandro Bozzon

Crowdsourcing marketplaces have provided a large number of opportunities for online workers to earn a living. To improve satisfaction and engagement of such workers, who are vital for the sustainability of the marketplaces, recent works have used conversational interfaces to support the execution of a variety of crowdsourcing tasks. The rationale behind using conversational interfaces stems from the potential engagement that conversation can stimulate. Prior works in psychology have also shown that ‘conversational styles’ can play an important role in communication. There are unexplored opportunities to estimate a worker’s conversational style with an end goal of improving worker satisfaction, engagement and quality. Addressing this knowledge gap, we investigate the role of conversational styles in conversational microtask crowdsourcing. To this end, we design a conversational interface which supports task execution, and we propose methods to estimate the conversational style of a worker. Our experimental setup was designed to empirically observe how conversational styles of workers relate to quality-related outcomes. Results show that even a naive supervised classifier can predict the conversational style with high accuracy (80%), and crowd workers with an Involvement conversational style provided a significantly higher output quality, exhibited a higher user engagement and perceived less cognitive task load in comparison to their counterparts. Our findings have important implications for task design with respect to improving worker performance and their engagement in microtask crowdsourcing. ...

Conversational crowdsourcing

Conference paper (2020) - Sihang Qiu, Ujwal Gadiraju, Alessandro Bozzon, Geert Jan Houben

The trend of remote work has led to the prosperity of crowdsourcing marketplaces. In crowdsourcing marketplaces, online workers can select their preferred tasks and then complete them to get paid, while requesters design and publish tasks to acquire the data they need. The standard user interface of the crowdsourcing task is the web page, where users provide answers using HTML-based web elements, and the task-related information (including instructions and questions) is displayed on a single web page. Although the traditional way of presenting tasks is straightforward, it could negatively affect workers’ satisfaction and performance by causing problems such as boredom and fatigue. To address this challenge, we proposed a novel concept, conversational crowdsourcing, which employs conversational interfaces to facilitate crowdsourcing task execution. With conversational crowdsourcing, workers receive task information as messages from a conversational agent, and provide answers by sending messages back to the agent. In this vision paper, we introduce our recent work in terms of using conversational crowdsourcing to improve worker performance and experience by employing novel human-computer interaction affordances. Our findings reveal that conversational crowdsourcing has important implications for improving worker satisfaction and the requester-worker relationship in crowdsourcing marketplaces. ...

Remote Work Aided by Conversational Agents

Conference paper (2020) - S. Qiu, Ujwal Gadiraju, A. Bozzon

Due to the coronavirus pandemic, remote work from home has rapidly become a necessity around the world, drastically changing the potential landscape for the future of work. Over the last couple of decades, microtask crowdsourcing has emerged as a viable means of carrying out remote online work to earn one’s living, an alternative to traditional work for a large number of people. In the aftermath of the pandemic, there is likely to be an increase in people who need to work from home due to a variety of reasons, ranging from safety and well-being to massive layoffs. However, current crowdsourcing platforms and marketplaces are not adequately optimized for worker satisfaction or engagement. There is a need for a new means of interaction that can engage the workers, support their cognitive needs, and cater to their well-being, all without compromising on the quality of work being produced. Drawing inspiration from prior studies which have shown that conversational systems can improve user experiences, we investigate the feasibility of microtask crowdsourcing aided by conversational agents. Findings based on our recent research in conversational microtasking have important implications for improving both the subjective mental condition and the objective output quality of workers. We believe that conversational agents have an important role to play in shaping how remote work can be carried out in the imminent future. ...