T.A. Draws
Nudges to Mitigate Confirmation Bias during Web Search on Debated Topics
Support vs. Manipulation
When people use web search engines to find information on debated topics, the search results they encounter can influence opinion formation and practical decision-making with potentially far-reaching consequences for the individual and society. However, current web search engines lack support for information-seeking strategies that enable responsible opinion formation, e.g., by mitigating confirmation bias and motivating engagement with diverse viewpoints. We conducted two preregistered user studies to test the benefits and risks of an intervention aimed at confirmation bias mitigation. In the first study, we tested the effect of warning labels that alert users to the risk of confirmation bias, combined with obfuscations that hide selected search results by default. We observed that obfuscations with warning labels effectively reduce engagement with search results. These initial findings did not allow conclusions about the extent to which the reduced engagement was caused by the warning label (reflective nudging element) versus the obfuscation (automatic nudging element). If obfuscation was the primary cause, this would raise concerns about harming user autonomy. We thus conducted a follow-up study to test the effect of warning labels and obfuscations separately. According to our findings, obfuscations run the risk of manipulating behavior instead of guiding it, while warning labels without obfuscations (purely reflective) do not exhaust processing capacities but encourage users to actively choose to decrease engagement with attitude-confirming search results. Therefore, given the risks and unclear benefits of obfuscations and potentially other automatic nudging elements to guide engagement with information, we call for prioritizing interventions that aim to enhance human cognitive skills and agency instead.
Adverse phenomena such as the search engine manipulation effect (SEME), where web search users change their attitude on a topic following whatever most highly-ranked search results promote, represent crucial challenges for research and industry. However, the current lack of automatic methods to comprehensively measure or increase viewpoint diversity in search results complicates the understanding and mitigation of such effects. This paper proposes a viewpoint bias metric that evaluates the divergence from a pre-defined scenario of ideal viewpoint diversity considering two essential viewpoint dimensions (i.e., stance and logic of evaluation). In a case study, we apply this metric to actual search results and find considerable viewpoint bias in search results across queries, topics, and search engines that could lead to adverse effects such as SEME. We subsequently demonstrate that viewpoint diversity in search results can be dramatically increased using existing diversification algorithms. The methods proposed in this paper can assist researchers and practitioners in evaluating and improving viewpoint diversity in search results.
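As a rough illustration of what "divergence from a pre-defined scenario of ideal viewpoint diversity" can look like, the sketch below scores a ranked list of stance labels by its Jensen-Shannon divergence from a uniform stance distribution. This is not the metric proposed in the paper: the label set, the uniform ideal, the choice of divergence, and the function names are all assumptions made for illustration, and the sketch ignores rank position and the logic-of-evaluation dimension.

```python
import math
from collections import Counter

# Hypothetical stance label set (assumption, not taken from the paper).
STANCES = ["against", "neutral", "in_favor"]

def distribution(labels, categories):
    """Relative frequency of each category among the given labels."""
    if not labels:
        raise ValueError("need at least one labeled search result")
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in categories]

def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = maximally different)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def viewpoint_bias(labels, categories=STANCES):
    """Divergence of the observed stance mix from a uniform 'ideal' mix."""
    observed = distribution(labels, categories)
    ideal = [1 / len(categories)] * len(categories)  # assumed ideal scenario
    return js_divergence(observed, ideal)

balanced = ["against", "neutral", "in_favor"]
one_sided = ["in_favor"] * 10
print(viewpoint_bias(balanced), viewpoint_bias(one_sided))
```

A balanced list scores zero and a one-sided list scores higher; a rank-aware variant would additionally discount results further down the page.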
Disentangling Fairness Perceptions in Algorithmic Decision-Making
The Effects of Explanations, Human Oversight, and Contestability
Recent research claims that information cues and system attributes of algorithmic decision-making processes affect decision subjects' fairness perceptions. However, little is known about how these factors interact. This paper presents a user study (N = 267) investigating the individual and combined effects of explanations, human oversight, and contestability on informational and procedural fairness perceptions for high- and low-stakes decisions in a loan approval scenario. We find that explanations and contestability contribute to informational and procedural fairness perceptions, respectively, but we find no evidence for an effect of human oversight. Our results further show that both informational and procedural fairness perceptions contribute positively to overall fairness perceptions, but we do not find an interaction effect between them. A qualitative analysis exposes tensions between information overload and understanding, human involvement and timely decision-making, and accounting for personal circumstances while maintaining procedural consistency. Our results have important design implications for algorithmic decision-making processes that meet decision subjects' standards of justice.
Featured snippets that attempt to satisfy users' information needs directly on top of the first search engine results page (SERP) have been shown to strongly impact users' post-search attitudes and beliefs. In the context of debated but scientifically answerable topics, recent research has demonstrated that users tend to trust featured snippets to such an extent that they may reverse their original beliefs based on what such a snippet suggests, even when erroneous information is featured. This paper examines the effect of featured snippets in more nuanced and complicated search scenarios concerning debated topics that have no ground truth and where diverse arguments in favor and against can legitimately be made. We report on a preregistered, online user study (N = 182) investigating how the stances and logics of evaluation (i.e., underlying reasons behind stances) expressed in featured snippets influence post-task attitudes and explanations of users without strong pre-search attitudes. We found that such users tend to not only change their attitudes on debated topics (e.g., school uniforms) following whatever stance a featured snippet expresses but also incorporate the featured snippet's logic of evaluation into their argumentation. Our findings imply that the content displayed in featured snippets may have large-scale undesired consequences for individuals, businesses, and society, and urgently call for researchers and practitioners to examine this issue further.
Social choice aggregation strategies have been proposed as an explainable way to generate recommendations to groups of users. However, it is not trivial to determine the best strategy to apply for a specific group. Previous work highlighted that the performance of a group recommender system is affected by the internal diversity of the group members’ preferences. However, few studies have empirically evaluated how the specific distribution of preferences in a group determines which strategy is the most effective. Furthermore, only a few studies evaluated the impact of providing explanations for the recommendations generated with social choice aggregation strategies, by evaluating explanations and aggregation strategies in a coupled way. To fill these gaps, we present two user studies (N=399 and N=288) examining the effectiveness of social choice aggregation strategies in terms of users’ fairness perception, consensus perception, and satisfaction. We study the impact of the level of (dis-)agreement within the group on the performance of these strategies. Furthermore, we investigate the added value of textual explanations of the underlying social choice aggregation strategy used to generate the recommendation. The results of both user studies show no benefits in using social choice-based explanations for group recommendations. However, we find significant differences in the effectiveness of the social choice-based aggregation strategies in both studies. Furthermore, the specific group configuration (i.e., various scenarios of internal diversity) seems to determine the most effective aggregation strategy. These results provide useful insights on how to select the appropriate aggregation strategy for a specific group based on the level of (dis-)agreement within the group members’ preferences.
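To make the idea of a social choice aggregation strategy concrete, here is a minimal sketch of three strategies commonly discussed in the group recommendation literature (additive, least misery, most pleasure). The abstract does not say which strategies the studies compared, so the selection, the function names, and the example ratings are assumptions for illustration only.

```python
def additive(ratings):
    """Additive strategy: an item's group score is the sum of member ratings."""
    return sum(ratings)

def least_misery(ratings):
    """Least misery: an item's group score is the lowest member rating."""
    return min(ratings)

def most_pleasure(ratings):
    """Most pleasure: an item's group score is the highest member rating."""
    return max(ratings)

def recommend(group_ratings, strategy):
    """Return the item with the highest group score under the given strategy.

    group_ratings maps each item to the list of ratings given by group members.
    Ties are broken by dictionary insertion order.
    """
    return max(group_ratings, key=lambda item: strategy(group_ratings[item]))

# Hypothetical group of three members rating three items on a 1-10 scale.
ratings = {
    "A": [10, 10, 1],  # two members love it, one strongly dislikes it
    "B": [6, 6, 6],    # everyone finds it acceptable
    "C": [9, 2, 9],
}

for strategy in (additive, least_misery, most_pleasure):
    print(strategy.__name__, recommend(ratings, strategy))
```

With these ratings the strategies disagree: the additive strategy favors the item with the highest total, while least misery favors the item nobody dislikes. This is the kind of dependence on within-group (dis-)agreement that the studies examine.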
One way to help users navigate debated topics online is to apply stance detection in web search. Automatically identifying whether search results are against, neutral, or in favor could facilitate diversification efforts and support interventions that aim to mitigate cognitive biases. To be truly useful in this context, however, stance detection models not only need to make accurate (cross-topic) predictions but also be sufficiently explainable to users when applied to search results - an issue that is currently unclear. This paper presents a study into the feasibility of using current stance detection approaches to assist users in their web search on debated topics. We train and evaluate 10 stance detection models using a stance-annotated data set of 1204 search results. In a preregistered user study (N = 291), we then investigate the quality of stance detection explanations created using different explainability methods and explanation visualization techniques. The models we implement predict stances of search results across topics with satisfying quality (i.e., similar to the state-of-the-art for other data types). However, our results reveal stark differences in explanation quality (i.e., as measured by users' ability to simulate model predictions and their attitudes towards the explanations) between different models and explainability methods. A qualitative analysis of textual user feedback further reveals potential application areas, user concerns, and improvement suggestions for such explanations. Our findings have important implications for the development of user-centered solutions surrounding web search on debated topics.
Due to the increasing amount of information shared online every day, the need for sound and reliable ways of distinguishing between trustworthy and non-trustworthy information is as present as ever. One technique for performing fact-checking at scale is to employ human intelligence in the form of crowd workers. Although earlier work has suggested that crowd workers can reliably identify misinformation, cognitive biases of crowd workers may reduce the quality of truthfulness judgments in this context. We performed a systematic exploratory analysis of publicly available crowdsourced data to identify a set of potential systematic biases that may occur when crowd workers perform fact-checking tasks. Following this exploratory study, we collected a novel data set of crowdsourced truthfulness judgments to validate our hypotheses. Our findings suggest that workers generally overestimate the truthfulness of statements and that different individual characteristics (i.e., their belief in science) and cognitive biases (i.e., the affect heuristic and overconfidence) can affect their annotations. Interestingly, we find that, depending on the general judgment tendencies of workers, their biases may sometimes lead to more accurate judgments.
Combine Statistical Thinking With Open Scientific Practice
A Protocol of a Bayesian Research Project
Current developments in the statistics community suggest that modern statistics education should be structured holistically, that is, by allowing students to work with real data and to answer concrete statistical questions, but also by educating them about alternative frameworks, such as Bayesian inference. In this article, we describe how we incorporated such a holistic structure in a Bayesian research project on ordered binomial probabilities. The project was conducted with a group of three undergraduate psychology students who had basic knowledge of Bayesian statistics and programming, but lacked formal mathematical training. The research project aimed to (1) convey the basic mathematical concepts of Bayesian inference; (2) have students experience the entire empirical cycle including collection, analysis, and interpretation of data and (3) teach students open science practices.
Research in the area of human information interaction (HII) typically represents viewpoints on debated topics in a binary fashion, as either against or in favor of a given topic (e.g., the feminist movement). This simple taxonomy, however, greatly reduces the latent richness of viewpoints and thereby limits the potential of research and practical applications in this field. Work in the communication sciences has already demonstrated that viewpoints can be represented in much more comprehensive ways, which could enable a deeper understanding of users' interactions with debated topics online. For instance, a viewpoint's stance usually has a degree of strength (e.g., mild or strong), and, even if two viewpoints support or oppose something to the same degree, they may use different logics of evaluation (i.e., underlying reasons). In this paper, we draw from communication science practice to propose a novel, two-dimensional way of representing viewpoints that incorporates a viewpoint's stance degree as well as its logic of evaluation. We show in a case study of tweets on debated topics how our proposed viewpoint label can be obtained via crowdsourcing with acceptable reliability. By analyzing the resulting data set and conducting a user study, we further show that the two-dimensional viewpoint representation we propose allows for more meaningful analyses and diversification interventions compared to current approaches. Finally, we discuss what this novel viewpoint label implies for HII research and how obtaining it may be made cheaper in the future.
The relation between religiosity and well-being is one of the most researched topics in the psychology of religion, yet the directionality and robustness of the effect remain debated. Here, we adopted a many-analysts approach to assess the robustness of this relation based on a new cross-cultural dataset ((Formula presented.) participants from 24 countries). We recruited 120 analysis teams to investigate (1) whether religious people self-report higher well-being, and (2) whether the relation between religiosity and self-reported well-being depends on perceived cultural norms of religion (i.e., whether it is considered normal and desirable to be religious in a given country). In a two-stage procedure, the teams first created an analysis plan and then executed their planned analysis on the data. For the first research question, all but 3 teams reported positive effect sizes with credible/confidence intervals excluding zero (median reported (Formula presented.)). For the second research question, this was the case for 65% of the teams (median reported (Formula presented.)). While most teams applied (multilevel) linear regression models, there was considerable variability in the choice of items used to construct the independent variables, the dependent variable, and the included covariates.
Systems aiming to aid consumers in their decision-making (e.g., by implementing persuasive techniques) are more likely to be effective when consumers trust them. However, recent research has demonstrated that the machine learning algorithms that often underlie such technology can act unfairly towards specific groups (e.g., by making more favorable predictions for men than for women). An undesired disparate impact resulting from this kind of algorithmic unfairness could diminish consumer trust and thereby undermine the purpose of the system. We studied this effect by conducting a between-subjects user study investigating how (gender-related) disparate impact affected consumer trust in an app designed to improve consumers’ financial decision-making. Our results show that disparate impact decreased consumers’ trust in the system and made them less likely to use it. Moreover, we find that trust was affected to the same degree across consumer groups (i.e., advantaged and disadvantaged users) despite both of these consumer groups recognizing their respective levels of personal benefit. Our findings highlight the importance of fairness in consumer-oriented artificial intelligence systems.
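One common way to quantify disparate impact is the ratio of favorable-outcome rates between two groups. The sketch below computes that ratio; it is not the operationalization used in the study, which the abstract does not specify, and the group names and example decisions are made up for illustration.

```python
from collections import defaultdict

def favorable_rates(decisions):
    """Share of favorable outcomes per group.

    decisions is an iterable of (group, is_favorable) pairs.
    """
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, is_favorable in decisions:
        totals[group] += 1
        if is_favorable:
            favorable[group] += 1
    return {group: favorable[group] / totals[group] for group in totals}

def disparate_impact_ratio(decisions, disadvantaged, advantaged):
    """Favorable rate of the disadvantaged group divided by that of the advantaged group."""
    rates = favorable_rates(decisions)
    if rates[advantaged] == 0:
        raise ValueError("advantaged group has no favorable outcomes")
    return rates[disadvantaged] / rates[advantaged]

# Hypothetical predictions: 2 of 4 favorable for women, 3 of 4 for men.
decisions = [
    ("women", True), ("women", False), ("women", True), ("women", False),
    ("men", True), ("men", True), ("men", True), ("men", False),
]

ratio = disparate_impact_ratio(decisions, disadvantaged="women", advantaged="men")
print(round(ratio, 3))
```

A ratio of 1 means both groups receive favorable predictions at the same rate; values well below 1 indicate that the system favors the advantaged group.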
Helping users discover perspectives
Enhancing opinion mining with joint topic models
Transparency Paths
Documenting the Diversity of User Perceptions
We are living in an era of global digital platforms, eco-systems of algorithmic processes that serve users worldwide. However, the increasing exposure to diversity online - of information and users - has led to important considerations of bias. A given platform, such as the Google search engine, may demonstrate behaviors that deviate from what users expect, or what they consider fair, relative to their own context and experiences. In this exploratory work, we put forward the notion of transparency paths, a process by which we document our position, choices, and perceptions when developing and/or using algorithmic platforms. We conducted a self-reflection exercise with seven researchers, who collected and analyzed two sets of images; one depicting an everyday activity, "washing hands," and a second depicting the concept of "home." Participants had to document their process and choices, and in the end, compare their work to others. Finally, participants were asked to reflect on the definitions of bias and diversity. The exercise revealed the range of perspectives and approaches taken, underscoring the need for future work that will refine the transparency paths methodology.
This Item Might Reinforce Your Opinion
Obfuscation and Labeling of Search Results to Mitigate Confirmation Bias
During online information search, users tend to select search results that confirm previous beliefs and ignore competing possibilities. This systematic pattern in human behavior is known as confirmation bias. In this paper, we study the effect of obfuscation (i.e., hiding the result unless the user clicks on it) with warning labels and the effect of task on interaction with attitude-confirming search results. We conducted a preregistered, between-subjects crowdsourced user study (N=328) comparing six groups: Three levels of obfuscation (targeted, random, none) and two levels of task (joint, two separate) for four debated topics. We found that both types of obfuscation influence user interactions, and in particular that targeted obfuscation helps decrease interaction with attitude-confirming search results. Future work is needed to understand how much of the observed effect is due to the strong influence of obfuscation, versus the warning label or the task design. We discuss design guidelines concerning system goals such as decreasing consumption of attitude-confirming search results, versus nudging users toward a more analytical mode of information processing. We also discuss implications for future work, such as the effects of interventions for confirmation bias mitigation over repeated exposure. We conclude with a strong word of caution: measures such as obfuscations should only be used for the benefit of the user, e.g., when they explicitly consent to mitigating their own biases.
In web search on debated topics, algorithmic and cognitive biases strongly influence how users consume and process information. Recent research has shown that this can lead to a search engine manipulation effect (SEME): when search result rankings are biased towards a particular viewpoint, users tend to adopt this favored viewpoint. To better understand the mechanisms underlying SEME, we present a pre-registered, 5 x 3 factorial user study investigating whether order effects (i.e., users adopting the viewpoint pertaining to higher-ranked documents) can cause SEME. For five different debated topics, we evaluated attitude change after exposing participants with mild pre-existing attitudes to search results that were overall viewpoint-balanced but reflected one of three levels of algorithmic ranking bias. We found that attitude change did not differ across levels of ranking bias and did not vary based on individual user differences. Our results thus suggest that order effects may not be an underlying mechanism of SEME. Exploratory analyses lend support to the presence of exposure effects (i.e., users adopting the majority viewpoint among the results they examine) as a contributing factor to users' attitude change. We discuss how our findings can inform the design of user bias mitigation strategies.