Crowdsourcing Hypothesis Tests

Making transparent how design choices shape research results

Journal Article (2020)
Author(s)

Justin F. Landy (Nova Southeastern University)

Miaolei (Liam) Jia (University of Warwick, Warwick Business School)

Isabel L. Ding (National University of Singapore)

Domenico Viganola (George Mason University)

Warren Tierney (Mary Immaculate College, Kemmy Business School)

Anna Dreber (Stockholm School of Economics, University of Innsbruck)

Magnus Johannesson (Stockholm School of Economics)

Thomas Pfeiffer (Massey University)

Charles R. Ebersole (University of Kassel, University of Virginia)

Quentin F. Gronau (Universiteit van Amsterdam)

Alexander Ly (Universiteit van Amsterdam)

Don van den Bergh (Max Planck Institute for Human Development, Universiteit van Amsterdam)

Maarten Marsman (University of Roehampton, Universiteit van Amsterdam)

Koen Derks (Nyenrode Business Universiteit, University of Roehampton)

Eric-Jan Wagenmakers (Nova Southeastern University, Pacific Lutheran University)

Andrew Proctor (Universität Bremen, Stockholm School of Economics)

Daniel M. Bartels (University of Chicago)

Christopher W. Bauman (University of California)

William J. Brady (New York University)

Felix Cheung (The University of Hong Kong)

Andrei Cimpian (Universität zu Köln, New York University)

Simone Dohle (Michigan State University, Universität zu Köln)

M. Brent Donnellan (Michigan State University, Universitetet i Oslo)

Adam Hahn (University of Michigan, Universität zu Köln)

Michael P. Hall (University of Michigan)

William Jiménez-Leal (Universidad de los Andes, University of Andes)

David J. Johnson (University of Maryland, Department of Sociology)

Richard E. Lucas (Harz University of Applied Sciences and University of Bamberg, Michigan State University)

Benoît Monin (Stanford University)

Andres Montealegre (Universidad de los Andes, University of Kassel)

Elizabeth Mullen (San José State University)

Jun Pang (Renmin Business School, Renmin University of China)

Jennifer Ray (New York University, Renmin Business School)

Diego A. Reinero (Western Kentucky University, New York University)

Jesse Reynolds (Stanford University, Abilene Christian University)

Walter Sowden (University of Michigan)

Daniel Storage (University of Denver)

Runkun Su (Duke University, NUS Business School)

Christina M. Tworek (HarrisX)

Jay J. Van Bavel (New York University, Masaryk University)

Daniel Walco (New York Yankees)

Julian Wills (National University of Singapore, New York University)

Xiaobing Xu (Hainan University, Tsinghua University)

Kai Chi Yam (Rotman School of Management, Catholic University of the Sacred Heart)

Xiaoyu Yang (Rotman School of Management, Tsinghua University)

Yen Ping Chang (Institute of Sociology)

Tina S.T. Huang (Instituto Universitário de Lisboa (ISCTE-IUL))

Samuel G.B. Johnson (University of Bath)

Oscar Oviedo-Trespalacios (Queensland University of Technology)

Yoo Jin Lee (INSEAD)

Affiliation
External organisation
DOI related publication
https://doi.org/10.1037/bul0000220 Final published version
Publication Year
2020
Language
English
Journal title
Psychological Bulletin
Issue number
5
Volume number
146
Pages (from-to)
451-479

Abstract

To what extent are research results influenced by subjective decisions that scientists make as they design studies? Fifteen research teams independently designed studies to answer five original research questions related to moral judgments, negotiations, and implicit cognition. Participants from 2 separate large samples (total N > 15,000) were then randomly assigned to complete 1 version of each study. Effect sizes varied dramatically across different sets of materials designed to test the same hypothesis: Materials from different teams rendered statistically significant effects in opposite directions for 4 of 5 hypotheses, with the narrowest range in estimates being d = -0.37 to +0.26. Meta-analysis and a Bayesian perspective on the results revealed overall support for 2 hypotheses and a lack of support for 3 hypotheses. Overall, practically none of the variability in effect sizes was attributable to the skill of the research team in designing materials, whereas considerable variability was attributable to the hypothesis being tested. In a forecasting survey, predictions of other scientists were significantly correlated with study results, both across and within hypotheses. Crowdsourced testing of research hypotheses helps reveal the true consistency of empirical support for a scientific claim.