Expert forecasting with and without uncertainty quantification and weighting: What do the data say?

Journal Article (2020)
Author(s)

Roger M. Cooke (Resources for the Future, TU Delft - Applied Probability)

Deniz Marti (The George Washington University)

Thomas Mazzuchi (The George Washington University)

Research Group
Applied Probability
Copyright
© 2020 R.M. Cooke, Deniz Marti, Thomas Mazzuchi
DOI
https://doi.org/10.1016/j.ijforecast.2020.06.007
Publication Year
2020
Language
English
Issue number
1
Volume number
37
Pages (from-to)
378-387
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward, or distribute the text or any part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Post-2006 expert judgment data have been extended to 530 experts assessing 580 calibration variables from their fields. New analysis shows that point predictions taken as medians of combined expert distributions outperform combinations of the experts' medians, and that medians of performance-weighted combinations outperform medians of equally weighted combinations. Relative to the equal-weight combination of medians, the medians of performance-weighted combinations yield a 65% improvement, and the medians of equally weighted combinations a 46% improvement. The Random Expert Hypothesis underlying all performance-blind combination schemes, namely that differences in expert performance reflect random stressors rather than persistent properties of the experts, is tested by randomly scrambling expert panels. Distributions are generated for a full set of performance metrics, and the hypotheses that the original panels' performance measures are drawn from the distributions produced by random scrambling are rejected at significance levels ranging from 10⁻⁶ to 10⁻¹². Random stressors cannot produce the variations in performance seen in the original panels. In-sample and out-of-sample validation results are updated.
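
The scrambling test described above can be made concrete with a short sketch. The Python below is illustrative only: it assumes experts assess each calibration variable with 5%/50%/95% quantiles (as in Cooke's classical model), uses the classical-model-style chi-square calibration score, and tests a single simplified spread statistic (the log-ratio of best to worst calibration score in a panel) rather than the paper's full set of performance metrics. All function names and the synthetic data are hypothetical.

    import numpy as np
    from scipy.stats import chi2

    THEORY = np.array([0.05, 0.45, 0.45, 0.05])  # probability mass between the quantiles

    def calibration_score(quantiles, realizations):
        """Calibration p-value: 2N * KL(empirical; theoretical bin mass)
        is asymptotically chi-square with 3 degrees of freedom.

        quantiles:    (n_items, 3) array of one expert's 5/50/95% assessments
        realizations: (n_items,) observed values of the calibration variables
        """
        # Bin index 0..3 = number of quantiles at or below the realization.
        bin_idx = (realizations[:, None] >= quantiles).sum(axis=1)
        counts = np.bincount(bin_idx, minlength=4)
        n = counts.sum()
        s = counts / n
        with np.errstate(divide="ignore", invalid="ignore"):
            kl = np.where(s > 0, s * np.log(s / THEORY), 0.0).sum()
        return chi2.sf(2 * n * kl, df=3)

    def scramble(panel, rng):
        """Synthetic panel: per item, randomly permute which expert gave which
        assessment. Under the Random Expert Hypothesis this should leave the
        spread of performance scores unchanged."""
        out = np.empty_like(panel)
        n_experts = panel.shape[0]
        for j in range(panel.shape[1]):
            out[:, j, :] = panel[rng.permutation(n_experts), j, :]
        return out

    def score_spread(panel, realizations):
        """Illustrative spread measure: log-ratio of the best to the worst
        calibration score within a panel (not the paper's exact metric)."""
        scores = np.array([calibration_score(e, realizations) for e in panel])
        return np.log(scores.max() / max(scores.min(), 1e-300))

    def reh_pvalue(panel, realizations, statistic, n_sim=1000, seed=0):
        """One-sided p-value of the observed spread statistic against its
        distribution over randomly scrambled panels."""
        rng = np.random.default_rng(seed)
        obs = statistic(panel, realizations)
        sims = np.array([statistic(scramble(panel, rng), realizations)
                         for _ in range(n_sim)])
        return (sims >= obs).mean()

    # Hypothetical panel: 5 experts x 10 items x 3 quantiles, plus realizations.
    rng = np.random.default_rng(1)
    panel = np.sort(rng.normal(0.0, 1.0, size=(5, 10, 3)), axis=2)
    realizations = rng.normal(0.0, 1.0, size=10)
    print(reh_pvalue(panel, realizations, score_spread))

A small p-value would indicate that the between-expert variation in performance in the real panel is larger than random reassignment of assessments can produce, which is the sense in which the paper rejects the Random Expert Hypothesis.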