When the Propensity Model Is Wrong

Informal Benchmarking and a False Sense of Robustness in Causal Sensitivity Analysis

Bachelor Thesis (2026)
Author(s)

R. Vízner (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

J.H. Krijthe – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

M. Havelka – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

A. Anand – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2026
Language
English
Graduation Date
23-06-2026
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Causal effect estimates from observational data rely on the assumption that all confounders (variables that influence both treatment and outcome) are observed. Sensitivity analysis with the Marginal Sensitivity Model (MSM) relaxes this assumption through a parameter Γ that bounds how strongly a hidden confounder may distort an individual’s probability of treatment, but choosing a realistic value for Γ is difficult. A common solution, Informal Benchmarking (IB), estimates Γ by removing observed covariates from the propensity model (the model of treatment probability) and measuring the resulting shift in the fitted propensities. Because IB depends entirely on this model, this paper investigates how IB and the resulting sensitivity bounds behave when the propensity model is misspecified. A controlled simulation study isolates a single functional-form error: a non-linear term that is part of the true treatment mechanism is omitted from the fitted model. Even though the benchmark is computed only on covariates that are individually well specified, the omitted term shrinks every fitted coefficient toward zero, and this leakage deflates the benchmark below the value that a correctly specified model reports. The resulting bounds are falsely robust: they understate the true risk of hidden confounding, which is the more dangerous direction of error. The effect grows with the strength of the omitted term, and standard diagnostics give no warning. A simple safeguard is proposed: refit the propensity model with a richer specification and rerun the benchmark, treating any rise in the estimate as evidence that the original was deflated.
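
To make the benchmarking procedure and the proposed safeguard concrete, the following is a minimal sketch and not the thesis's actual code. It assumes one common variant of IB: for each observed covariate, the propensity model is refitted without that covariate, and the benchmark Γ is a high quantile of the per-unit odds ratio between the full and the reduced fitted propensities. The names `fitted_logit`, `informal_benchmark`, `linear`, and `rich`, the 0.95 quantile, the squared-term "richer" specification, and the simulated data are all illustrative assumptions.

```python
import numpy as np


def fitted_logit(Z, t, iters=25):
    """Fit a logistic regression by Newton's method; return fitted log-odds."""
    D = np.column_stack([np.ones(len(Z)), Z])  # design matrix with intercept
    beta = np.zeros(D.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-D @ beta))
        # Hessian of the negative log-likelihood, with a tiny ridge for stability
        hess = (D * (p * (1.0 - p))[:, None]).T @ D + 1e-8 * np.eye(D.shape[1])
        beta += np.linalg.solve(hess, D.T @ (t - p))
    return D @ beta


def informal_benchmark(X, t, features, q=0.95):
    """Benchmark Gamma per covariate (assumed variant, see lead-in).

    `features` maps a covariate matrix to the propensity model's design
    columns, so dropping covariate j also drops every term derived from it.
    """
    full = fitted_logit(features(X), t)
    gammas = []
    for j in range(X.shape[1]):
        reduced = fitted_logit(features(np.delete(X, j, axis=1)), t)
        # |difference in log-odds| = |log odds ratio| between the two fits
        shift = np.abs(full - reduced)
        gammas.append(float(np.exp(np.quantile(shift, q))))
    return gammas


def linear(X):
    """Original specification: main effects only."""
    return X


def rich(X):
    """Hypothetical richer specification: main effects plus squared terms."""
    return np.column_stack([X, X ** 2])


# Illustrative data (assumed, not from the thesis): the true treatment
# mechanism contains a squared term in x0 that `linear` omits.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
true_logit = 0.8 * X[:, 0] + 1.2 * X[:, 1] + 0.2 * X[:, 2] + 0.8 * (X[:, 0] ** 2 - 1.0)
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))

gamma_linear = informal_benchmark(X, t, linear)
gamma_rich = informal_benchmark(X, t, rich)
for j, (g_lin, g_rich) in enumerate(zip(gamma_linear, gamma_rich)):
    print(f"x{j}: linear spec Gamma = {g_lin:.2f}, richer spec Gamma = {g_rich:.2f}")
```

Following the safeguard described above, a benchmark that rises when moving from `linear` to `rich` would be read as a sign that the original estimate was deflated. Whether and by how much this toy setup shows that pattern depends on the simulated data and the quantile chosen, so the printed values should not be taken as a reproduction of the thesis's results.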
