Estimating the effect of an intervention on an outcome is a central challenge across science and society. In medicine, we may ask whether a drug effectively treats a disease; in economics, whether a new policy reduces unemployment. Estimating such effects from data, a process known as causal inference, is essential but inherently difficult, because it often relies on untestable assumptions to ensure unbiased identification of treatment effects. A key example of such an untestable assumption is the absence of unmeasured confounding, meaning that no hidden variable influences both the treatment and the outcome. Because this assumption cannot be directly verified, treatment effect estimates may be biased when it fails. This can ultimately lead to untrustworthy conclusions and, in the worst case, unsafe decisions, such as prescribing the wrong drug to a patient. The central question of this dissertation is therefore whether we can develop methods for safer causal inference that either detect violations of its underlying assumptions or remain robust when those assumptions are violated.
In Part One, we address the first aspect: detecting violations of causal identification assumptions. We focus on settings with data from multiple sources, such as hospitals or locations, where distributional shifts naturally occur. Under specific independence conditions on the causal mechanisms driving these shifts, we first present a nonparametric test to falsify the assumption of no unmeasured confounding. To obtain these results, we introduce a novel technique based on hierarchical causal graphical models. We then improve the statistical efficiency of this test by reformulating the independence condition using parameterized linear models. Finally, we extend the hierarchical modeling approach to other identification settings, testing the validity of the mediators and instrumental variables used in two additional common identification strategies.
In Parts Two and Three, we develop methods that are instead robust when causal identification assumptions are violated. We revisit two common problem settings in causal inference and demonstrate that it is possible to develop methods that either remove the need for the traditional assumptions or rely on weaker, more plausible ones. In the first setting, we study the problem of augmenting randomized trials with external data to improve efficiency in treatment effect estimation. Such approaches typically rely on a transportability assumption that relates the populations underlying the trial and the external data. When this assumption is violated, however, integrating external data can introduce substantial bias. To address this, we propose a novel and efficient estimator that incorporates external data, and we show that it improves inference on the average treatment effect while guaranteeing that it never performs worse, and sometimes performs better, than the estimator that relies solely on trial data. We further adapt this estimator to learn heterogeneous treatment effects within the trial population and show that similar safety guarantees hold for this problem.
In the second setting, we examine the evaluation of treatment allocation strategies using Qini curves. Standard methods for estimating Qini curves assume no interference between treated units, meaning that the treatment of one unit does not affect others. When interference is present, however, the resulting Qini curves can be misleading, yielding incorrect evaluations of treatment allocation strategies. We therefore propose multiple estimators that handle interference, specifically in settings where units within a cluster may affect one another but not units in other clusters. We identify a bias-variance trade-off among these estimators and, through both theoretical and empirical results, provide practical guidance on how practitioners can choose among them.

The dissertation concludes with a discussion of broader considerations, limitations of the presented research, and potential directions for future work. We find that it is indeed possible to make causal inference safer by detecting assumption violations and by reducing reliance on untestable assumptions. Nonetheless, many open and important questions remain, offering promising avenues for further research on this topic.