An empirical study of the effects of unconfoundedness on the performance of Propensity Score Matching

Abstract

This research analyzes the performance of Propensity Score Matching (PSM), a causal inference method for estimating causal effects. More specifically, it investigates how PSM reacts when the unconfoundedness assumption, one of its core conceptual pillars, is violated. To this end, PSM is first run on synthetic data that satisfies the unconfoundedness condition. These results are then compared with measurements obtained by running the algorithm on data containing confounding features that contribute to other variable values to varying degrees, with these features hidden either individually or in progressively larger numbers. The results are also compared with those of Linear Regression, a generic machine learning algorithm, which serves as a performance baseline. The results indicate that when the hidden variable contributes only to the main effect, only to the treatment effect, or only to the treatment propensity calculation, PSM incurs the same error regardless of which of the three the hidden feature affects, making the three cases equivalent in their error contribution. Additionally, in all experimental scenarios used in this work, PSM performed very similarly to Linear Regression and did not appear to offer any advantage over it in these specific situations.
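The kind of experiment described above can be illustrated with a minimal sketch. It is not the setup used in this work: the data-generating process, the coefficients, the constant effect `TRUE_EFFECT`, and all function names are hypothetical choices made only for illustration. The sketch uses 1-nearest-neighbour matching with replacement on a logistic propensity score, and compares it with ordinary least squares, first with both confounders observed and then with one of them hidden. Because the assumed outcome model is linear, Linear Regression is correctly specified here, which may not hold in other settings.

```python
import numpy as np

TRUE_EFFECT = 2.0  # assumed constant treatment effect (illustrative only)


def simulate(n=2000, seed=0):
    """Hypothetical process: two features drive both treatment and outcome."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 2))
    propensity = 1.0 / (1.0 + np.exp(-(0.8 * X[:, 0] + 0.8 * X[:, 1])))
    T = (rng.random(n) < propensity).astype(float)
    noise = rng.normal(scale=0.5, size=n)
    Y = TRUE_EFFECT * T + 1.5 * X[:, 0] + 1.5 * X[:, 1] + noise
    return X, T, Y


def fit_propensity(X, T, steps=25):
    """Logistic regression fitted with Newton's method; returns P(T=1 | X)."""
    A = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(A.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(A @ w)))
        grad = A.T @ (p - T)
        # Tiny ridge term keeps the Hessian invertible.
        hess = (A * (p * (1.0 - p))[:, None]).T @ A + 1e-8 * np.eye(A.shape[1])
        w -= np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-(A @ w)))


def psm_ate(X, T, Y):
    """Average treatment effect via 1-NN matching on the propensity score."""
    e = fit_propensity(X, T)
    treated = np.flatnonzero(T == 1)
    control = np.flatnonzero(T == 0)
    # Pairwise propensity distances between treated (rows) and control (cols).
    dist = np.abs(e[treated][:, None] - e[control][None, :])
    match_for_treated = control[dist.argmin(axis=1)]
    match_for_control = treated[dist.argmin(axis=0)]
    diffs = np.concatenate([
        Y[treated] - Y[match_for_treated],   # treated minus matched control
        Y[match_for_control] - Y[control],   # matched treated minus control
    ])
    return float(diffs.mean())


def ols_ate(X, T, Y):
    """Coefficient on T from a linear regression of Y on [1, T, X]."""
    A = np.column_stack([np.ones(len(X)), T, X])
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return float(coef[1])


def run_experiment(seed=0):
    X, T, Y = simulate(seed=seed)
    X_hidden = X[:, :1]  # hide the second confounder from both estimators
    return {
        "psm_full": psm_ate(X, T, Y),
        "ols_full": ols_ate(X, T, Y),
        "psm_hidden": psm_ate(X_hidden, T, Y),
        "ols_hidden": ols_ate(X_hidden, T, Y),
    }


if __name__ == "__main__":
    for name, estimate in run_experiment().items():
        print(f"{name}: estimate={estimate:.3f}, error={estimate - TRUE_EFFECT:+.3f}")
```

In this sketch the hidden feature influences both treatment assignment and outcome, so hiding it should bias both estimators in the same direction. Varying which of the three mechanisms a feature contributes to, as the study does, would require extending `simulate` accordingly.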