An empirical study of the effects of unconfoundedness on the performance of Propensity Score Matching

Bachelor Thesis (2022)
Author(s)

A. Erdelský (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

J.H. Krijthe – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

S.R. Bongers – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Rafael Bidarra – Graduation committee member (TU Delft - Computer Graphics and Visualisation)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Andrej Erdelský
Publication Year
2022
Language
English
Graduation Date
23-06-2022
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This research analyzes the performance of Propensity Score Matching (PSM), a causal inference method for estimating causal effects. More specifically, it investigates how PSM reacts when the unconfoundedness assumption, one of its core conceptual pillars, is violated. To this end, PSM was first run on synthetic data that satisfies the unconfoundedness condition. These results were then compared with measurements obtained by running the algorithm on data with confounding features whose contribution to the other variables varies, with these features hidden either individually or in progressively larger numbers. The results were also compared with those of Linear Regression, a generic machine learning algorithm, as a performance baseline. The results indicate that, when the hidden variable contributes only to the main effect, only to the treatment effect, or only to the treatment propensity, PSM incurs the same error regardless of which of the three the hidden feature affects, making the three equivalent in their error contribution. Additionally, in all experimental scenarios used in this work, PSM performed very similarly to Linear Regression and did not seem to offer any advantages over it in these specific situations.
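The kind of experiment the abstract describes can be illustrated with a minimal sketch. It is not the thesis's actual setup: the data-generating process, the coefficients, the sample size, the single confounder `x`, the unrelated covariate `z`, and all function names are assumptions chosen for illustration. The sketch estimates a known treatment effect with nearest-neighbour PSM and with linear regression, once with the confounder observed and once with it hidden.

```python
import bisect
import math
import random

TRUE_EFFECT = 1.0  # assumed constant treatment effect (illustrative value)


def simulate(n, seed):
    """Toy data: x confounds treatment and outcome, z is unrelated noise."""
    rng = random.Random(seed)
    xs, zs, ts, ys = [], [], [], []
    for _ in range(n):
        x, z = rng.gauss(0, 1), rng.gauss(0, 1)
        t = 1 if rng.random() < 1 / (1 + math.exp(-x)) else 0  # propensity depends on x
        y = 2.0 * x + TRUE_EFFECT * t + rng.gauss(0, 0.5)      # outcome depends on x and t
        xs.append(x); zs.append(z); ts.append(t); ys.append(y)
    return xs, zs, ts, ys


def fit_propensity(cov, ts, steps=200, lr=0.5):
    """One-feature logistic regression fitted by plain gradient descent."""
    w = b = 0.0
    n = len(cov)
    for _ in range(steps):
        gw = gb = 0.0
        for c, t in zip(cov, ts):
            err = 1 / (1 + math.exp(-(w * c + b))) - t
            gw += err * c
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return [1 / (1 + math.exp(-(w * c + b))) for c in cov]


def psm_effect(scores, ts, ys):
    """Match each treated unit to the control with the nearest score (with replacement)."""
    controls = sorted((s, y) for s, t, y in zip(scores, ts, ys) if t == 0)
    keys = [s for s, _ in controls]
    diffs = []
    for s, t, y in zip(scores, ts, ys):
        if t == 1:
            i = bisect.bisect_left(keys, s)
            candidates = [j for j in (i - 1, i) if 0 <= j < len(keys)]
            j = min(candidates, key=lambda k: abs(keys[k] - s))
            diffs.append(y - controls[j][1])
    return sum(diffs) / len(diffs)


def ols_effect(cov, ts, ys):
    """Coefficient on t in the regression of y on (1, t, cov), via residualisation."""
    def residuals(v):
        mc, mv = sum(cov) / len(cov), sum(v) / len(v)
        beta = (sum((c - mc) * (a - mv) for c, a in zip(cov, v))
                / sum((c - mc) ** 2 for c in cov))
        return [(a - mv) - beta * (c - mc) for c, a in zip(cov, v)]
    rt, ry = residuals(ts), residuals(ys)
    return sum(a * b for a, b in zip(rt, ry)) / sum(a * a for a in rt)


def run_experiment(n=2000, seed=0):
    xs, zs, ts, ys = simulate(n, seed)
    return {
        "psm_observed": psm_effect(fit_propensity(xs, ts), ts, ys),
        "ols_observed": ols_effect(xs, ts, ys),
        # Confounder hidden: both methods only see the unrelated covariate z.
        "psm_hidden": psm_effect(fit_propensity(zs, ts), ts, ys),
        "ols_hidden": ols_effect(zs, ts, ys),
    }


if __name__ == "__main__":
    for name, estimate in run_experiment().items():
        print(f"{name}: estimate={estimate:.2f}, error={estimate - TRUE_EFFECT:+.2f}")
```

In this toy setting both estimators should land near the assumed effect while `x` is observed and drift away from it by a similar amount once `x` is hidden. That mirrors the abstract's qualitative finding, but it says nothing about the magnitudes reported in the thesis.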
