Causal inference, and in particular the estimation of Conditional Average Treatment Effects (CATE), is necessary for understanding the impact of interventions beyond simple prediction. This study analyzes how two key hyperparameters, maximum tree depth and minimum leaf size, influence the accuracy and generalization of CATE estimates from honest and adaptive causal trees. It examines how these hyperparameters affect the bias-variance trade-off and the model's tendency to overfit or underfit across several simulated data scenarios and one real-world dataset.
The results show that the optimal hyperparameter configuration depends on data characteristics such as dimensionality, noise level, and the complexity of the true causal effects. Honest causal trees perform better in high-dimensional and noisy environments because they control variance effectively. Conversely, in simpler, low-noise settings, or when the CATE structure is complex, adaptive causal trees or baseline models frequently achieve better results by reducing bias. The study also highlights the difficulty of working with moderately sized datasets, where the sample splitting required for honest estimation leaves fewer observations for each step and can lead to higher estimation error. This work offers guidance on hyperparameter selection and emphasizes that tuning must account for the underlying characteristics of the data to obtain the best possible CATE estimates.
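
To make the honest versus adaptive distinction concrete, the following is a minimal sketch and not the study's implementation. It assumes a single feature, a randomized binary treatment, a tree depth fixed at one (a single split), and a simplified splitting score (the squared gap between the two leaf effects) in place of a full causal tree criterion. All names (`leaf_effect`, `best_split`, `fit_stump`, `min_leaf`) and the simulated data are hypothetical.

```python
import numpy as np

def leaf_effect(y, w):
    """Difference in mean outcome between treated (w == 1) and control (w == 0) units."""
    return y[w == 1].mean() - y[w == 0].mean()

def enough_units(w, min_leaf):
    """Minimum leaf size check: at least min_leaf treated and min_leaf control units."""
    return min((w == 1).sum(), (w == 0).sum()) >= min_leaf

def best_split(x, y, w, min_leaf):
    """Threshold maximizing the squared gap between leaf effects (simplified criterion)."""
    best_t, best_score = None, -np.inf
    for t in np.unique(x)[:-1]:
        left = x <= t
        if not (enough_units(w[left], min_leaf) and enough_units(w[~left], min_leaf)):
            continue
        gap = leaf_effect(y[left], w[left]) - leaf_effect(y[~left], w[~left])
        if gap ** 2 > best_score:
            best_t, best_score = t, gap ** 2
    if best_t is None:
        raise ValueError("no split satisfies min_leaf")
    return best_t

def fit_stump(x, y, w, min_leaf, honest, rng):
    """Depth-one causal tree. Honest: one half chooses the split, the other half
    estimates leaf effects. Adaptive: the same data is used for both steps."""
    idx = rng.permutation(len(x))
    if honest:
        half = len(x) // 2
        split_idx, est_idx = idx[:half], idx[half:]
    else:
        split_idx = est_idx = idx
    t = best_split(x[split_idx], y[split_idx], w[split_idx], min_leaf)
    xe, ye, we = x[est_idx], y[est_idx], w[est_idx]
    left = xe <= t
    return {
        "threshold": float(t),
        "left": float(leaf_effect(ye[left], we[left])),
        "right": float(leaf_effect(ye[~left], we[~left])),
    }

# Hypothetical simulated data: the true effect is 2 when x > 0 and 0 otherwise.
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-1.0, 1.0, n)
w = rng.integers(0, 2, n)
tau = np.where(x > 0, 2.0, 0.0)
y = x + tau * w + rng.normal(0.0, 1.0, n)

adaptive = fit_stump(x, y, w, min_leaf=50, honest=False, rng=rng)
honest = fit_stump(x, y, w, min_leaf=50, honest=True, rng=rng)
print("adaptive:", adaptive)
print("honest:  ", honest)
```

In this sketch the honest variant chooses its split and estimates each leaf effect on only half of the observations, which illustrates why sample splitting becomes costly on moderately sized datasets. Raising `min_leaf` lowers the variance of each leaf estimate at the price of coarser, potentially more biased leaves.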