Justin Young

2024–25 Dissertation Fellowship
Recent developments in causal machine learning have made it easier to flexibly estimate how confounders relate to both treatments and outcomes, making the unconfoundedness assumptions used in causal analysis more palatable. How successful are these approaches at recovering ground-truth baselines? In this paper, we analyze a new data sample that combines an experimental rollout of a new feature at a large technology company with a simultaneous sample of users who endogenously opted into the feature. We find that it is possible to recover ground-truth causal effects, but only with careful modeling choices. This extends many of the findings in the econometric literature stretching back to LaLonde (1986) and puts forth best practices that allow for more credible treatment effect estimation in modern, high-dimensional datasets.