Eric J. Tchetgen Tchetgen - Model Selection for Machine Learning Estimation of Doubly Robust Functionals

December 2, 2019 - 1:10pm
Graduate School of Business, Gunn Building, Rm G101


Talk title: Model Selection for Machine Learning Estimation of Doubly Robust Functionals

Abstract: While model selection is a well-studied topic in parametric and nonparametric regression and density estimation, model selection of possibly high-dimensional nuisance parameters in semiparametric problems is far less developed. This paper proposes a new model selection framework for making inferences about a finite-dimensional functional defined on a semiparametric model, when the latter admits a doubly robust estimating function. The class of such doubly robust functionals is quite large, and includes estimation of pathwise differentiable functionals when data are missing at random and in causal inference problems under unconfoundedness conditions. Under double robustness, the estimated functional should incur no bias if either of two nuisance parameters is evaluated at the truth while the other spans a large collection of possibly incorrect candidate models. Our approach introduces a novel minimax pseudo-risk criterion for the functional of primary interest that embodies this double robustness property and thus may be used to select the candidate model that is nearest to fulfilling this property even when all models are wrong. We establish an oracle property for a multi-fold cross-validation scheme of the new model selection criterion, which states that our empirical criterion performs nearly as well as that of an oracle with a priori knowledge of the pseudo-risk for each candidate model. We also describe a smooth approximation to the selection criterion which allows for valid post-selection inference. Finally, we apply the approach to perform model selection for a semiparametric estimator of the average treatment effect, given an ensemble of candidate machine learning methods to account for confounding, in a study of right heart catheterization in the intensive care unit of critically ill patients. This is joint work with Yifan Cui.
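
For readers unfamiliar with double robustness, the property mentioned in the abstract can be illustrated with a minimal sketch. It is not the selection criterion from the talk. It uses the standard augmented inverse-probability-weighted (AIPW) estimator of the average treatment effect on simulated data. The data-generating process, the function names `aipw_ate` and `simulate`, and the deliberately wrong nuisance models are all assumptions made for illustration.

```python
import numpy as np

def aipw_ate(y, a, propensity, mu1, mu0):
    """AIPW (doubly robust) estimate of the average treatment effect.

    propensity: assumed P(A=1 | X); mu1, mu0: assumed E[Y | A=1, X], E[Y | A=0, X].
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    term1 = mu1 + a * (y - mu1) / propensity
    term0 = mu0 + (1 - a) * (y - mu0) / (1 - propensity)
    return float(np.mean(term1 - term0))

def simulate(n, seed=0):
    """Hypothetical data: one confounder x, binary treatment a, true ATE = 2."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    pi = 1.0 / (1.0 + np.exp(-0.5 * x))      # true propensity score
    a = rng.binomial(1, pi)
    y = 2.0 * a + x + rng.normal(size=n)     # true outcome model
    return x, a, y, pi

x, a, y, pi = simulate(200_000)
zeros = np.zeros_like(y)
half = np.full_like(y, 0.5)

# Correct propensity, deliberately wrong outcome model (all zeros).
est_wrong_outcome = aipw_ate(y, a, pi, zeros, zeros)
# Correct outcome model, deliberately wrong propensity (constant 0.5).
est_wrong_propensity = aipw_ate(y, a, half, 2.0 + x, x)
# Both nuisance models wrong: no protection against bias.
est_both_wrong = aipw_ate(y, a, half, zeros, zeros)

print(est_wrong_outcome, est_wrong_propensity, est_both_wrong)
```

In this simulation the first two estimates should land close to the true effect of 2, while the third is visibly biased. That third situation, in which every candidate model is wrong, is the one the abstract's selection criterion is meant to address.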



Eric J. Tchetgen Tchetgen is a professor of statistics at the Wharton School of the University of Pennsylvania. His research interests include semiparametric theory, nonparametric statistics, causal inference, missing data, and epidemiologic methods. He received his bachelor's degree from Yale and his Ph.D. from Harvard.

Event Sponsor: 
Institute for Research in the Social Sciences and Graduate School of Business
