Efficient estimation with incomplete data via generalised ANOVA decompositions

When: 13-02-2025 from 11:30 to 12:30
Where: Leslokaal 1.2 - Alan Turing, S9, Campus Sterre
Language: English
Organizer: Jan De Neve
Contact: Jan.DeNeve@ugent.be

Quetelet seminar: Efficient estimation with incomplete data via generalised ANOVA decompositions by Dr. Thomas Berrett

Speaker

Dr. Thomas Berrett is an Associate Professor (Reader) in the Department of Statistics at the University of Warwick. His research focuses on developing statistical theory and methodology, particularly in nonparametric statistics, hypothesis testing, and differential privacy. Before joining Warwick, he held postdoctoral positions at CREST, ENSAE in Paris and at the University of Cambridge.

Abstract

In this talk, I will present recent work (https://arxiv.org/abs/2409.05729) on efficient estimation with incomplete data, covering problems arising in the semi-supervised learning, data fusion and missing data literatures. Our task is to estimate simple mean functionals given access to a complete dataset that is supplemented by additional incomplete datasets. In particular, we aim to use the incomplete data to reduce the variance of the naive complete-case estimator, and to characterise the minimal asymptotic risk among all estimators. Results of this type exist for monotonic missingness structures, such as those arising in semi-supervised learning and longitudinal studies, but in this work we consider more general settings. We show that the optimal variance can be expressed as the minimal value of a quadratic optimisation problem over a function space, thus establishing a fundamental link between these estimation problems and the theory of generalised ANOVA decompositions. We introduce an estimator that is proved to attain this minimal risk and to be approximately normally distributed, and we use it to construct confidence intervals.