🤖 AI Summary
This work investigates the fundamental trade-off between stability and accuracy in statistical estimation by formulating stability as a constraint within the framework of statistical decision theory. It systematically analyzes how worst-case and average-case stability requirements affect estimation accuracy, employing minimax analysis and constrained optimization to construct optimal stable estimators for canonical problems such as mean estimation and regression. The key contribution is a unified framework establishing, for the first time, lower bounds on estimation accuracy under both notions of stability; these bounds reveal that average-case stability imposes a qualitatively weaker constraint than worst-case stability, with the size of the gap depending on the specific estimation task. Furthermore, the paper precisely characterizes the optimal stability–accuracy trade-offs in four representative estimation settings, quantifying the statistical cost incurred by different stability mechanisms.
📝 Abstract
Algorithmic stability is a central concept in statistics and learning theory that measures how sensitive an algorithm's output is to small changes in the training data. Stability plays a crucial role in understanding generalization, robustness, and replicability, and a variety of stability notions have been proposed in different learning settings. However, while stability entails desirable properties, it is typically not sufficient on its own for statistical learning -- and indeed, it may be at odds with accuracy, since an algorithm that always outputs a constant function is perfectly stable but statistically meaningless. Thus, it is essential to understand the potential statistical cost of stability. In this work, we address this question by adopting a statistical decision-theoretic perspective, treating stability as a constraint in estimation. Focusing on two representative notions -- worst-case stability and average-case stability -- we first establish general lower bounds on the achievable estimation accuracy under each type of stability constraint. We then develop optimal stable estimators for four canonical estimation problems, including several mean estimation and regression settings. Together, these results characterize the optimal trade-offs between stability and accuracy across these tasks. Our findings formalize the intuition that average-case stability imposes a qualitatively weaker restriction than worst-case stability, and they further reveal that the gap between these two can vary substantially across different estimation problems.
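The abstract's contrast between a useful but sensitive estimator and a perfectly stable but meaningless one can be made concrete with a toy sketch. The snippet below is illustrative only and is not the paper's formalism: it assumes data bounded in [0, 1], defines a "neighboring" dataset by replacing one sample, and approximates worst-case sensitivity by trying the boundary values as replacements. The helper `replace_one_sensitivity` and the candidate set are assumptions for this sketch.

```python
import numpy as np

def replace_one_sensitivity(estimator, x, candidates):
    """Worst-case change in the estimate when one sample is replaced.

    A finite proxy for the supremum over neighboring datasets: try each
    candidate replacement value at every index and track the largest
    deviation from the original estimate.
    """
    base = estimator(x)
    worst = 0.0
    for i in range(len(x)):
        for c in candidates:
            y = x.copy()
            y[i] = c
            worst = max(worst, abs(estimator(y) - base))
    return worst

rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0.0, 1.0, size=n)  # data bounded in [0, 1]

# Sample mean: replacing one bounded point shifts the output by at most 1/n,
# so its worst-case sensitivity shrinks as n grows -- stable, and accurate.
sens_mean = replace_one_sensitivity(np.mean, x, candidates=[0.0, 1.0])

# Constant estimator: sensitivity exactly 0 (perfectly stable), but its
# error does not improve with n -- stability alone buys no accuracy.
sens_const = replace_one_sensitivity(lambda z: 0.5, x, candidates=[0.0, 1.0])

print(f"sample mean sensitivity: {sens_mean:.4f} (bound 1/n = {1/n:.4f})")
print(f"constant estimator sensitivity: {sens_const:.4f}")
```

Running this shows the sample mean's replace-one sensitivity sitting just under 1/n while the constant estimator's is exactly zero, mirroring the abstract's point that stability constraints must be weighed against the accuracy they forgo.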