Separating Geometry from Probability in the Analysis of Generalization

📅 2026-04-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of traditional generalization analyses: they assume that in-sample and out-of-sample data are independent and identically distributed (i.i.d.) draws from an infinite population, an assumption that cannot be verified even in principle. The paper proposes a deterministic framework that makes no probabilistic assumptions up front. It treats generalization as a question of how sensitive the solutions of an optimization problem are to perturbations in the problem data, which separates the geometric part of the analysis from the probabilistic part. The resulting bounds take the form of variational principles that relate in-sample and out-of-sample evaluations through an error term quantifying how close the out-of-sample data are to the in-sample data. Statistical assumptions are then applied ex post to characterize when this error term is small, either on average or with high probability, which recovers conventional in-expectation or high-probability guarantees.

📝 Abstract

The goal of machine learning is to find models that minimize prediction error on data that has not yet been seen. Its operational paradigm assumes access to a dataset $S$ and articulates a scheme for evaluating how well a given model performs on an arbitrary sample. The sample can be $S$ (in which case we speak of "in-sample" performance) or some entirely new $S'$ (in which case we speak of "out-of-sample" performance). Traditional analysis of generalization assumes that both in- and out-of-sample data are i.i.d. draws from an infinite population. However, these probabilistic assumptions cannot be verified even in principle. This paper presents an alternative view of generalization through the lens of sensitivity analysis of solutions of optimization problems to perturbations in the problem data. Under this framework, generalization bounds are obtained by purely deterministic means and take the form of variational principles that relate in-sample and out-of-sample evaluations through an error term that quantifies how close out-of-sample data are to in-sample data. Statistical assumptions can then be used ex post to characterize the situations when this error term is small (either on average or with high probability).
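To make the idea of a deterministic bound concrete, here is a minimal Python sketch. It is not the paper's construction. It assumes a textbook special case chosen for illustration: a one-dimensional absolute-deviation loss, which is 1-Lipschitz in each data point, and the Wasserstein-1 distance between two equal-size samples as the measure of how close the out-of-sample data are to the in-sample data. Under those assumptions the gap between in-sample and out-of-sample loss is at most the Lipschitz constant times that distance, for any model parameter and any two samples, with no distributional assumption. All function names are hypothetical.

```python
import numpy as np


def w1_empirical_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D samples.

    In one dimension the optimal matching pairs the sorted values.
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch assumes equal-size samples")
    return float(np.mean(np.abs(a - b)))


def abs_loss(w, sample):
    """Average absolute deviation |z - w|; 1-Lipschitz in each point z."""
    return float(np.mean(np.abs(np.asarray(sample, dtype=float) - w)))


def gap_and_bound(w, s_in, s_out, lipschitz=1.0):
    """Return (|out-of-sample loss - in-sample loss|, Lipschitz * W1)."""
    gap = abs(abs_loss(w, s_out) - abs_loss(w, s_in))
    bound = lipschitz * w1_empirical_1d(s_in, s_out)
    return gap, bound


# Illustrative data (assumed, not from the paper).
s_in = np.array([0.0, 1.0, 2.0, 4.0])
s_out = np.array([0.5, 1.5, 2.5, 3.0])

# The in-sample median minimizes the in-sample absolute loss.
w_hat = float(np.median(s_in))

gap, bound = gap_and_bound(w_hat, s_in, s_out)
print(gap, bound)  # 0.375 0.625
```

The inequality `gap <= bound` holds here by a purely deterministic argument (match the sorted points and apply the Lipschitz property term by term). A statistical assumption would enter only afterwards, to argue that the distance between the two samples is small on average or with high probability.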

Problem

Research questions and friction points this paper is trying to address.

generalization

probabilistic assumptions

i.i.d.

sensitivity analysis

optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

generalization

sensitivity analysis

optimization perturbations