The Interplay Between Interpolation and Aggregation in Regression: Optimal Sample Complexity

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work investigates the interplay between interpolation and aggregation in regression tasks and its implications for learnability. By introducing the γ-graph dimension, the study characterizes the learnability boundary for a broad class of natural aggregation procedures and proposes a minimalist aggregation method that takes the median of three interpolating hypotheses. Theoretical analysis demonstrates that this median aggregation achieves optimal sample complexity among all finite interpolating aggregations and is strictly more powerful than proper learning. Moreover, the work reveals that certain hypothesis classes are learnable only via infinite or non-interpolating aggregations, thereby establishing fundamental limitations and optimality conditions for finite interpolating aggregation schemes.

📝 Abstract

This work theoretically investigates the interplay between interpolation and aggregation in regression. We establish that the $\gamma$-graph dimension characterizes learnability for a broad class of natural aggregation procedures. Furthermore, we prove that an extremely simple aggregation procedure, which combines three interpolating hypotheses via the median, is optimal among all these aggregation procedures and is strictly more powerful than proper learning. Finally, we show that some hypothesis classes are learnable only by aggregating infinitely many hypotheses or by using non-interpolating aggregation rules (which may predict outside the range of their inputs); for these classes, any finite interpolating aggregation fails to achieve even trivial performance.
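
To make the median-of-three aggregation mentioned in the abstract concrete, here is a minimal Python sketch. It only illustrates the aggregation rule itself: the three hypotheses `h1`, `h2`, `h3`, the toy sample, and the helper names `median_of_three` and `interpolates` are hypothetical choices for illustration, and how the paper actually selects its interpolating hypotheses is not shown here.

```python
from statistics import median
from typing import Callable, Sequence, Tuple

Hypothesis = Callable[[float], float]


def median_of_three(h1: Hypothesis, h2: Hypothesis, h3: Hypothesis) -> Hypothesis:
    """Return the predictor that outputs the pointwise median of three hypotheses."""
    def aggregated(x: float) -> float:
        return median([h1(x), h2(x), h3(x)])
    return aggregated


def interpolates(h: Hypothesis, sample: Sequence[Tuple[float, float]],
                 tol: float = 1e-9) -> bool:
    """Check that h fits every labeled point of the sample (up to tol)."""
    return all(abs(h(x) - y) <= tol for x, y in sample)


# Toy training sample (an assumption made for this example only).
sample = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]

# Three hand-picked hypotheses that all interpolate the sample but
# disagree away from it; the cubic term vanishes at x = 0, 1, 2.
def h1(x: float) -> float:
    return x

def h2(x: float) -> float:
    return x + x * (x - 1) * (x - 2)

def h3(x: float) -> float:
    return x - 2 * x * (x - 1) * (x - 2)

agg = median_of_three(h1, h2, h3)

print(interpolates(agg, sample))  # True
print(agg(3.0))                   # 3.0
```

Because the median of three values always lies between the smallest and the largest of them, this rule never predicts outside the range of its inputs, and it fits every training point that all three hypotheses fit. In the abstract's terminology, that makes it a finite interpolating aggregation.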

Problem

Research questions and friction points this paper is trying to address.

interpolation

aggregation

regression

learnability

sample complexity

Innovation

Methods, ideas, or system contributions that make the work stand out.

interpolation

aggregation

γ-graph dimension