🤖 AI Summary
Empirical studies show that implicit models, which pair small architectures with many inference-time iterations, can match the performance of much larger explicit networks; however, no theoretical explanation exists for how their expressive power scales with the compute budget spent at inference.
Method: This work provides the first nonparametric analysis establishing that implicit models, via parameter-shared fixed-point iterations, asymptotically approximate increasingly complex function classes, with expressivity increasing monotonically in the number of inference iterations. The analysis integrates tools from implicit differentiation, fixed-point theory, and nonparametric statistics.
Results: We validate the theory across diverse tasks, including image reconstruction, scientific computing, and operations research, demonstrating simultaneous improvements in solution quality and effective model capacity. The framework enables constant-memory training and dynamically controllable inference, offering both a rigorous theoretical foundation and a practical paradigm for efficient implicit modeling.
📝 Abstract
Implicit models, an emerging model class, compute outputs by iterating a single parameter block to a fixed point. This architecture realizes an infinite-depth, weight-tied network that trains with constant memory, substantially reducing memory requirements relative to explicit models at the same performance level. While these compact models are empirically known to match or even exceed larger explicit networks when allocated more test-time compute, the underlying mechanism remains poorly understood.
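To make the "iterate a single parameter block to a fixed point" idea concrete, here is a minimal NumPy sketch of one possible weight-tied block, `z → tanh(W z + b + x)`, iterated until the state stops changing. This is an illustrative toy, not the paper's architecture; all names (`implicit_forward`, `W`, `b`) are my own.

```python
import numpy as np

def implicit_forward(W, b, x, n_iters=100, tol=1e-8):
    """Iterate the single weight-tied block z -> tanh(W @ z + b + x)
    until it (approximately) reaches a fixed point z*."""
    z = np.zeros_like(x)
    for _ in range(n_iters):
        z_new = np.tanh(W @ z + b + x)  # the same parameters at every "layer"
        if np.linalg.norm(z_new - z) < tol:
            return z_new
        z = z_new
    return z

rng = np.random.default_rng(0)
d = 8
A = rng.standard_normal((d, d))
W = A / (2.0 * np.linalg.norm(A, 2))  # spectral norm 1/2, so the map contracts
b = rng.standard_normal(d)
x = rng.standard_normal(d)

z_star = implicit_forward(W, b, x)
# z* approximately solves z = tanh(W z + b + x): the output of an
# "infinite-depth" network, while only the current state is ever stored
residual = np.linalg.norm(z_star - np.tanh(W @ z_star + b + x))
```

Because tanh is 1-Lipschitz and the spectral norm of `W` is below one, the iteration map is a contraction, so Banach's fixed-point theorem guarantees convergence; memory stays constant because only the current state `z` is kept, however many iterations are run.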
We study this gap through a nonparametric analysis of expressive power. We give a rigorous mathematical characterization showing that a simple, regular implicit operator can, through iteration, progressively express more complex mappings. For a broad class of implicit models, we prove that this process lets the model's expressive power scale with test-time compute, ultimately matching a much richer function class. We validate the theory in three domains: image reconstruction, scientific computing, and operations research, observing that as test-time iterations increase, the complexity of the learned mapping rises while the solution quality simultaneously improves and stabilizes.