🤖 AI Summary
Contemporary image generation models achieve high visual fidelity but rely on complex parametric architectures and extensive training, resulting in opaque generative mechanisms. Method: This paper proposes a zero-shot, fully non-parametric generative framework that exploits inherent properties of natural images—spatial non-stationarity, low-level regularity, and high-level semantic structure—defining pixel-wise conditional distributions solely via local contextual windows, enabling interpretable, pixel-level sampling without optimization. Contribution/Results: The approach uncovers a “part-to-whole” generalization principle, offering a minimal theoretical account of natural image structure. It generates visually realistic samples on MNIST and CIFAR-10, with fully traceable and reproducible inference. Crucially, it achieves, for the first time, simultaneous high-fidelity synthesis and mechanistic interpretability—bridging generative performance with analytical transparency—and substantially advances the explainability and theoretical tractability of generative models.
📝 Abstract
Scaling and architectural advances have produced strikingly photorealistic image generative models, yet their mechanisms remain opaque. Rather than pushing scaling further, our goal is to strip away complicated engineering tricks and propose a simple, non-parametric generative model. Our design is grounded in three principles of natural images: (i) spatial non-stationarity, (ii) low-level regularities, and (iii) high-level semantics. It defines each pixel's distribution from its local context window. Despite its minimal architecture and the absence of any training, the model produces high-fidelity samples on MNIST and visually compelling CIFAR-10 images. This combination of simplicity and strong empirical performance points toward a minimal theory of natural-image structure. The model's white-box nature also gives us a mechanistic understanding of how it generalizes and generates diverse images: we study it by tracing each generated pixel back to its source images. These analyses reveal a simple, compositional procedure for "part-whole generalization", suggesting a hypothesis for how large neural network generative models learn to generalize.
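The pixel-wise sampling idea described above can be sketched in a few lines of NumPy. The following is an illustrative reconstruction under our own assumptions, not the paper's exact procedure: pixels are generated in raster order, each pixel's causal context window (already-generated neighbors within a radius `k`) is matched against the same window location in every dataset image, and the pixel value is drawn from a context-similarity-weighted distribution. The function name `sample_image` and the parameters `k` and `temp` are hypothetical choices for this sketch.

```python
import numpy as np

def sample_image(dataset, k=2, temp=0.1, rng=None):
    """Non-parametric pixel-by-pixel sampling (illustrative sketch).

    dataset: float array of shape (N, H, W), values in [0, 1].
    k: radius of the local context window.
    temp: softmax temperature for context-similarity weighting (assumed).
    """
    rng = np.random.default_rng() if rng is None else rng
    n, h, w = dataset.shape
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            # Bounds of the local window around (i, j).
            i0, i1 = max(0, i - k), min(h, i + k + 1)
            j0, j1 = max(0, j - k), min(w, j + k + 1)
            # Causal mask: window positions generated before (i, j)
            # in raster order.
            mask = np.zeros((i1 - i0, j1 - j0), dtype=bool)
            for a in range(i0, i1):
                for b in range(j0, j1):
                    if (a, b) < (i, j):
                        mask[a - i0, b - j0] = True
            if not mask.any():
                # First pixel: no context yet, copy from a random image.
                out[i, j] = dataset[rng.integers(n), i, j]
                continue
            ctx = out[i0:i1, j0:j1][mask]                 # generated context
            cand = dataset[:, i0:i1, j0:j1][:, mask]      # (N, |context|)
            dist = ((cand - ctx) ** 2).mean(axis=1)       # context mismatch
            weights = np.exp(-dist / temp)
            weights /= weights.sum()
            src = rng.choice(n, p=weights)                # traceable source
            out[i, j] = dataset[src, i, j]
    return out
```

Because each pixel is copied from an identifiable `src` image, every generated pixel can be traced back to its source, which is the white-box property the abstract highlights.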