Analytical Correction for Subsampling Bias in Drifting Models

📅 2026-04-29
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses a systematic $O(1/n)$ bias that arises when drifting models are trained with small minibatches. The bias comes from the self-normalization of the softmax weights and degrades the accuracy of the estimated centroids. The authors propose Analytical Bias Correction (ABC), which approximates the leading bias term from in-batch statistics and corrects the empirical centroid with a closed-form plug-in adjustment. According to the summary, ABC is the first approach to quantify and correct this bias analytically; it reduces the bias to $O(1/n^2)$ without a first-order increase in total variance and keeps the corrected centroid inside the convex hull of the batch. On CIFAR-10, ABC lowers FID and speeds up training, with the largest gains at small batch sizes; toy experiments confirm the predicted $O(1/n)$ and $O(1/n^2)$ scaling.
📝 Abstract
Drifting models are capable one-step generative models trained to follow a drifting field. The field combines attractive and repulsive softmax-weighted centroids over the data and current-generator distributions. In practice, only a minibatch of $n$ samples from each distribution is available, and each centroid is approximated by an empirical estimate. In this paper, we begin by showing that the minibatch centroid is in general a biased estimator of the target centroid, with a pointwise $O(1/n)$ bias arising from softmax self-normalization. Correcting this bias requires the expectation over the full distribution, which is intractable. We instead approximate the leading bias term from in-batch statistics and propose Analytical Bias Correction (ABC), a closed-form plug-in adjustment. We prove that ABC reduces the bias from $O(1/n)$ to $O(1/n^2)$, introduces no first-order increase in total variance, and preserves convex-hull containment of the corrected centroid. In practice, ABC requires only two additional lines of code and has negligible wall-time overhead under compiled execution. Toy experiments confirm the theoretical $O(1/n)$ and $O(1/n^2)$ scaling. On CIFAR-10, ABC reduces FID and trains faster, with the largest gains at small $n$, where the bias is most significant.
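The self-normalization bias described in the abstract can be reproduced on a toy problem. The sketch below is illustrative only and is not taken from the paper: it assumes 1-D data $x \sim \mathcal{N}(0, 1)$ with softmax logits $a x$, for which the target centroid $\mathbb{E}[x e^{ax}] / \mathbb{E}[e^{ax}]$ equals $a$ in closed form. The correction is the generic delta-method plug-in for a self-normalized (ratio) estimator, which may differ from the exact ABC formula. The function names `softmax_centroid`, `plug_in_corrected`, and `simulate` are hypothetical.

```python
import numpy as np


def softmax_centroid(x, logits):
    """Softmax-weighted centroid of each batch.

    x, logits: arrays of shape (trials, n), one minibatch per row.
    Returns the centroids (trials,) and the softmax weights (trials, n).
    """
    # Subtract the row maximum for numerical stability before exponentiating.
    p = np.exp(logits - logits.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return (p * x).sum(axis=-1), p


def plug_in_corrected(x, logits):
    """Centroid with a delta-method plug-in bias correction.

    Assumption: this is the textbook leading-order correction for a
    ratio estimator, written in terms of the softmax weights p_i. It is
    a stand-in for illustration, not necessarily the paper's ABC formula.
    """
    centroid, p = softmax_centroid(x, logits)
    # In-batch estimate of the leading bias term: sum_i p_i^2 (c - x_i).
    bias_hat = (p**2 * (centroid[..., None] - x)).sum(axis=-1)
    return centroid - bias_hat


def simulate(n=8, a=0.5, trials=200_000, seed=0):
    """Monte Carlo bias of the naive and corrected centroids.

    Toy setup (an assumption): x ~ N(0, 1), logits = a * x, so the
    population centroid E[x e^{ax}] / E[e^{ax}] is exactly a.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((trials, n))
    logits = a * x
    naive, _ = softmax_centroid(x, logits)
    corrected = plug_in_corrected(x, logits)
    return naive.mean() - a, corrected.mean() - a


if __name__ == "__main__":
    for n in (4, 8, 16, 32):
        bias_naive, bias_corr = simulate(n=n)
        print(f"n={n:3d}  naive bias={bias_naive:+.4f}  corrected bias={bias_corr:+.4f}")
```

In this sketch the corrected centroid can be rewritten as $\sum_i p_i (1 + p_i - \sum_j p_j^2)\, x_i$, whose weights are non-negative and sum to one, so it stays inside the convex hull of the batch. Running the script over several values of `n` gives a rough check of how the two biases shrink as the batch grows.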
Problem

Research questions and friction points this paper is trying to address.

subsampling bias
drifting models
centroid estimation
softmax normalization
bias correction
Innovation

Methods, ideas, or system contributions that make the work stand out.

subsampling bias
drifting models
analytical bias correction
softmax normalization
centroid estimation