🤖 AI Summary
Existing decoupled dataset distillation methods rely on static real image patches, which creates a fit-complexity gap and a pull-to-anchor effect that reduce intra-class diversity and hurt generalization. To address this, this work proposes the RETA framework. Its Dynamic Retrieval Connection (DRC) selects a real patch from a prebuilt pool by minimizing a fit-complexity score in teacher feature space and injects it through a residual connection, tightening feature fit while controlling injected complexity. RETA also adds Persistent Topology Alignment (PTA), a regularizer based on persistent homology that builds mutual k-NN feature graphs, computes persistence images, and penalizes topological discrepancies between the synthetic and real sets. The authors report that RETA outperforms the baselines on CIFAR-100, Tiny-ImageNet, and ImageNet-1K; on ImageNet-1K with 50 images per class, it reaches 64.3% top-1 accuracy with ResNet-18, 3.1 percentage points above the best baseline.
📝 Abstract
Decoupled dataset distillation (DD) compresses large corpora into a few synthetic images by matching a frozen teacher's statistics. However, current residual-matching pipelines rely on static real patches, creating a fit-complexity gap and a pull-to-anchor effect that reduce intra-class diversity and hurt generalization. To address these issues, we introduce RETA, a Retrieval and Topology Alignment framework for decoupled DD. First, Dynamic Retrieval Connection (DRC) selects a real patch from a prebuilt pool by minimizing a fit-complexity score in teacher feature space; the chosen patch is injected via a residual connection to tighten feature fit while controlling injected complexity. Second, Persistent Topology Alignment (PTA) regularizes synthesis with persistent homology: we build a mutual k-NN feature graph, compute persistence images of components and loops, and penalize topology discrepancies between the real and synthetic sets, mitigating the pull-to-anchor effect. Across CIFAR-100, Tiny-ImageNet, ImageNet-1K, and multiple ImageNet subsets, RETA consistently outperforms various baselines at comparable time and memory cost, notably reaching 64.3% top-1 accuracy on ImageNet-1K with ResNet-18 at 50 images per class, 3.1 percentage points above the best prior method.
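To make the graph step of PTA concrete, here is a minimal pure-Python sketch of a mutual k-NN graph and the 0-dimensional persistence (component merge times) it induces. It is an illustration under stated assumptions, not the authors' implementation: the function names `mutual_knn_edges` and `h0_deaths` are hypothetical, Euclidean distance on toy feature vectors is assumed, and the sketch covers only connected components. Loops (1-dimensional features), persistence images, and the alignment loss described in the abstract are omitted.

```python
import math
from itertools import combinations


def mutual_knn_edges(feats, k):
    """Return sorted (distance, i, j) edges of the mutual k-NN graph.

    An edge i-j is kept only if j is among i's k nearest neighbours
    AND i is among j's k nearest neighbours (Euclidean distance assumed).
    """
    n = len(feats)
    dist = [[math.dist(feats[i], feats[j]) for j in range(n)] for i in range(n)]
    knn = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        knn.append(set(order[:k]))
    return sorted(
        (dist[i][j], i, j)
        for i, j in combinations(range(n), 2)
        if j in knn[i] and i in knn[j]
    )


def h0_deaths(n, edges):
    """Death times of connected components as edges enter by increasing weight.

    Uses union-find: each edge that merges two components kills one of them.
    Components that never merge (n - len(deaths) of them) persist forever.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    deaths = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(weight)
    return deaths


# Toy "real" features: two well-separated clusters of three points each.
real = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
# Toy "synthetic" features: six points on a single chain.
synthetic = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]

real_deaths = h0_deaths(len(real), mutual_knn_edges(real, k=2))
syn_deaths = h0_deaths(len(synthetic), mutual_knn_edges(synthetic, k=2))

# Number of components that survive the whole filtration in each set.
real_components = len(real) - len(real_deaths)
syn_components = len(synthetic) - len(syn_deaths)
print(real_components, syn_components)
```

In this toy case the two sets differ in the number of surviving components, which is the kind of structural mismatch a topology-based regularizer is meant to penalize. A fuller pipeline would also track loops and vectorize the resulting diagrams as persistence images before comparing them.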