Neural Coherence: Finding Higher Performance on Out-of-Distribution Tasks from Few Samples

📅 2025-12-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses transfer learning under few-shot, unlabeled, and out-of-distribution (OOD) settings. We propose a pre-trained model checkpoint selection method grounded in **Neural Coherence**—a measure of statistical alignment between source- and target-domain activations across network layers. Leveraging only a small number of unlabeled target samples, our method automatically identifies the checkpoint with the best generalization performance. Unlike existing approaches that rely on fine-tuning or pseudo-labeling, ours requires no target-domain annotations, incurs zero additional training cost, and is inherently robust to distribution shift. Evaluated on ImageNet1K-pretrained models, it significantly outperforms state-of-the-art checkpoint selection baselines on OOD benchmarks including Food-101, PlantNet-300K, and iNaturalist. We further validate its broader utility in meta-learning initialization and training data selection, demonstrating consistent generalization gains across diverse downstream tasks.

📝 Abstract
To create state-of-the-art models for many downstream tasks, it has become common practice to fine-tune a pre-trained large vision model. However, it remains an open question how best to determine which of the many possible model checkpoints resulting from a large training run to use as the starting point. This becomes especially important when data for the target task of interest is scarce, unlabeled, and out-of-distribution. In such scenarios, common methods relying on in-distribution validation data become unreliable or inapplicable. This work proposes a novel approach for model selection that operates reliably on just a few unlabeled examples from the target task. Our approach is based on a novel concept, Neural Coherence, which characterizes a model's activation statistics for source and target domains, allowing one to define model selection methods with high data efficiency. We provide experiments where models are pre-trained on ImageNet1K and examine target domains consisting of Food-101, PlantNet-300K, and iNaturalist. We also evaluate the approach in many meta-learning settings. It significantly improves generalization across these different target domains compared to established baselines. We further demonstrate the versatility of Neural Coherence as a powerful principle by showing its effectiveness in training data selection.
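The paper does not publish its exact scoring formula here, but the idea of the abstract — score each checkpoint by how well its per-layer activation statistics on a small unlabeled target batch align with those on source data, then pick the best-aligned checkpoint — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the statistic (per-layer mean/std), the L1 alignment distance, and the `extract` callables that return a list of per-layer activation arrays are all assumptions for the sake of the example.

```python
import numpy as np

def layer_stats(acts):
    # Per-layer summary statistics of activations, shape (n_samples, n_features).
    return acts.mean(axis=0), acts.std(axis=0)

def coherence_score(src_layers, tgt_layers):
    # Higher = source and target activation statistics are better aligned.
    # Here: negative mean absolute difference of means and stds, averaged
    # over layers (an illustrative stand-in for the paper's measure).
    score = 0.0
    for s, t in zip(src_layers, tgt_layers):
        s_mu, s_sd = layer_stats(s)
        t_mu, t_sd = layer_stats(t)
        score -= np.mean(np.abs(s_mu - t_mu)) + np.mean(np.abs(s_sd - t_sd))
    return score / len(src_layers)

def select_checkpoint(checkpoints, src_batch, tgt_batch):
    # checkpoints: {name: extract_fn}, where extract_fn(batch) returns a
    # list of per-layer activation arrays for that checkpoint (hypothetical
    # interface). Only a small *unlabeled* target batch is needed.
    best_name, best_score = None, -np.inf
    for name, extract in checkpoints.items():
        score = coherence_score(extract(src_batch), extract(tgt_batch))
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

In practice the `extract` functions would run each saved checkpoint's forward pass with activation hooks; the key property from the abstract is preserved here: no target labels, no fine-tuning, just forward passes on a handful of target samples.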
Problem

Research questions and friction points this paper is trying to address.

Selecting optimal model checkpoints without labeled target data
Addressing out-of-distribution tasks with scarce unlabeled samples
Improving generalization across domains using activation statistics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Model selection using Neural Coherence concept
Characterizing activation statistics across source and target domains
Operating reliably with few unlabeled out-of-distribution examples