🤖 AI Summary
In professional-domain few-shot scenarios, self-supervised continual pretraining faces critical bottlenecks: severe data scarcity, the infeasibility of hyperparameter tuning, and the fact that pretrained models are typically released as backbone weights only, without the training state needed to resume pretraining.
Method: We propose DIET-CP, a lightweight, cross-modal, cross-architecture continual pretraining method that needs only a small amount of unlabeled data (e.g., 1,000 samples), introduces no additional hyperparameters, and requires nothing beyond the original backbone weights. Its core is a very simple unsupervised training objective designed to be both stable and efficient.
Contribution/Results: Experiments demonstrate that DIET-CP significantly improves the adaptability of state-of-the-art vision foundation models (e.g., DINOv3) to new target distributions with extremely limited data, overcoming the feasibility and efficiency barriers of continual pretraining in data-scarce regimes.
📝 Abstract
Continued pretraining offers a promising solution for adapting foundation models to a new target domain. However, in specialized domains the available datasets are often very small, limiting the applicability of self-supervised learning (SSL) methods developed for large-scale pretraining and making hyperparameter search infeasible. In addition, pretrained models are usually released as backbone weights only, lacking important information needed to continue pretraining. We propose to bridge this gap with DIET-CP, a simple continued pretraining strategy that can steer any strong foundation model towards a new data distribution of interest. DIET-CP relies on a very simple objective, requires no labels, and introduces no more hyperparameters than supervised finetuning. It is stable across data modalities and backbone choices, and provides a significant performance boost for state-of-the-art models such as DINOv3 using only 1000 images.
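The abstract does not spell out the objective, but the name suggests a DIET-style ("Datum IndEx as Target") loss: each unlabeled sample's own dataset index serves as its classification target, and a linear head on top of the backbone is trained with plain cross-entropy, so no labels or extra hyperparameters are needed. Below is a minimal toy sketch of that idea under this assumption; the feature matrix, dimensions, learning rate, and the frozen-backbone simplification are all illustrative, not the paper's actual setup (in real continual pretraining the backbone would be updated as well).

```python
import numpy as np

# Toy sketch of a DIET-style objective (assumption: each sample's
# dataset index is its classification target, trained with
# cross-entropy over an N-way linear head).
rng = np.random.default_rng(0)
N, d = 1000, 64                      # e.g. 1000 images, 64-dim features
feats = rng.normal(size=(N, d))      # stand-in for backbone embeddings
W = np.zeros((d, N))                 # linear head: one "class" per index

def loss_and_grad(W):
    logits = feats @ W               # (N, N): sample i should predict class i
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)             # softmax probabilities
    idx = np.arange(N)
    loss = -np.mean(np.log(p[idx, idx]))          # cross-entropy on own index
    p[idx, idx] -= 1.0                            # dL/dlogits = (p - onehot)/N
    grad = feats.T @ p / N
    return loss, grad

losses = []
for _ in range(50):                  # a few gradient steps on the head
    loss, g = loss_and_grad(W)
    W -= 1.0 * g
    losses.append(loss)
# Starts at ln(N) (uniform predictions) and decreases as the head
# learns to discriminate individual samples.
```

With the head at zero, the first loss equals ln(1000) ≈ 6.91 and then drops as training proceeds; the appeal of such an objective is exactly what the abstract claims: no labels, no augmentation pipeline, and essentially the hyperparameters of supervised finetuning.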