🤖 AI Summary
This work addresses a key limitation of conventional information extractors (IEs) in one-shot subset selection for fine-grained image datasets: their reliance on target-domain pretraining and consequently poor generalization. We propose a foundation model (FM)-based approach that eliminates this dependency. For the first time, we systematically demonstrate that FMs significantly outperform IEs in fine-grained settings. To harness complementary FM capabilities, we introduce RAM-APL, a multi-FM collaborative ranking framework: it generates pseudo-class labels via ensemble inference and fuses the individual models' rankings using mean-accuracy-weighted aggregation. Evaluated on Oxford-IIIT Pet, Food-101, and CUB-200-2011, our method achieves state-of-the-art performance in subset selection, substantially improving both the training efficiency and the final accuracy of downstream models, thereby breaking the long-standing reliance on task-specific pretraining.
📝 Abstract
One-shot subset selection serves as an effective tool to reduce deep learning training costs by identifying an informative data subset based on the information extracted by an information extractor (IE). Traditional IEs, typically pre-trained on the target dataset, are inherently dataset-dependent. Foundation models (FMs) offer a promising alternative, potentially mitigating this limitation. This work investigates two key questions: (1) Can FM-based subset selection outperform traditional IE-based methods across diverse datasets? (2) Do all FMs perform equally well as IEs for subset selection? Extensive experiments uncover surprising insights: FMs consistently outperform traditional IEs on fine-grained datasets, whereas their advantage diminishes on coarse-grained datasets with noisy labels. Motivated by these findings, we propose RAM-APL (RAnking Mean-Accuracy of Pseudo-class Labels), a method tailored for fine-grained image datasets. RAM-APL leverages multiple FMs to enhance subset selection by exploiting their complementary strengths. Our approach achieves state-of-the-art performance on fine-grained datasets, including Oxford-IIIT Pet, Food-101, and Caltech-UCSD Birds-200-2011.
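The mean-accuracy-weighted rank fusion at the heart of RAM-APL can be sketched as follows. This is a minimal illustration only, assuming hypothetical per-FM informativeness scores and pseudo-label accuracies; the function name `fuse_rankings` and the exact weighting scheme are our assumptions, not the paper's definitive implementation.

```python
import numpy as np

def fuse_rankings(fm_scores, fm_mean_acc):
    """Hypothetical sketch of multi-FM rank aggregation.

    fm_scores:   (n_models, n_samples) informativeness scores, one row per FM.
    fm_mean_acc: (n_models,) each FM's mean accuracy on pseudo-class labels,
                 used as its fusion weight (an assumption for illustration).
    Returns sample indices sorted by the fused score, best first.
    """
    fm_scores = np.asarray(fm_scores, dtype=float)
    weights = np.asarray(fm_mean_acc, dtype=float)
    weights = weights / weights.sum()                 # normalize to a convex combination
    # Convert each FM's scores to ranks (higher score -> larger rank value).
    ranks = fm_scores.argsort(axis=1).argsort(axis=1)
    fused = (weights[:, None] * ranks).sum(axis=0)    # accuracy-weighted aggregation
    return np.argsort(-fused)                         # highest fused rank first

# Two hypothetical FMs scoring three samples; FM 1 is trusted more (0.7 vs 0.5).
scores = [[0.9, 0.1, 0.5],
          [0.8, 0.3, 0.4]]
order = fuse_rankings(scores, fm_mean_acc=[0.7, 0.5])
print(order.tolist())  # → [0, 2, 1]: both FMs rank sample 0 highest
```

Weighting each FM's ranking by its pseudo-label accuracy lets stronger models dominate the fused ordering while weaker ones still contribute complementary signal.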