Non-Uniform Class-Wise Coreset Selection: Characterizing Category Difficulty for Data-Efficient Transfer Learning

📅 2025-04-17
🤖 AI Summary
Problem: Existing coreset selection methods rely predominantly on instance-level difficulty estimation, neglecting inter-class variation and thereby under-representing minority classes. Method: This paper proposes Non-Uniform Class-Wise Coreset Selection (NUCS), a framework that quantifies class-level difficulty and combines it with instance-level importance to allocate selection budgets adaptively across classes and to select samples within optimal difficulty ranges. A theoretical analysis supports the adaptive budget allocation and sample selection strategy. Results: Experiments across 14 datasets and multiple architectures show consistent improvements over state-of-the-art methods; on CIFAR100 and Food101, the method matches full-data training accuracy using only 30% of the samples while reducing computation time by 60%. It mitigates data redundancy and class imbalance in large-scale transfer learning scenarios.

📝 Abstract
As transfer learning models and datasets grow larger, efficient adaptation and storage optimization have become critical needs. Coreset selection addresses these challenges by identifying and retaining the most informative samples, constructing a compact subset for target domain training. However, current methods primarily rely on instance-level difficulty assessments, overlooking crucial category-level characteristics and consequently under-representing minority classes. To overcome this limitation, we propose Non-Uniform Class-Wise Coreset Selection (NUCS), a novel framework that integrates both class-level and instance-level criteria. NUCS automatically allocates data selection budgets for each class based on intrinsic category difficulty and adaptively selects samples within optimal difficulty ranges. By explicitly incorporating category-specific insights, our approach achieves a more balanced and representative coreset, addressing key shortcomings of prior methods. Comprehensive theoretical analysis validates the rationale behind adaptive budget allocation and sample selection, while extensive experiments across 14 diverse datasets and model architectures demonstrate NUCS's consistent improvements over state-of-the-art methods, achieving superior accuracy and computational efficiency. Notably, on CIFAR100 and Food101, NUCS matches full-data training accuracy while retaining just 30% of samples and reducing computation time by 60%. Our work highlights the importance of characterizing category difficulty in coreset selection, offering a robust and data-efficient solution for transfer learning.
Problem

Research questions and friction points this paper is trying to address.

Improves coreset selection by considering class-level difficulty
Addresses under-representation of minority classes in transfer learning
Enhances accuracy and efficiency with adaptive sample selection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Non-Uniform Class-Wise Coreset Selection (NUCS)
Integrates class-level and instance-level criteria
Adaptive budget allocation based on category difficulty
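
The two-level idea above (a per-class budget driven by category difficulty, then selection inside a difficulty range within each class) can be illustrated with a minimal sketch. This is not the paper's algorithm: the per-sample `difficulty` scores, the proportional allocation rule (class size times mean difficulty), the fixed `center` parameter that positions the difficulty window, and all function names are assumptions made for illustration only.

```python
import numpy as np


def allocate_class_budgets(difficulty, labels, keep_fraction):
    """Split a total budget across classes (hypothetical rule).

    Each class is weighted by its size times its mean sample difficulty,
    so harder classes receive a larger share of the budget.
    """
    classes = np.unique(labels)
    total_budget = int(round(keep_fraction * len(labels)))
    sizes = np.array([(labels == c).sum() for c in classes])
    class_difficulty = np.array([difficulty[labels == c].mean() for c in classes])
    weights = sizes * class_difficulty + 1e-12  # avoid division by zero
    raw = total_budget * weights / weights.sum()
    # Keep at least one sample per class and never more than the class holds,
    # so the final total only approximates total_budget.
    budgets = np.clip(np.floor(raw).astype(int), 1, sizes)
    return dict(zip(classes.tolist(), budgets.tolist()))


def select_window(difficulty, idx, budget, center=0.5):
    """Pick `budget` samples from a contiguous difficulty window.

    `center` in [0, 1] slides the window from easiest (0) to hardest (1).
    """
    order = idx[np.argsort(difficulty[idx])]
    start = int(round(center * (len(order) - budget)))
    return order[start:start + budget]


def class_wise_select(difficulty, labels, keep_fraction=0.3, center=0.5):
    """Return selected sample indices and the per-class budgets."""
    difficulty = np.asarray(difficulty, dtype=float)
    labels = np.asarray(labels)
    budgets = allocate_class_budgets(difficulty, labels, keep_fraction)
    selected = [
        select_window(difficulty, np.flatnonzero(labels == c), b, center)
        for c, b in budgets.items()
    ]
    return np.sort(np.concatenate(selected)), budgets
```

In this sketch a uniform strategy would correspond to giving every class the same `keep_fraction`; the non-uniform variant shifts budget toward classes with higher mean difficulty while still guaranteeing every class at least one sample. A method such as NUCS would presumably also adapt the window position per class rather than use a single fixed `center`.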