Multimodal-Guided Dynamic Dataset Pruning for Robust and Efficient Data-Centric Learning

📅 2025-07-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Real-world datasets commonly suffer from heterogeneous data quality and sample redundancy. Existing dataset pruning methods largely rely on static heuristics or task-specific metrics, which limits their generalizability and robustness. To address this, we propose a dynamic dataset pruning framework that jointly models task difficulty estimation and cross-modal semantic consistency, presented as the first such integration. Leveraging pre-trained multimodal foundation models (e.g., CLIP, Flamingo), our method generates fine-grained, supervision-free signals that guide adaptive sample selection via semantic alignment. It requires no human annotations or task-specific fine-tuning and supports cross-domain transfer. Extensive experiments demonstrate consistent improvements across multiple vision and multimodal benchmarks: +1.2–2.8% accuracy gain, 1.3–1.7× training speedup, and enhanced out-of-distribution robustness. Our approach establishes a general, efficient, and scalable paradigm for data-centric learning.
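
As a rough illustration of what a cross-modal consistency signal could look like, the sketch below scores each image against its paired text by cosine similarity of their embeddings. It assumes the embeddings were already produced by a CLIP-style encoder pair; the function names and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector carries no alignment signal
    return dot / (norm_u * norm_v)


def consistency_scores(image_embs, text_embs):
    """One alignment score per (image, paired text) sample."""
    return [cosine_similarity(i, t) for i, t in zip(image_embs, text_embs)]


# Toy embeddings standing in for encoder outputs (hypothetical values).
image_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_embs = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]

scores = consistency_scores(image_embs, text_embs)
print(scores)
```

Under this assumed scoring, a sample whose image and text embeddings point in the same direction scores near 1, while a mismatched pair scores near 0 and becomes a candidate for pruning.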

📝 Abstract
Modern deep models are trained on large real-world datasets, where data quality varies and redundancy is common. Data-centric approaches such as dataset pruning have shown promise in improving training efficiency and model performance. However, most existing methods rely on static heuristics or task-specific metrics, limiting their robustness and generalizability across domains. In this work, we introduce a dynamic dataset pruning framework that adaptively selects training samples based on both task-driven difficulty and cross-modality semantic consistency. By incorporating supervision from pretrained multimodal foundation models, our approach captures training dynamics while effectively filtering out uninformative samples. Our work highlights the potential of integrating cross-modality alignment for robust sample selection, advancing data-centric learning toward more efficient and robust practices across application domains.
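
The abstract does not give the selection rule, so the following is only a minimal sketch of how task-driven difficulty and cross-modality consistency might be combined and re-evaluated every epoch. The min-max normalization, the weight `alpha`, the `keep_ratio` parameter, and the function names are all assumptions made for illustration.

```python
def min_max(values):
    """Rescale values to [0, 1]; constant inputs map to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def select_samples(losses, consistency, keep_ratio=0.5, alpha=0.5):
    """Return sorted indices of the samples kept for the next epoch.

    Assumed score: alpha * normalized loss (difficulty)
                   + (1 - alpha) * normalized consistency.
    """
    if len(losses) != len(consistency):
        raise ValueError("losses and consistency must have the same length")
    difficulty = min_max(losses)
    alignment = min_max(consistency)
    scores = [alpha * d + (1 - alpha) * c
              for d, c in zip(difficulty, alignment)]
    n_keep = max(1, round(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


# Consistency is fixed (it comes from a frozen pretrained model), while
# per-sample losses change as training progresses (toy values).
consistency = [0.9, 0.1, 0.8, 0.7]
epoch_losses = [
    [2.0, 2.5, 0.3, 1.0],  # epoch 0
    [0.3, 2.6, 1.8, 0.2],  # epoch 1
]

kept_per_epoch = [select_samples(l, consistency) for l in epoch_losses]
print(kept_per_epoch)
```

In this toy run the kept subset changes between epochs as the losses shift, which is the sense in which the pruning is dynamic. Sample 1 is never selected despite having the highest loss, because its low consistency suggests a noisy or mismatched pair rather than a usefully hard one.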
Problem

Research questions and friction points this paper is trying to address.

Dynamic dataset pruning for efficient training
Improving robustness via cross-modality consistency
Adaptive sample selection using multimodal guidance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic dataset pruning framework adaptively selects samples
Uses task difficulty and cross-modality consistency
Leverages pretrained multimodal models for supervision