From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios

📅 2025-06-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Dense prediction methods generalize poorly to real-world scenarios and are held back by the scarcity of high-quality annotated data. To address these challenges, we introduce DenseWorld, a unified benchmark for real-world dense prediction that spans 25 tasks tied to practical applications. We further propose DenseDiT, a lightweight universal model that combines pretrained generative visual priors, a parameter-reuse mechanism, and two lightweight branches that adaptively integrate multi-scale context. DenseDiT adds fewer than 0.1% additional parameters and is trained with less than 0.01% of the training data used by the baselines. On DenseWorld, existing general-purpose and task-specific baselines show significant performance drops, while DenseDiT achieves superior results across diverse dense prediction tasks, which underscores its practical value for real-world deployment.

📝 Abstract
Dense prediction tasks are of significant importance in computer vision; they aim to learn a pixel-wise annotated label for an input image. Despite advances in this field, existing methods primarily focus on idealized conditions, generalize poorly to real-world scenarios, and face a challenging scarcity of real-world data. To systematically study this problem, we first introduce DenseWorld, a benchmark spanning a broad set of 25 dense prediction tasks that correspond to urgent real-world applications, featuring unified evaluation across tasks. Then, we propose DenseDiT, which maximally exploits generative models' visual priors to perform diverse real-world dense prediction tasks through a unified strategy. DenseDiT combines a parameter-reuse mechanism and two lightweight branches that adaptively integrate multi-scale context, working with less than 0.1% additional parameters. Evaluations on DenseWorld reveal significant performance drops in existing general and specialized baselines, highlighting their limited real-world generalization. In contrast, DenseDiT achieves superior results using less than 0.01% of the baselines' training data, underscoring its practical value for real-world deployment. Our data, checkpoints, and code are available at https://xcltql666.github.io/DenseDiTProj

Problem

Research questions and friction points this paper is trying to address.

Addressing limited generalization in dense prediction for real-world scenarios
Overcoming data scarcity in real-world dense prediction tasks
Unifying diverse dense prediction tasks with a single efficient model

Innovation

Methods, ideas, or system contributions that make the work stand out.

DenseDiT uses generative models' visual priors
Unified strategy with parameter-reuse mechanism
Lightweight branches integrate multi-scale context
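
The two budgets quoted in the abstract (less than 0.1% additional parameters, less than 0.01% of the baselines' training data) can be made concrete with a little arithmetic. The sketch below is only illustrative: every size in it is a hypothetical placeholder rather than a figure from the paper, and it assumes the parameter percentage is measured relative to the pretrained backbone, which the abstract does not state explicitly.

```python
def fraction_percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


def within_budget(part: int, whole: int, limit_percent: float) -> bool:
    """True when `part` is strictly less than `limit_percent` of `whole`."""
    return fraction_percent(part, whole) < limit_percent


# Hypothetical sizes, chosen only for illustration (NOT taken from the paper).
BACKBONE_PARAMS = 12_000_000_000  # assumed size of the pretrained generative backbone
ADDED_PARAMS = 8_000_000          # assumed parameters in the two lightweight branches
BASELINE_IMAGES = 1_000_000       # assumed training-set size of a baseline
OUR_IMAGES = 15                   # assumed number of training images per task

# Check the assumed sizes against the two limits quoted in the abstract.
param_ok = within_budget(ADDED_PARAMS, BACKBONE_PARAMS, 0.1)
data_ok = within_budget(OUR_IMAGES, BASELINE_IMAGES, 0.01)

print(f"added parameters: {fraction_percent(ADDED_PARAMS, BACKBONE_PARAMS):.4f}% (under 0.1%: {param_ok})")
print(f"training data:    {fraction_percent(OUR_IMAGES, BASELINE_IMAGES):.4f}% (under 0.01%: {data_ok})")
```

Under these assumed sizes both checks pass. Substituting the real backbone and dataset sizes would show how much absolute headroom the quoted percentages leave.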