AnyUp: Universal Feature Upsampling

📅 2025-10-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing learning-based feature upsamplers require encoder-specific training for each vision encoder (e.g., DINO, CLIP) and therefore generalize poorly to new feature types. This paper proposes AnyUp, a plug-and-play, feature-agnostic upsampling framework that operates at inference time on features from encoders it was never trained on. Its core idea is to decouple the feature representation from the upsampling process: a lightweight network combined with channel-wise normalization and multi-scale reconstruction enables high-fidelity upsampling of arbitrary-resolution features across heterogeneous architectures, without relying on encoder-specific priors. Experiments show that AnyUp sets a new state of the art on downstream tasks such as semantic segmentation and object detection, improving semantic fidelity and computational efficiency while generalizing well across encoders and remaining easy to deploy.
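The summary above attributes the feature-agnostic behavior partly to channel-wise normalization: standardizing each channel puts features from any encoder (regardless of dimensionality or scale) on a common footing before upsampling. The following is a minimal NumPy sketch of that normalization step paired with a plain bilinear upsampler as a baseline; the function names and this particular implementation are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def channelwise_normalize(feats, eps=1e-6):
    """Standardize each channel of a (C, H, W) feature map.

    This makes downstream processing independent of the encoder's
    feature scale, one ingredient of feature-agnostic upsampling
    as described in the summary (sketch, not the paper's code).
    """
    mean = feats.mean(axis=(1, 2), keepdims=True)
    std = feats.std(axis=(1, 2), keepdims=True)
    return (feats - mean) / (std + eps)

def bilinear_upsample(feats, factor):
    """Bilinearly upsample (C, H, W) -> (C, factor*H, factor*W).

    A simple baseline: learned upsamplers aim to beat this by
    recovering detail that interpolation blurs away.
    """
    C, H, W = feats.shape
    # Sample positions in the source grid (align-centers convention).
    ys = (np.arange(H * factor) + 0.5) / factor - 0.5
    xs = (np.arange(W * factor) + 0.5) / factor - 0.5
    y0 = np.clip(np.floor(ys).astype(int), 0, H - 1)
    x0 = np.clip(np.floor(xs).astype(int), 0, W - 1)
    y1 = np.clip(y0 + 1, 0, H - 1)
    x1 = np.clip(x0 + 1, 0, W - 1)
    wy = np.clip(ys - y0, 0.0, 1.0)[None, :, None]
    wx = np.clip(xs - x0, 0.0, 1.0)[None, None, :]
    # Gather the four neighbors and blend.
    f00 = feats[:, y0][:, :, x0]
    f01 = feats[:, y0][:, :, x1]
    f10 = feats[:, y1][:, :, x0]
    f11 = feats[:, y1][:, :, x1]
    top = f00 * (1 - wx) + f01 * wx
    bot = f10 * (1 - wx) + f11 * wx
    return top * (1 - wy) + bot * wy
```

Because the normalization is purely per-channel statistics, the same code path handles a 384-dim DINO feature map or a 768-dim CLIP one without change, which is the property the summary calls "feature-agnostic."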

📝 Abstract
We introduce AnyUp, a method for feature upsampling that can be applied to any vision feature at any resolution, without encoder-specific training. Existing learning-based upsamplers for features like DINO or CLIP need to be re-trained for every feature extractor and thus do not generalize to different feature types at inference time. In this work, we propose an inference-time feature-agnostic upsampling architecture to alleviate this limitation and improve upsampling quality. In our experiments, AnyUp sets a new state of the art for upsampled features, generalizes to different feature types, and preserves feature semantics while being efficient and easy to apply to a wide range of downstream tasks.
Problem

Research questions and friction points this paper is trying to address.

How to upsample any vision feature at any resolution with a single model
Existing learned upsamplers must be retrained for every feature extractor
Poor generalization across feature types at inference time
Innovation

Methods, ideas, or system contributions that make the work stand out.

Universal feature upsampling without encoder-specific training
Inference-time feature-agnostic upsampling architecture
Generalizes across feature types while preserving semantics