🤖 AI Summary
This work addresses the limited generalization of existing depth models—pretrained on perspective images—when applied to 360° panoramic imagery, a challenge exacerbated by the data-intensive nature of full fine-tuning. To overcome this, we propose a lightweight self-modulation framework that integrates perspective prior preservation with panoramic domain adaptation. Our approach leverages a geometry-aligned guidance module and a self-conditioned AdaLN-Zero mechanism to effectively transfer perspective priors using only 1% of the panoramic training data. By introducing dual modulation based on equirectangular projection (ERP) and cubemap projection (CP), along with a cross-projection consistency loss, our method achieves approximately 20% lower RMSE than standard fine-tuning under identical training conditions, significantly reducing data requirements while improving cross-projection consistency.
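The self-conditioned AdaLN-Zero mechanism described above can be illustrated with a minimal PyTorch sketch. This is a hypothetical reconstruction, not the paper's released code: the module name, the use of a 1×1 convolution to predict per-pixel scale/shift from a guidance signal, and the residual formulation are all assumptions. The key property it demonstrates is the "Zero" initialization — the modulation branch starts as an exact identity, so the pretrained perspective features pass through unchanged at the beginning of adaptation.

```python
import torch
import torch.nn as nn


class SelfConditionedAdaLNZero(nn.Module):
    """Hypothetical sketch of a self-conditioned AdaLN-Zero block.

    A zero-initialized 1x1 conv maps a guidance feature map to per-pixel
    scale and shift terms. Because the conv's weights and bias start at
    zero, the block is an identity at initialization, preserving the
    pretrained (perspective-domain) feature distribution.
    """

    def __init__(self, channels: int, guidance_channels: int):
        super().__init__()
        # Parameter-free normalization of the backbone features.
        self.norm = nn.GroupNorm(1, channels, affine=False)
        # Predicts concatenated (scale, shift), one pair per pixel.
        self.to_scale_shift = nn.Conv2d(guidance_channels, 2 * channels, kernel_size=1)
        # "Zero" init: modulation contributes nothing until trained.
        nn.init.zeros_(self.to_scale_shift.weight)
        nn.init.zeros_(self.to_scale_shift.bias)

    def forward(self, x: torch.Tensor, guidance: torch.Tensor) -> torch.Tensor:
        scale, shift = self.to_scale_shift(guidance).chunk(2, dim=1)
        # Residual modulation: identity when scale = shift = 0.
        return x + scale * self.norm(x) + shift
```

At initialization the output equals the input exactly, which is what lets adaptation start from the intact pretrained model rather than a perturbed one.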
📝 Abstract
Recent depth foundation models trained on perspective imagery achieve strong performance, yet generalize poorly to 360$^\circ$ images due to the substantial geometric discrepancy between the perspective and panoramic domains. Moreover, fully fine-tuning these models typically requires large amounts of panoramic data. To address these issues, we propose RePer-360, a distortion-aware self-modulation framework for monocular panoramic depth estimation that adapts depth foundation models while preserving their powerful pretrained perspective priors. Specifically, we design a lightweight geometry-aligned guidance module that derives a modulation signal from two complementary projections, equirectangular projection (ERP) and cubemap projection (CP), and uses it to guide the model toward the panoramic domain without overwriting its pretrained perspective knowledge. We further introduce a Self-Conditioned AdaLN-Zero mechanism that produces pixel-wise scaling factors to reduce the feature distribution gap between the perspective and panoramic domains. In addition, a cubemap-domain consistency loss further improves training stability and cross-projection alignment. By shifting the focus from complementary-projection fusion to panoramic domain adaptation under preserved pretrained perspective priors, RePer-360 surpasses standard fine-tuning methods while using only 1\% of the training data. Under the same in-domain training setting, it further achieves an approximately 20\% improvement in RMSE. Code will be released upon acceptance.
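To make the cross-projection consistency idea concrete, the following is a minimal sketch, assuming standard ERP and cubemap conventions: each cubemap-face pixel is cast as a 3D ray, converted to longitude/latitude, and used to resample the ERP depth map via `grid_sample`; an L1 penalty then compares the resampled ERP depth against the cubemap-branch predictions. The function names, axis conventions, and the exact loss form are illustrative assumptions, not the paper's implementation.

```python
import math

import torch
import torch.nn.functional as F


def erp_to_cube_face(erp: torch.Tensor, face: str, face_size: int) -> torch.Tensor:
    """Resample one cubemap face from an equirectangular (ERP) map.

    erp: (B, C, H, W) with row 0 at latitude +pi/2 (an assumed convention).
    face: one of 'front', 'back', 'right', 'left', 'up', 'down'.
    """
    B = erp.shape[0]
    # Face-plane pixel grid in [-1, 1].
    u = torch.linspace(-1.0, 1.0, face_size)
    v = torch.linspace(-1.0, 1.0, face_size)
    vv, uu = torch.meshgrid(v, u, indexing="ij")
    ones = torch.ones_like(uu)
    # 3D ray per face pixel (y up, z forward — an assumed axis convention).
    x, y, z = {
        "front": (uu, -vv, ones),
        "back": (-uu, -vv, -ones),
        "right": (ones, -vv, -uu),
        "left": (-ones, -vv, uu),
        "up": (uu, ones, vv),
        "down": (uu, -ones, -vv),
    }[face]
    lon = torch.atan2(x, z)                            # [-pi, pi]
    lat = torch.atan2(y, torch.sqrt(x * x + z * z))    # [-pi/2, pi/2]
    # Normalized ERP coordinates for grid_sample: x in [-1, 1] over longitude,
    # y = -1 at the north pole (row 0).
    gx = lon / math.pi
    gy = -lat / (math.pi / 2)
    grid = torch.stack([gx, gy], dim=-1).unsqueeze(0).expand(B, -1, -1, -1)
    return F.grid_sample(erp, grid, align_corners=True)


def cross_projection_consistency(erp_depth: torch.Tensor,
                                 cube_depths: dict,
                                 face_size: int) -> torch.Tensor:
    """Hypothetical L1 consistency between the two projection branches."""
    loss = 0.0
    for face, d_cp in cube_depths.items():
        d_erp = erp_to_cube_face(erp_depth, face, face_size)
        loss = loss + (d_erp - d_cp).abs().mean()
    return loss / len(cube_depths)
```

The loss is zero whenever the cubemap branch agrees with the ERP branch resampled onto each face, which is the alignment the consistency term encourages during training.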