🤖 AI Summary
Single-image 2.5D relief (depth/normal) estimation suffers from poor robustness under complex materials and lighting conditions, a problem compounded by the severe scarcity of authentic ground-truth annotations. Method: This paper proposes a pseudo-real–real collaborative end-to-end learning framework. It leverages text-to-image diffusion models to synthesize large-scale, diverse pseudo-real image–relief pairs; integrates joint depth–normal modeling with multi-view reconstruction to generate high-fidelity pseudo-labels; and incorporates a small set of real captured data via a progressive training strategy. Contribution/Results: The method achieves state-of-the-art performance on both depth and normal prediction benchmarks, significantly outperforming existing approaches. It demonstrates strong generalization and practical utility in downstream applications, including texture transfer and relighting, validating its effectiveness in realistic scenarios.
📝 Abstract
This paper presents MonoRelief V2, an end-to-end model designed to directly recover 2.5D reliefs from single images under complex material and illumination variations. In contrast to its predecessor, MonoRelief V1 [1], which was trained solely on synthetic data, MonoRelief V2 incorporates real data to achieve improved robustness, accuracy, and efficiency. To overcome the challenge of acquiring a large-scale real-world dataset, we generate approximately 15,000 pseudo-real images using a text-to-image generative model and derive corresponding depth pseudo-labels through fusion of depth and normal predictions. Furthermore, we construct a small-scale real-world dataset (800 samples) via multi-view reconstruction and detail refinement. MonoRelief V2 is then progressively trained on the pseudo-real and real-world datasets. Comprehensive experiments demonstrate its state-of-the-art performance in both depth and normal prediction, highlighting its strong potential for a range of downstream applications. Code is at: https://github.com/glp1001/MonoreliefV2.
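The abstract mentions deriving depth pseudo-labels by fusing depth and normal predictions. The paper does not spell out the fusion scheme here, but a common formulation is a screened-Poisson least-squares problem: find a depth field whose gradients match the gradients implied by the normal map while staying close to the predicted depth. The sketch below is an illustrative assumption (function name, regularization weight `lam`, and the Jacobi solver are ours, not from the paper):

```python
import numpy as np

def fuse_depth_normals(depth, normals, lam=0.5, iters=2000):
    """Illustrative depth-normal fusion (not the paper's exact method).

    Minimizes  ||grad z - g||^2 + lam * ||z - depth||^2,
    where the target gradient g = (-nx/nz, -ny/nz) comes from the
    normal map. Solved with plain Jacobi iterations; edge-replicate
    padding approximates Neumann boundary conditions.
    """
    d = depth.astype(np.float64)
    nz = np.clip(normals[..., 2], 1e-3, None)  # avoid division by zero
    gx = -normals[..., 0] / nz                 # target dz/dx
    gy = -normals[..., 1] / nz                 # target dz/dy

    z = d.copy()
    for _ in range(iters):
        zp = np.pad(z, 1, mode="edge")
        gxp = np.pad(gx, 1, mode="edge")
        gyp = np.pad(gy, 1, mode="edge")
        # sum of the four neighbors of each pixel
        nb = (zp[1:-1, :-2] + zp[1:-1, 2:] +
              zp[:-2, 1:-1] + zp[2:, 1:-1])
        # divergence of the target gradient field (backward differences)
        div = (gxp[1:-1, :-2] - gx) + (gyp[:-2, 1:-1] - gy)
        # Jacobi update for (4 + lam) z = neighbors + div + lam * depth
        z = (nb + div + lam * d) / (4.0 + lam)
    return z
```

For a flat region (normals all pointing along +z, constant predicted depth) the fused result reproduces the input depth exactly; in general the normal term sharpens fine detail while the depth term anchors the low-frequency shape.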