IGAF: Incremental Guided Attention Fusion for Depth Super-Resolution

📅 2024-12-24
🏛️ Italian National Conference on Sensors
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the loss of fine detail in low-resolution depth maps, this paper proposes an RGB-guided depth super-resolution method. The core innovation is the Incremental Guided Attention Fusion (IGAF) module, which dynamically models cross-modal feature dependencies between RGB and depth inputs, enabling robust multi-scale reconstruction (×4/×8/×16) and zero-shot generalization. Built on a deep CNN backbone, the method combines joint channel-spatial attention, progressive feature alignment, and multi-scale supervised losses. It achieves state-of-the-art results on NYU v2 and, without retraining, demonstrates strong zero-shot transfer, outperforming all existing baselines on the Middlebury, Lu, and RGB-D-D datasets. These gains benefit depth estimation in practical applications such as robotic perception, navigation, and medical imaging.

📝 Abstract
Accurate depth estimation is crucial for many fields, including robotics, navigation, and medical imaging. However, conventional depth sensors often produce low-resolution (LR) depth maps, making detailed scene perception challenging. To address this, enhancing LR depth maps to high-resolution (HR) ones has become essential, guided by HR-structured inputs like RGB or grayscale images. We propose a novel sensor fusion methodology for guided depth super-resolution (GDSR), a technique that combines LR depth maps with HR images to estimate detailed HR depth maps. Our key contribution is the Incremental Guided Attention Fusion (IGAF) module, which effectively learns to fuse features from RGB images and LR depth maps, producing accurate HR depth maps. Using IGAF, we build a robust super-resolution model and evaluate it on multiple benchmark datasets. Our model achieves state-of-the-art results compared to all baseline models on the NYU v2 dataset for ×4, ×8, and ×16 upsampling. It also outperforms all baselines in a zero-shot setting on the Middlebury, Lu, and RGB-D-D datasets. Code, environments, and models are available on GitHub.
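The abstract's core idea, upsampling an LR depth map while using attention weights derived from an aligned HR RGB guide to decide where to trust the guide's structure, can be illustrated with a toy sketch. This is not the paper's IGAF module (which learns its attention with a deep CNN); it is a minimal hand-crafted analogue in numpy, where `guided_attention_fuse` and `box_blur` are hypothetical names, and the attention map is simply a sigmoid gate over the guide's edge strength:

```python
import numpy as np

def box_blur(img, k=3):
    """Simple k x k box filter with edge padding (stand-in for learned smoothing)."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def guided_attention_fuse(depth_lr, rgb_hr, scale):
    """Toy RGB-guided depth upsampling (illustrative only, not the authors' IGAF).

    depth_lr: (h, w) low-res depth map
    rgb_hr:   (h*scale, w*scale, 3) high-res RGB guide
    """
    # 1. Nearest-neighbour upsample the LR depth to the guide's resolution.
    depth_up = np.kron(depth_lr, np.ones((scale, scale)))
    # 2. Derive a spatial attention map from the guide's edge strength:
    #    strong RGB edges suggest depth discontinuities worth preserving.
    gray = rgb_hr.mean(axis=2)
    gy, gx = np.gradient(gray)
    edge = np.sqrt(gx**2 + gy**2)
    attn = 1.0 / (1.0 + np.exp(-(edge - edge.mean())))  # sigmoid gate in (0, 1)
    # 3. Fuse: keep the sharp upsampled values near guide edges, and blend in
    #    a smoothed depth elsewhere to suppress blocky upsampling artifacts.
    smooth = box_blur(depth_up, k=3)
    return attn * depth_up + (1.0 - attn) * smooth
```

In the paper this fusion is learned and applied incrementally across feature scales (hence "incremental" in IGAF), whereas the sketch applies a single fixed gate at the output resolution.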
Problem

Research questions and friction points this paper is trying to address.

Depth Image Enhancement
High Resolution Imaging
Deep Learning for Robotics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Incremental Guided Attention Fusion (IGAF)
Depth Super-Resolution
State-of-the-art Performance