🤖 AI Summary
This work proposes a lightweight three-stage framework for self-supervised monocular depth estimation that targets two problems: high computational cost and poor preservation of fine detail. The framework combines local feature extraction, hierarchical feature enhancement, and global feature purification through three new modules: Shuffle-Dilation Convolution (SDC), Rotation-Adaptive Kernel Attention (RAKA), and Deep Frequency Signal Purification (DFSP). Together, these modules reduce the parameter count while improving the reconstruction of structure and fine-grained detail. Experiments show that the approach achieves state-of-the-art performance at low computational cost, keeping depth prediction both efficient and faithful.
📝 Abstract
We propose PuriLight, a lightweight and efficient framework for self-supervised monocular depth estimation that addresses the dual challenges of computational efficiency and detail preservation. Although recent advances in self-supervised depth estimation have reduced reliance on ground-truth supervision, existing approaches remain constrained: bulky architectures compromise practicality, while lightweight models sacrifice structural precision. These limitations underscore the need for architectures that are both lightweight and structurally precise. Our framework addresses them with a three-stage architecture built from three novel modules: the Shuffle-Dilation Convolution (SDC) module for local feature extraction, the Rotation-Adaptive Kernel Attention (RAKA) module for hierarchical feature enhancement, and the Deep Frequency Signal Purification (DFSP) module for global feature purification. Working together, these modules enable PuriLight to extract and process features both efficiently and accurately. Extensive experiments demonstrate that PuriLight achieves state-of-the-art performance with few trainable parameters while maintaining high computational efficiency. Code will be available at https://github.com/ishrouder/PuriLight.