AI Summary
In resource-constrained robotic vision systems, conventional image signal processing (ISP) pipelines incur substantial hardware overhead and high memory consumption when converting raw Bayer images to RGB for downstream tasks. Method: This paper proposes what the authors describe as the first end-to-end keypoint detection and local feature description framework that operates directly on raw Bayer data, bypassing the ISP entirely. Two specialized convolutional kernels preserve the inherent channel structure of Bayer mosaics, enabling native raw-domain feature learning without RGB demosaicing. A lightweight network architecture is developed to support ISP-free processing. Contribution/Results: Experiments show higher keypoint detection accuracy and more stable matching under large rotations and scale variations. The network outperforms existing algorithms on raw images while reducing hardware requirements and memory consumption, which suits it to embedded robotic vision applications.
Abstract
Keypoint detection and local feature description are fundamental tasks in robotic perception, critical for applications such as SLAM, robot localization, feature matching, pose estimation, and 3D mapping. While existing methods predominantly operate on RGB images, we propose a novel network that directly processes raw images, bypassing the need for the Image Signal Processor (ISP). This approach significantly reduces hardware requirements and memory consumption, which is crucial for robotic vision systems. Our method introduces two custom-designed convolutional kernels capable of performing convolutions directly on raw images, preserving inter-channel information without converting to RGB. Experimental results show that our network outperforms existing algorithms on raw images, achieving higher accuracy and stability under large rotations and scale variations. This work represents the first attempt to develop a keypoint detection and feature description network specifically for raw images, offering a more efficient solution for resource-constrained environments.
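To make the idea of convolving on a raw mosaic without converting to RGB more concrete, the following is a minimal sketch and not the paper's kernel design, which the abstract does not specify. It assumes an RGGB Bayer layout and uses two hypothetical helpers, `split_bayer_rggb` and `bayer_aligned_conv`. The point it illustrates is that a 2x2 window moved with stride 2 always places each weight on the same colour site, so per-channel information stays separated without demosaicing.

```python
# Illustrative sketch only: NOT the kernels proposed in the paper.
# Assumptions: RGGB Bayer layout, even image dimensions, hypothetical function names.

def split_bayer_rggb(mosaic):
    """Split an H x W RGGB mosaic into four half-resolution colour planes."""
    h, w = len(mosaic), len(mosaic[0])
    if h % 2 or w % 2:
        raise ValueError("mosaic height and width must be even")
    planes = {"R": [], "G1": [], "G2": [], "B": []}
    for y in range(0, h, 2):
        planes["R"].append([mosaic[y][x] for x in range(0, w, 2)])
        planes["G1"].append([mosaic[y][x + 1] for x in range(0, w, 2)])
        planes["G2"].append([mosaic[y + 1][x] for x in range(0, w, 2)])
        planes["B"].append([mosaic[y + 1][x + 1] for x in range(0, w, 2)])
    return planes


def bayer_aligned_conv(mosaic, weights):
    """Apply a 2x2 kernel with stride 2 over the mosaic.

    weights is [[w_R, w_G1], [w_G2, w_B]]. Because the stride equals the
    Bayer period, each weight always multiplies the same colour site.
    The output has shape (H/2) x (W/2).
    """
    h, w = len(mosaic), len(mosaic[0])
    if h % 2 or w % 2:
        raise ValueError("mosaic height and width must be even")
    out = []
    for y in range(0, h, 2):
        row = []
        for x in range(0, w, 2):
            row.append(
                weights[0][0] * mosaic[y][x]
                + weights[0][1] * mosaic[y][x + 1]
                + weights[1][0] * mosaic[y + 1][x]
                + weights[1][1] * mosaic[y + 1][x + 1]
            )
        out.append(row)
    return out


# Toy 4x4 mosaic: values 1x are R sites, 2x are G1, 3x are G2, 4x are B.
mosaic = [
    [10, 20, 11, 21],
    [30, 40, 31, 41],
    [12, 22, 13, 23],
    [32, 42, 33, 43],
]
planes = split_bayer_rggb(mosaic)
# A kernel that averages the two green sites in each 2x2 Bayer cell.
green_avg = bayer_aligned_conv(mosaic, [[0, 0.5], [0.5, 0]])
```

The raw mosaic stores one value per pixel, whereas a demosaiced RGB image stores three, which is one reason skipping the ISP can reduce memory use. A learned network would replace the fixed weights here with trainable ones.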