RayMamba: Ray-Aligned Serialization for Long-Range 3D Object Detection

Date: 2026-04-03
AI Summary
This work addresses long-range 3D object detection, where LiDAR point clouds become extremely sparse and fragmented, hindering effective modeling of long-range context. The authors propose RayMamba, a geometry-aware, plug-and-play module built around a ray-aligned voxel serialization strategy. The strategy organizes sparse voxels into directionally continuous sector sequences, preserving occlusion-related context and geometric structure. Built on state space models (SSMs) and the Mamba architecture, RayMamba integrates into both LiDAR-only and multimodal 3D detectors. Experiments show consistent gains: on nuScenes, it improves mAP by up to 2.49 and NDS by up to 1.59 in the 40–50 m range, and on Argoverse 2, it raises VoxelNeXt's mAP from 30.3 to 31.2.
Abstract
Long-range 3D object detection remains challenging because LiDAR observations become highly sparse and fragmented in the far field, making reliable context modeling difficult for existing detectors. Recent state space model (SSM)-based methods have improved long-range modeling efficiency, but their effectiveness is still limited by generic serialization strategies that fail to preserve meaningful contextual neighborhoods in sparse scenes. To address this issue, we propose RayMamba, a geometry-aware plug-and-play enhancement for voxel-based 3D detectors. RayMamba organizes sparse voxels into sector-wise ordered sequences through a ray-aligned serialization strategy, which preserves directional continuity and occlusion-related context for subsequent Mamba-based modeling. It is compatible with both LiDAR-only and multimodal detectors, while introducing only modest overhead. Extensive experiments on nuScenes and Argoverse 2 demonstrate consistent improvements across strong baselines. In particular, RayMamba achieves gains of up to 2.49 mAP and 1.59 NDS in the challenging 40–50 m range on nuScenes, and further improves VoxelNeXt on Argoverse 2 from 30.3 to 31.2 mAP.
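To make the idea of sector-wise, ray-aligned ordering concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not specify how sectors are defined or how voxels are ordered within a sector. The sketch assumes sensor-centered bird's-eye-view voxel centers, equal-angle sectors, and near-to-far ordering inside each sector. The function name `ray_aligned_order` and the parameter `num_sectors` are hypothetical.

```python
import numpy as np


def ray_aligned_order(xy, num_sectors=8):
    """Illustrative ordering: group voxels by azimuth sector, then sort by range.

    xy: (N, 2) voxel centers in a sensor-centered BEV frame (assumption).
    num_sectors: hypothetical number of equal-angle sectors.
    Returns (order, sector): a permutation of voxel indices and each
    voxel's sector id.
    """
    xy = np.asarray(xy, dtype=np.float64)
    azimuth = np.arctan2(xy[:, 1], xy[:, 0])   # angle in [-pi, pi]
    rng = np.hypot(xy[:, 0], xy[:, 1])         # distance from the sensor
    # Map azimuth to a sector id in [0, num_sectors).
    sector = np.floor((azimuth + np.pi) / (2.0 * np.pi) * num_sectors)
    sector = np.clip(sector.astype(np.int64), 0, num_sectors - 1)
    # np.lexsort treats the LAST key as primary: sector first, then range.
    order = np.lexsort((rng, sector))
    return order, sector


# Five voxels: two far/near pairs along roughly the same rays, plus one other.
voxels = np.array([[10.0, 0.1], [2.0, 0.1], [-5.0, -0.1], [-1.0, -0.1], [0.1, -3.0]])
order, sector = ray_aligned_order(voxels, num_sectors=4)
serialized = voxels[order]  # sequence that a Mamba-style block could consume
```

Under these assumptions, voxels that lie along similar viewing directions end up adjacent in the sequence, with nearer voxels preceding the farther ones they may occlude. A generic space-filling-curve ordering does not guarantee that adjacency.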
Problem

Research questions and friction points this paper is trying to address.

long-range 3D object detection
LiDAR sparsity
context modeling
sparse scenes
3D detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Ray-aligned serialization
State Space Model
Long-range 3D object detection
Sparse voxel modeling
Geometry-aware sequence ordering
Cheng Lu
School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing 210094, China
Mingqian Ji
School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing 210094, China
Shanshan Zhang
School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing 210094, China
Zhihao Li
The Hong Kong University of Science and Technology (Guangzhou)
AI for Science, AI for PDE, Graph Neural Networks
Jian Yang
Prof. of Computer Science, Nanjing University of Science and Technology
Pattern Recognition, Computer Vision, Biometrics