🤖 AI Summary
To address the challenge of simultaneously achieving high fidelity, consistency, and incrementality in geometric reconstruction and appearance rendering for LiDAR–vision SLAM, this paper proposes Point-Based Neural Maps—a unified representation integrating explicit point clouds, implicit signed distance fields (SDFs), and explicit differentiable Gaussian radiance fields. Bidirectional geometric–photometric consistency constraints enable mutual refinement between SDFs and Gaussian fields. Coupled with multi-modal joint optimization within a tightly integrated LiDAR–vision SLAM framework, our method constructs globally consistent, compact, and incrementally updatable joint representations for large-scale scenes. Experiments demonstrate significant improvements: novel-view synthesis PSNR increases by 2.1 dB; SDF reconstruction error decreases by 37%; absolute trajectory error (ATE) reduces by 28%; and mesh reconstruction completeness is substantially enhanced.
📝 Abstract
Robots require high-fidelity reconstructions of their environment for effective operation. Such scene representations should be both, geometrically accurate and photorealistic to support downstream tasks. While this can be achieved by building distance fields from range sensors and radiance fields from cameras, the scalable incremental mapping of both fields consistently and at the same time with high quality remains challenging. In this paper, we propose a novel map representation that unifies a continuous signed distance field and a Gaussian splatting radiance field within an elastic and compact point-based implicit neural map. By enforcing geometric consistency between these fields, we achieve mutual improvements by exploiting both modalities. We devise a LiDAR-visual SLAM system called PINGS using the proposed map representation and evaluate it on several challenging large-scale datasets. Experimental results demonstrate that PINGS can incrementally build globally consistent distance and radiance fields encoded with a compact set of neural points. Compared to the state-of-the-art methods, PINGS achieves superior photometric and geometric rendering at novel views by leveraging the constraints from the distance field. Furthermore, by utilizing dense photometric cues and multi-view consistency from the radiance field, PINGS produces more accurate distance fields, leading to improved odometry estimation and mesh reconstruction.