Gallant: Voxel Grid-based Humanoid Locomotion and Local-navigation across 3D Constrained Terrains

📅 2025-11-18
🤖 AI Summary
Existing methods rely on depth images or elevation maps, which provide only local, flattened (2D planar) perception. This is insufficient for robust humanoid navigation in complex 3D constrained environments such as multi-level stairs, narrow passages, and lateral obstacles. To address this, we propose a framework built on a voxel-grid representation of the full 3D environment. For the first time, we voxelize LiDAR point clouds and apply a z-axis grouped 2D CNN for end-to-end co-learning of perception and control. Integrated with a high-fidelity LiDAR simulation system, our approach enables globally consistent 3D structural modeling and policy optimization. Experiments demonstrate near-100% success rates on challenging tasks, including stair climbing and platform ascent, significantly surpassing conventional ground-centric perception paradigms. The method offers a scalable navigation approach for embodied agents on complex 3D terrains.

📝 Abstract
Robust humanoid locomotion requires accurate and globally consistent perception of the surrounding 3D environment. However, existing perception modules, mainly based on depth images or elevation maps, offer only partial and locally flattened views of the environment, failing to capture the full 3D structure. This paper presents Gallant, a voxel-grid-based framework for humanoid locomotion and local navigation in 3D constrained terrains. It leverages voxelized LiDAR data as a lightweight and structured perceptual representation, and employs a z-grouped 2D CNN to map this representation to the control policy, enabling fully end-to-end optimization. A high-fidelity LiDAR simulation that dynamically generates realistic observations is developed to support scalable, LiDAR-based training and ensure sim-to-real consistency. Experimental results show that Gallant's broader perceptual coverage allows a single policy to go beyond the ground-level obstacles that previous methods were confined to, extending to lateral clutter, overhead constraints, multi-level structures, and narrow passages. Through improved end-to-end optimization, Gallant is also the first to achieve near-100% success rates in challenging scenarios such as stair climbing and stepping onto elevated platforms.
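To make the "voxelized LiDAR data" idea concrete, here is a minimal sketch of turning a point cloud into a binary occupancy grid. It is not the paper's implementation: the function name `voxelize`, the grid origin, the 0.5 m voxel size, and the grid shape are all hypothetical values chosen for illustration, and the abstract does not state the actual resolution or extent.

```python
from math import floor


def voxelize(points, origin, voxel_size, grid_shape):
    """Return a binary occupancy grid indexed as grid[ix][iy][iz].

    All parameters are illustrative assumptions, not values from the paper.
    """
    nx, ny, nz = grid_shape
    grid = [[[0] * nz for _ in range(ny)] for _ in range(nx)]
    for point in points:
        # Integer voxel index along each axis, relative to the grid origin.
        idx = [floor((p - o) / voxel_size) for p, o in zip(point, origin)]
        # Points outside the grid volume are dropped.
        if all(0 <= i < n for i, n in zip(idx, grid_shape)):
            grid[idx[0]][idx[1]][idx[2]] = 1
    return grid


# Hypothetical scene: a 2 m x 2 m x 1 m box around the robot, 0.5 m voxels.
points = [
    (0.1, 0.1, 0.1),    # falls in voxel (2, 2, 0)
    (0.2, 0.1, 0.1),    # same voxel as the point above
    (-0.9, 0.6, 0.7),   # falls in voxel (0, 3, 1)
    (5.0, 0.0, 0.0),    # outside the grid, ignored
]
grid = voxelize(points, origin=(-1.0, -1.0, 0.0), voxel_size=0.5, grid_shape=(4, 4, 2))
occupied = sum(v for plane in grid for row in plane for v in row)
print(occupied)
```

Unlike an elevation map, which stores one height per (x, y) cell, this grid keeps a separate occupancy value for every height bin, so overhead and lateral structures remain distinguishable from the ground.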
Problem

Research questions and friction points this paper is trying to address.

Achieving robust humanoid locomotion across complex 3D constrained terrains
Overcoming limitations of partial environmental perception in existing methods
Enabling navigation through lateral clutter, overhead constraints and narrow passages
Innovation

Methods, ideas, or system contributions that make the work stand out.

Voxel grid-based framework for humanoid locomotion
Z-grouped 2D CNN mapping perception to control policy
High-fidelity LiDAR simulation enabling sim-to-real transfer
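The summary does not define "z-grouped" precisely. One plausible reading, assumed here, is that the z slices of the voxel grid are treated as input channels of a 2D convolution and split into channel groups, as in a standard grouped convolution. The sketch below only illustrates that bookkeeping: how slices could be partitioned and how grouping changes the weight count. The helper names and all sizes (16 slices, 4 groups, 32 output channels, 3x3 kernel) are hypothetical.

```python
def z_slice_groups(num_z, num_groups):
    """Partition z-slice indices 0..num_z-1 into contiguous, equal-size groups."""
    if num_z % num_groups != 0:
        raise ValueError("num_z must be divisible by num_groups")
    size = num_z // num_groups
    return [list(range(g * size, (g + 1) * size)) for g in range(num_groups)]


def conv2d_weight_count(c_in, c_out, kernel, groups=1):
    """Weights (bias excluded) of a 2D convolution with the given channel grouping."""
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    # Each output channel only sees the c_in / groups input channels of its group.
    return c_out * (c_in // groups) * kernel * kernel


# Hypothetical sizes, not taken from the paper.
NUM_Z, GROUPS, C_OUT, K = 16, 4, 32, 3
groups = z_slice_groups(NUM_Z, GROUPS)
dense = conv2d_weight_count(NUM_Z, C_OUT, K)
grouped = conv2d_weight_count(NUM_Z, C_OUT, K, groups=GROUPS)
print(groups[0], dense, grouped)
```

Under these assumptions, grouping cuts the first layer's weights by a factor equal to the number of groups, which is one reason a 2D CNN over height slices can stay lightweight compared with a dense alternative.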
Authors

Qingwei Ben
The Chinese University of Hong Kong
Robot Learning, Embodied AI, Humanoids

Botian Xu
Tsinghua University
Reinforcement learning, robotics

Kailin Li
Shanghai AI Lab
Computer Vision, 3D Vision, Embodied AI

Feiyu Jia
Shanghai Artificial Intelligence Laboratory, University of Science and Technology of China

Wentao Zhang
Institute of Physics, Chinese Academy of Sciences
Photoemission, superconductivity, cuprate, HTSC, time-resolved

Jingping Wang
Shanghai Artificial Intelligence Laboratory, Shanghai Jiaotong University

Jingbo Wang
Shanghai Artificial Intelligence Laboratory

Dahua Lin
The Chinese University of Hong Kong
Computer vision, machine learning, probabilistic inference, Bayesian nonparametrics

Jiangmiao Pang
Shanghai Artificial Intelligence Laboratory