VertiCoder: Self-Supervised Kinodynamic Representation Learning on Vertically Challenging Terrain

📅 2024-09-17
📈 Citations: 1
✨ Influential: 0
🤖 AI Summary
To address representation learning for robot mobility on vertically challenging terrain, this paper proposes VertiCoder, a self-supervised, unified representation framework for kinodynamic tasks. VertiCoder uses a Transformer encoder and is pre-trained without labels through random masking and next-patch reconstruction. A single learned representation supports four downstream tasks: forward kinodynamics learning, inverse kinodynamics learning, behavior cloning, and patch reconstruction. Compared with specialized end-to-end models, VertiCoder achieves better performance on all four tasks with 77% fewer parameters. In real-world robot deployment, it performs comparably to state-of-the-art kinodynamic modeling and planning approaches. These results suggest that the unified representation mitigates overfitting and generalizes more robustly across environments and downstream tasks.

📝 Abstract
We present VertiCoder, a self-supervised representation learning approach for robot mobility on vertically challenging terrain. Using the same pre-training process, VertiCoder can handle four different downstream tasks, including forward kinodynamics learning, inverse kinodynamics learning, behavior cloning, and patch reconstruction with a single representation. VertiCoder uses a TransformerEncoder to learn the local context of its surroundings by random masking and next patch reconstruction. We show that VertiCoder achieves better performance across all four different tasks compared to specialized End-to-End models with 77% fewer parameters. We also show VertiCoder's comparable performance against state-of-the-art kinodynamic modeling and planning approaches in real-world robot deployment. These results underscore the efficacy of VertiCoder in mitigating overfitting and fostering more robust generalization across diverse environmental contexts and downstream vehicle kinodynamic tasks.
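The pre-training objective described above (random masking plus next patch reconstruction) can be illustrated with a minimal data-preparation sketch. This is not the authors' implementation: the function name `make_pretraining_example`, the `mask_ratio` parameter, and the use of `None` as a mask placeholder are assumptions made for illustration only, and the encoder itself is omitted.

```python
import random

MASK = None  # hypothetical placeholder standing in for a learned mask token


def make_pretraining_example(patches, mask_ratio=0.5, seed=0):
    """Build one (inputs, masked_positions, target) triple from a patch sequence.

    patches: terrain patches in temporal order (any Python objects).
    The last patch is held out as the next-patch reconstruction target,
    and a random subset of the remaining context patches is masked.
    """
    if len(patches) < 2:
        raise ValueError("need at least one context patch and one target patch")
    rng = random.Random(seed)
    context, target = patches[:-1], patches[-1]
    num_masked = int(round(mask_ratio * len(context)))
    masked_positions = sorted(rng.sample(range(len(context)), num_masked))
    # Replace the chosen context patches with the mask placeholder.
    inputs = [MASK if i in masked_positions else p for i, p in enumerate(context)]
    return inputs, masked_positions, target
```

In this sketch a model would receive `inputs` and be trained to predict `target`; how the masked positions enter the loss is left open because the abstract does not specify it.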
Problem

Research questions and friction points this paper is trying to address.

Self-supervised learning for robot mobility on vertical terrain
Handles multiple tasks with a single representation
Improves performance with fewer parameters and robust generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised learning for robot mobility
TransformerEncoder for local context learning
Single representation for multiple downstream tasks
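
The "single representation, multiple tasks" idea can be sketched as one shared encoder output feeding several lightweight task heads. Everything in this toy example is an assumption for illustration: `encode` is a stand-in that averages features rather than a Transformer encoder, the head weights are arbitrary rather than learned, and real heads would likely take extra inputs such as actions or poses.

```python
def encode(context):
    """Toy stand-in for the pretrained encoder: mean of each feature."""
    return [sum(column) / len(context) for column in zip(*context)]


def make_linear_head(weights):
    """Return a head computing weights @ z for a list-of-rows weight matrix."""
    def head(z):
        return [sum(w * x for w, x in zip(row, z)) for row in weights]
    return head


# Four hypothetical heads named after the downstream tasks in the abstract.
# The weights are arbitrary illustration values, not learned parameters.
heads = {
    "forward_kinodynamics": make_linear_head([[1.0, 0.0], [0.0, 1.0]]),
    "inverse_kinodynamics": make_linear_head([[0.5, 0.5]]),
    "behavior_cloning": make_linear_head([[1.0, -1.0]]),
    "patch_reconstruction": make_linear_head([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]),
}


def run_all_tasks(context):
    z = encode(context)  # computed once, shared by every head
    return {name: head(z) for name, head in heads.items()}
```

The design point is that the representation `z` is computed once and reused, so only the small heads differ between tasks.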