APPLV: Adaptive Planner Parameter Learning from Vision-Language-Action Model

📅 2026-03-09
📈 Citations: 0
Influential Citations: 0
🤖 AI Summary
This work addresses autonomous navigation for mobile robots in highly constrained environments, where classical methods rely heavily on manual parameter tuning and end-to-end learning struggles to balance accuracy with generalization. The authors propose a paradigm that augments a pre-trained vision-language model with a regression head to adaptively predict parameters for a classical motion planner, rather than directly outputting control actions, and fine-tunes the system first with supervised learning on collected trajectories and then with reinforcement learning. The paper presents this as the first use of a vision-language-action framework to dynamically configure planner parameters, improving both control accuracy and cross-environment generalization while preserving the safety of the underlying classical planner. Experiments show consistent gains over existing methods on both the BARN simulation benchmark and real-world robotic platforms.
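To make the core idea concrete, here is a minimal PyTorch-style sketch of the architecture the summary describes: a vision-language backbone whose pooled features feed a small regression head that outputs bounded planner parameters (for instance, a maximum velocity and an inflation radius for a DWA-style local planner). The class name, head width, and parameter set are hypothetical illustrations, not the authors' implementation.

```python
import torch
import torch.nn as nn

class PlannerParamHead(nn.Module):
    """Regression head mapping VLM features to classical-planner parameters.

    Hypothetical sketch: the paper attaches a regression head to a
    pre-trained vision-language model; the backbone, head width, and
    exact parameter set are assumptions here.
    """

    def __init__(self, feat_dim: int, param_bounds: torch.Tensor):
        super().__init__()
        # param_bounds: (num_params, 2) tensor of [low, high] per parameter,
        # e.g. max velocity, inflation radius, sampling resolution.
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 256),
            nn.ReLU(),
            nn.Linear(256, param_bounds.shape[0]),
        )
        self.register_buffer("low", param_bounds[:, 0].clone())
        self.register_buffer("high", param_bounds[:, 1].clone())

    def forward(self, vlm_features: torch.Tensor) -> torch.Tensor:
        # Squash raw outputs into each parameter's valid range so the
        # classical planner always receives feasible values.
        raw = torch.sigmoid(self.mlp(vlm_features))
        return self.low + raw * (self.high - self.low)

# Example: predict [max_vel, inflation_radius] from pooled VLM features.
bounds = torch.tensor([[0.1, 2.0], [0.05, 0.6]])
head = PlannerParamHead(feat_dim=768, param_bounds=bounds)
params = head(torch.randn(1, 768))  # feasible planner parameters, shape (1, 2)
```

At run time, the predicted vector would presumably be written into the planner's configuration before each replanning cycle, so the classical planner, not the network, still produces the actual control commands and retains its safety behavior.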

📝 Abstract
Autonomous navigation in highly constrained environments remains challenging for mobile robots. Classical navigation approaches offer safety assurances but require environment-specific parameter tuning; end-to-end learning bypasses parameter tuning but struggles with precise control in constrained spaces. To this end, recent robot learning approaches automate parameter tuning while retaining classical systems' safety, yet still face challenges in generalizing to unseen environments. Recently, Vision-Language-Action (VLA) models have shown promise by leveraging foundation models' scene understanding capabilities, but still struggle with precise control and inference latency in navigation tasks. In this paper, we propose Adaptive Planner Parameter Learning from Vision-Language-Action Model (APPLV). Unlike traditional VLA models that directly output actions, APPLV leverages pre-trained vision-language models with a regression head to predict planner parameters that configure classical planners. We develop two training strategies: supervised learning fine-tuning from collected navigation trajectories and reinforcement learning fine-tuning to further optimize navigation performance. We evaluate APPLV across multiple motion planners on the simulated Benchmark for Autonomous Robot Navigation (BARN) dataset and in physical robot experiments. Results demonstrate that APPLV outperforms existing methods in both navigation performance and generalization to unseen environments.
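The abstract's two training strategies can be read as two update rules: a supervised regression loss against parameters recorded in collected trajectories, followed by a policy-gradient step that rewards fast, collision-free navigation. The sketch below is an assumed reconstruction, not the paper's code: `model`, `rollout_reward`, the Gaussian exploration noise, and the use of vanilla REINFORCE are all illustrative choices.

```python
import torch

def supervised_step(model, optimizer, obs, target_params):
    """Stage 1: regress planner parameters recorded along demo trajectories."""
    pred = model(obs)  # (batch, num_params)
    loss = torch.nn.functional.mse_loss(pred, target_params)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def rl_step(model, optimizer, obs, rollout_reward, noise_std=0.05):
    """Stage 2: REINFORCE-style fine-tuning on navigation reward.

    `rollout_reward(params)` is an assumed callback that runs the classical
    planner with the sampled parameters and returns a (batch,) reward
    tensor, e.g. +1 for reaching the goal, penalized by traversal time.
    """
    mean = model(obs)
    dist = torch.distributions.Normal(mean, noise_std)
    params = dist.sample()           # explore around the predicted parameters
    reward = rollout_reward(params)  # no gradient flows through the rollout
    loss = -(dist.log_prob(params).sum(dim=-1) * reward).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward.mean().item()
```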
Problem

Research questions and friction points this paper is trying to address.

autonomous navigation
constrained environments
parameter tuning
generalization
precise control
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision-Language-Action model
adaptive planner parameter learning
classical motion planning
generalization in navigation
foundation model for robotics
👥 Authors
Yuanjie Lu
Department of Computer Science, George Mason University, Virginia, USA
Beichen Wang
PhD Candidate at Wageningen University & Research
Natural Language Processing, Information Retrieval, Complex Network
Zhengqi Wu
Department of Engineering Science, University of South Florida, Florida, USA
Yang Li
Department of Computer Science, Rutgers University, New Jersey, USA
Xiaomin Lin
Assistant Professor, University of South Florida
AI for good, Robotics for science, Robotics for good
Chengzhi Mao
Assistant Professor, Rutgers University
LLM, Computer Vision, Machine Learning
Xuesu Xiao
Department of Computer Science, George Mason University, Virginia, USA