Bi-Level Reinforcement Learning Control for an Underactuated Blimp via Center-of-Mass Reconfiguration

📅 2026-05-02
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses target tracking for an underactuated blimp equipped with only two thrusters and a movable internal slider, a configuration with few control degrees of freedom and strong nonlinear coupling between center-of-mass (CoM) dynamics and vehicle motion. The authors propose a bi-level reinforcement learning framework that explicitly decouples CoM reconfiguration from thrust control: an outer policy selects a target-dependent CoM configuration before flight, while an inner policy generates thrust commands to track straight-line references. Training follows a two-stage strategy, supported by a convergence analysis of the resulting bi-level process. In simulations and real-world experiments on a 27-goal evaluation set, the method consistently outperforms fixed-CoM baselines and PID-based controllers, achieving higher tracking accuracy, enhanced robustness, and reliable sim-to-real transfer.
📝 Abstract
This paper investigates goal-directed tracking control of underactuated blimps with center-of-mass (CoM) reconfiguration. Unlike conventional overactuated blimp designs that rely on redundant actuation for simplified control, this paper focuses on a compact architecture consisting of two thrusters and a movable internal slider, aiming to improve energy efficiency and payload capacity. This hardware-efficient configuration introduces significant underactuation and strong nonlinear coupling between CoM dynamics and vehicle motion. To address these challenges, this paper proposes a bi-level reinforcement learning framework that explicitly decouples task-level CoM planning from continuous thrust control. The outer policy determines a target-dependent CoM configuration prior to flight, while the inner policy generates thrust commands to track straight-line references. To ensure stable learning, this paper introduces a two-stage learning strategy, supported by a convergence analysis of the resulting bi-level process. Extensive simulations and real-world experiments on a 27-goal evaluation set demonstrate that the proposed method consistently outperforms fixed-CoM baselines and PID-based controllers, achieving higher tracking accuracy, enhanced robustness, and reliable sim-to-real transfer.
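The bi-level structure described in the abstract can be pictured with a minimal control-flow sketch. Everything below is an illustrative assumption rather than the authors' implementation: `outer_policy` and `inner_policy` are hand-written stand-ins for the learned policies, the normalized slider range, the thrust range, and the `dynamics` callback are hypothetical, and no blimp model is included. The only point is the ordering: the CoM setting is chosen once per goal before flight, and thrust commands are then produced step by step against a straight-line reference.

```python
import math


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def outer_policy(goal):
    """Stand-in for the learned outer policy (assumption, not the paper's).

    Maps a goal (x, y, z) to one normalized slider position in [-1, 1],
    here simply proportional to the elevation angle of the goal.
    """
    x, y, z = goal
    elevation = math.atan2(z, math.hypot(x, y))
    return clamp(elevation / (math.pi / 2), -1.0, 1.0)


def straight_line_reference(start, goal, s):
    """Point at fraction s in [0, 1] along the segment from start to goal."""
    return tuple(a + s * (b - a) for a, b in zip(start, goal))


def inner_policy(position, reference):
    """Stand-in for the learned inner policy (assumption, not the paper's).

    Returns (left, right) thrust commands in [0, 1] from the planar
    tracking error; a trained policy would replace this heuristic.
    """
    ex = reference[0] - position[0]
    ey = reference[1] - position[1]
    forward = clamp(math.hypot(ex, ey), 0.0, 1.0)
    turn = clamp(ey, -0.5, 0.5)
    return clamp(forward - turn, 0.0, 1.0), clamp(forward + turn, 0.0, 1.0)


def run_episode(goal, dynamics, steps=50, start=(0.0, 0.0, 0.0)):
    """Run one goal-directed episode with a user-supplied dynamics callback.

    dynamics(position, slider, thrust) -> next position is hypothetical;
    a simulator or the real vehicle would sit behind it.
    """
    slider = outer_policy(goal)  # outer level: fixed before flight
    position = start
    log = []
    for k in range(1, steps + 1):
        reference = straight_line_reference(start, goal, k / steps)
        thrust = inner_policy(position, reference)  # inner level: every step
        position = dynamics(position, slider, thrust)
        log.append((slider, thrust))
    return slider, position, log
```

Keeping the slider out of the per-step loop is what separates task-level CoM planning from continuous thrust control in this sketch; whether the inner policy also observes the CoM setting is not stated in the abstract and is left out here.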
Problem

Research questions and friction points this paper is trying to address.

underactuated blimp
center-of-mass reconfiguration
goal-directed tracking control
nonlinear coupling
hardware-efficient configuration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bi-Level Reinforcement Learning
Underactuated Blimp
Center-of-Mass Reconfiguration
Nonlinear Coupling
Sim-to-Real Transfer
Xiaorui Wang
Professor of Computer Engineering, The Ohio State University
Power Management, Data Centers, Real-Time Embedded Systems, Computer Architecture, Computer Systems
Hongwu Wang
Robotics and Control Laboratory, School of Advanced Manufacturing and Robotics, and the State Key Laboratory of Turbulence and Complex Systems, Peking University, Beijing, 100871, China
Yue Fan
Robotics and Control Laboratory, School of Advanced Manufacturing and Robotics, and the State Key Laboratory of Turbulence and Complex Systems, Peking University, Beijing, 100871, China
Hao Cheng
Tsinghua University
Safety Evaluation of AVs
Feitian Zhang
Associate Professor, Peking University
Underwater Vehicles, Aerial Vehicles, Bioinspired Robotics, Control Systems, Artificial Intelligence