🤖 AI Summary
This work addresses aerial ball juggling, in which a quadrotor equipped with a racket must repeatedly hit a ball. The task demands precise timing, stable control through high-dynamic contact, and continuous adaptation to uncertainty. We present the first application of deep reinforcement learning (DRL) to this nonlinear, underactuated, contact-rich control problem. Our method combines calibrated quadrotor and ball dynamics, reward shaping, and domain randomization within a simulation-based training framework, and pairs the learned policy with a low-level controller and a lightweight communication protocol. Deployed zero-shot on real hardware, the policy averages 311 consecutive hits over 10 trials (peak: 462), far exceeding a model-based baseline that averages 3.1 hits and reaches at most 14. Without fine-tuning, the policy also generalizes to a lighter 5 g ball, sustaining an average of 145.9 hits, which indicates robustness to conditions not seen in training.
📝 Abstract
Aerial robots interacting with objects must perform precise, contact-rich maneuvers under uncertainty. In this paper, we study the problem of aerial ball juggling using a quadrotor equipped with a racket, a task that demands accurate timing, stable control, and continuous adaptation. We propose JuggleRL, the first reinforcement learning-based system for aerial juggling. It learns closed-loop policies in large-scale simulation using systematic calibration of quadrotor and ball dynamics to reduce the sim-to-real gap. The training incorporates reward shaping to encourage racket-centered hits and sustained juggling, as well as domain randomization over ball position and coefficient of restitution to enhance robustness and transferability. The learned policy outputs mid-level commands executed by a low-level controller and is deployed zero-shot on real hardware, where an enhanced perception module with a lightweight communication protocol reduces delays in high-frequency state estimation and ensures real-time control. Experiments show that JuggleRL achieves an average of $311$ hits over $10$ consecutive trials in the real world, with a maximum of $462$ hits observed, far exceeding a model-based baseline that reaches at most $14$ hits with an average of $3.1$. Moreover, the policy generalizes to unseen conditions, successfully juggling a lighter $5$ g ball with an average of $145.9$ hits. This work demonstrates that reinforcement learning can empower aerial robots with robust and stable control in dynamic interaction tasks.
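The abstract names three training ingredients: randomized ball position, randomized coefficient of restitution, and a reward that favors racket-centered hits. The following Python sketch illustrates how these pieces might fit together. It is not the authors' implementation: every function name, range, and constant below (`sample_episode_params`, `pos_range`, `restitution_range`, `racket_radius`, and so on) is a hypothetical choice made for illustration, and the impact model assumes a racket much heavier than the ball so that only the relative normal velocity is reversed and scaled.

```python
import random
from dataclasses import dataclass


@dataclass
class EpisodeParams:
    """Per-episode randomized quantities (hypothetical names and units)."""
    ball_x: float        # initial ball offset from racket centre, metres
    ball_y: float        # initial ball offset from racket centre, metres
    restitution: float   # coefficient of restitution, dimensionless


def sample_episode_params(rng, pos_range=0.1, restitution_range=(0.7, 0.9)):
    """Draw one set of randomized parameters; the ranges are assumed values."""
    low, high = restitution_range
    return EpisodeParams(
        ball_x=rng.uniform(-pos_range, pos_range),
        ball_y=rng.uniform(-pos_range, pos_range),
        restitution=rng.uniform(low, high),
    )


def post_impact_velocity(ball_vz, racket_vz, restitution):
    """Ball velocity along the racket normal just after impact.

    Assumes the racket is much heavier than the ball, so the racket velocity
    is unchanged and the relative velocity is reversed and scaled by the
    coefficient of restitution.
    """
    return racket_vz - restitution * (ball_vz - racket_vz)


def hit_reward(offset_x, offset_y, racket_radius=0.1, hit_bonus=1.0):
    """Reward a hit more the closer the contact point is to the racket centre."""
    dist = (offset_x ** 2 + offset_y ** 2) ** 0.5
    if dist > racket_radius:
        return 0.0  # contact outside the racket face counts as a miss
    return hit_bonus * (1.0 - dist / racket_radius)


if __name__ == "__main__":
    rng = random.Random(0)
    params = sample_episode_params(rng)
    # Ball falling at 3 m/s onto a racket rising at 1 m/s.
    v_after = post_impact_velocity(-3.0, 1.0, params.restitution)
    print(params, v_after, hit_reward(0.02, 0.0))
```

Sampling a fresh `EpisodeParams` at every reset exposes the policy to a range of bounce behaviors during training, which is the usual motivation for domain randomization when the real ball's restitution is only approximately known.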