SafeFlow: Real-Time Text-Driven Humanoid Whole-Body Control via Physics-Guided Rectified Flow and Selective Safety Gating

📅 2026-03-25
🤖 AI Summary
Existing text-driven humanoid motion generation methods often produce infeasible or unsafe trajectories because they neglect physical constraints, and the problem worsens for out-of-distribution instructions. This work proposes SafeFlow, a framework that uses physics-guided rectified flow to generate feasible motions in a VAE latent space, combined with a three-stage safety gate for hierarchical risk control. The gates act in sequence: the first rejects semantically anomalous commands using a Mahalanobis score in text-embedding space, the second filters unstable generations using a directional sensitivity discrepancy metric, and the third rejects trajectories that violate hard joint or velocity limits. Evaluated on the Unitree G1 platform, SafeFlow outperforms prior diffusion-based approaches in success rate, physical compliance, and inference speed while preserving motion diversity.
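To make the first gate concrete, here is a minimal sketch of Mahalanobis-based out-of-distribution scoring. It is an illustration under stated assumptions, not the paper's implementation: the embeddings are tiny hand-written vectors rather than real text embeddings, the covariance is assumed diagonal to keep the example dependency-free, and the names `fit_diagonal_gaussian`, `mahalanobis_score`, `is_ood`, and the threshold value are all hypothetical.

```python
import math


def fit_diagonal_gaussian(embeddings):
    """Estimate per-dimension mean and variance from in-distribution embeddings.

    Assumption: a diagonal covariance, which is a simplification of the
    full-covariance Mahalanobis distance.
    """
    n = len(embeddings)
    dim = len(embeddings[0])
    mean = [sum(e[d] for e in embeddings) / n for d in range(dim)]
    # A small constant keeps the variance strictly positive.
    var = [
        sum((e[d] - mean[d]) ** 2 for e in embeddings) / n + 1e-6
        for d in range(dim)
    ]
    return mean, var


def mahalanobis_score(x, mean, var):
    """Mahalanobis distance of x from the fitted Gaussian (diagonal case)."""
    return math.sqrt(sum((xi - m) ** 2 / v for xi, m, v in zip(x, mean, var)))


def is_ood(x, mean, var, threshold):
    """Flag a prompt embedding as out-of-distribution when its score is too high."""
    return mahalanobis_score(x, mean, var) > threshold


# Toy stand-ins for embeddings of in-distribution training prompts.
train = [[0.0, 1.0], [0.2, 0.8], [-0.2, 1.2], [0.1, 0.9], [-0.1, 1.1]]
mean, var = fit_diagonal_gaussian(train)

THRESHOLD = 3.0  # hypothetical value; in practice it would be tuned on held-out data
familiar = [0.05, 1.0]
unusual = [3.0, -2.0]
print(is_ood(familiar, mean, var, THRESHOLD), is_ood(unusual, mean, var, THRESHOLD))
```

A prompt flagged this way would be rejected before any motion is generated, which is what makes the gating selective rather than corrective.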

📝 Abstract
Recent advances in real-time interactive text-driven motion generation have enabled humanoids to perform diverse behaviors. However, kinematics-only generators often exhibit physical hallucinations, producing motion trajectories that are physically infeasible to track with a downstream motion tracking controller or unsafe for real-world deployment. These failures often arise from the lack of explicit physics-aware objectives for real-robot execution and become more severe under out-of-distribution (OOD) user inputs. Hence, we propose SafeFlow, a text-driven humanoid whole-body control framework that combines physics-guided motion generation with a 3-Stage Safety Gate driven by explicit risk indicators. SafeFlow adopts a two-level architecture. At the high level, we generate motion trajectories using Physics-Guided Rectified Flow Matching in a VAE latent space to improve real-robot executability, and further accelerate sampling via Reflow to reduce the number of function evaluations (NFE) for real-time control. The 3-Stage Safety Gate enables selective execution by detecting semantic OOD prompts using a Mahalanobis score in text-embedding space, filtering unstable generations via a directional sensitivity discrepancy metric, and enforcing final hard kinematic constraints such as joint and velocity limits before passing the generated trajectory to a low-level motion tracking controller. Extensive experiments on the Unitree G1 demonstrate that SafeFlow outperforms prior diffusion-based methods in success rate, physical compliance, and inference speed, while maintaining diverse expressiveness.
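The link between rectified flow and a low number of function evaluations can be illustrated with a small sketch. Sampling integrates a velocity field from t = 0 to t = 1, and each Euler step costs one function evaluation; when the transport paths are straight, which is what Reflow encourages, very few steps suffice. The code below is a toy under loud assumptions: `make_straight_velocity` is a hand-written field with exactly straight paths toward a fixed target, standing in for a learned, text-conditioned network over VAE latents, and all names are hypothetical.

```python
def euler_sample(velocity, x0, num_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler.

    num_steps equals the number of function evaluations (NFE).
    """
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so (1 - t) is never zero
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_straight_velocity(target):
    """Toy velocity field whose trajectories are straight lines ending at target.

    Assumption: this replaces the learned model purely for illustration.
    """
    def velocity(x, t):
        return [(ti - xi) / (1.0 - t) for ti, xi in zip(target, x)]
    return velocity


target = [1.0, -2.0, 0.5]
noise = [0.3, 0.7, -1.1]  # stand-in for a latent drawn from the prior
velocity = make_straight_velocity(target)

one_step = euler_sample(velocity, noise, num_steps=1)    # NFE = 1
eight_steps = euler_sample(velocity, noise, num_steps=8)  # NFE = 8
print(one_step, eight_steps)
```

For this idealized straight field, one step and eight steps reach the same endpoint up to floating-point error. A learned field is only approximately straight, so the step count trades sample quality against latency.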
Problem

Research questions and friction points this paper is trying to address.

text-driven motion generation
physical feasibility
safety
out-of-distribution prompts
humanoid control
Innovation

Methods, ideas, or system contributions that make the work stand out.

Physics-Guided Rectified Flow
3-Stage Safety Gate
Real-Time Whole-Body Control
Out-of-Distribution Detection
Reflow Acceleration
Hanbyel Cho
Research Scientist Intern, Meta Reality Labs
Computer Vision, Human Pose Estimation, Virtual Humans, Multimodal Learning, Generative Models
Sang-Hun Kim
Future Robot AI Group, Samsung Electronics
Jeonguk Kang
Future Robot AI Group, Samsung Electronics
Donghan Koo
Future Robot AI Group, Samsung Electronics