HumanFlow -- Diffusion-Driven MAV Navigation Among Humans via Tightly-Coupled Motion Tracking, Forecasting, and Control

Date: 2026-05-25
Citations: 0
Influential: 0
AI Summary
This work addresses the difficulty of predicting human motion accurately under heavy occlusion or partial observability, which limits safe and efficient micro aerial vehicle (MAV) navigation around people. The authors propose HumanFlow, a latent diffusion model that jointly tracks and forecasts human motion conditioned on the 3D scene context. Perception and control are tightly coupled by conditioning a flow-matching-based, approximate model predictive control (MPC) policy on HumanFlow's latent representations. Reported results show that the motion model outperforms state-of-the-art methods in tracking accuracy while being significantly more efficient, and that the policy, validated in simulation with real human trajectories, remains collision-free even when the human is only partially observable.
Abstract
Robust and accurate perception of humans in their 3D scene context is essential for integrating robots into everyday environments. Existing approaches, however, often fail to predict plausible and accurate human motion estimates that are consistent with the surrounding scene, especially in the presence of heavy occlusions or partial visibility. This can limit both safety and efficiency for robotic operations. We introduce HumanFlow, a latent diffusion model that unifies human motion tracking and forecasting, conditioned on the 3D scene context. We show that our human motion model produces smooth and accurate predictions under challenging conditions, including heavy occlusions, and outperforms state-of-the-art methods in tracking accuracy while being significantly more efficient. Furthermore, we show how HumanFlow's latent space can be tightly coupled with control by conditioning a flow-matching-based, approximate MPC policy on these representations. We validate our policy in simulation with real human trajectories for MAV social navigation, demonstrating superior navigation performance and remaining collision-free, even under partial observability of the human.
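To make the phrase "flow-matching-based policy conditioned on a latent" more concrete, the following is a minimal sketch of how sampling from such a policy generally works: start from Gaussian noise and integrate a velocity field from t = 0 to t = 1 with Euler steps. It is not the authors' implementation. The names `toy_velocity` and `sample_action` are hypothetical, and the analytic velocity field is an assumption that stands in for a learned network. In this toy the "latent" is simply the target action itself, which a real policy would not have access to.

```python
import random


def toy_velocity(x, t, latent):
    """Hypothetical stand-in for a learned, latent-conditioned velocity network.

    Assumption: the target distribution is a point mass at `latent` and the
    probability path is the straight line x_t = (1 - t) * x0 + t * x1.
    Under that assumption the velocity field is (x1 - x_t) / (1 - t).
    """
    return [(goal - xi) / (1.0 - t) for xi, goal in zip(x, latent)]


def sample_action(latent, velocity_fn=toy_velocity, num_steps=8, seed=0):
    """Draw one sample by Euler-integrating dx/dt = v(x, t, latent) on [0, 1]."""
    rng = random.Random(seed)
    # Start from standard Gaussian noise with the same dimension as the latent.
    x = [rng.gauss(0.0, 1.0) for _ in latent]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so (1 - t) never hits zero
        v = velocity_fn(x, t, latent)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    # In this toy setting the sample is transported from noise onto the target.
    print(sample_action([1.0, -2.0, 0.5]))
```

In a learned policy, `velocity_fn` would be a neural network trained with a flow-matching loss, and the conditioning vector would come from the perception model rather than encoding the answer directly. The sampling loop itself keeps the same shape.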
Problem

Research questions and friction points this paper is trying to address.

human motion prediction
occlusion
3D scene context
robot navigation
partial observability
Innovation

Methods, ideas, or system contributions that make the work stand out.

latent diffusion model
human motion forecasting
tightly-coupled control
social navigation
3D scene context