HUMOF: Human Motion Forecasting in Interactive Social Scenes

📅 2025-06-04
🤖 AI Summary
Human motion prediction in complex social scenes suffers from high uncertainty and poor long-term trajectory accuracy because of multi-scale interactions, both inter-personal and person–environment. To address this, we propose a hierarchical interaction modeling framework that jointly exploits the spatial and spectral domains. Our method introduces a hierarchical graph neural network coupled with a multi-scale interaction attention mechanism, integrating structural spatial topology with dynamic frequency-domain features. Additionally, we design a coarse-to-fine interaction reasoning module that enables progressive decoding, from global social context down to fine-grained motion details. Evaluated on four standard benchmarks, our approach achieves state-of-the-art performance: average prediction errors over 1–3 seconds are significantly reduced, and the average displacement error (ADE) drops by 18.7% in high-density scenarios. The framework substantially improves both long-term prediction accuracy and robustness under complex social dynamics.
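For readers unfamiliar with the metric named above, ADE is conventionally the mean Euclidean distance between predicted and ground-truth positions over the prediction horizon. The sketch below shows that standard definition only; the function name is hypothetical, and how the paper aggregates over joints, people, or datasets is not stated here.

```python
import math

def average_displacement_error(pred, target):
    """Mean Euclidean distance between matching predicted and true positions.

    `pred` and `target` are equal-length sequences of coordinate tuples,
    one per future time step (hypothetical layout, chosen for illustration).
    """
    if not pred or len(pred) != len(target):
        raise ValueError("pred and target must be non-empty and equal in length")
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)

# Two future steps: the first is off by a 3-4-5 triangle, the second is exact.
predicted = [(0.0, 0.0), (1.0, 0.0)]
ground_truth = [(3.0, 4.0), (1.0, 0.0)]
ade = average_displacement_error(predicted, ground_truth)  # distances 5.0 and 0.0, mean 2.5
```

A relative reduction such as the one quoted above would then be computed as `(baseline_ade - new_ade) / baseline_ade`.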

📝 Abstract
Complex scenes present significant challenges for predicting human behaviour due to the abundance of interaction information, such as human-human and human-environment interactions. These factors complicate the analysis and understanding of human behaviour, thereby increasing the uncertainty in forecasting human motions. Existing motion prediction methods thus struggle in these complex scenarios. In this paper, we propose an effective method for human motion forecasting in interactive scenes. To achieve a comprehensive representation of interactions, we design a hierarchical interaction feature representation so that high-level features capture the overall context of the interactions, while low-level features focus on fine-grained details. In addition, we propose a coarse-to-fine interaction reasoning module that leverages both spatial and frequency perspectives to efficiently utilize hierarchical features, thereby enhancing the accuracy of motion predictions. Our method achieves state-of-the-art performance across four public datasets. Code will be released when this paper is published.
Problem

Research questions and friction points this paper is trying to address.

Predicting human motion in complex social scenes
Modeling human-human and human-environment interactions
Improving accuracy of motion forecasting methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical interaction feature representation
Coarse-to-fine interaction reasoning module
Leverages spatial and frequency perspectives
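To make the "frequency perspective" and the coarse-to-fine idea more concrete, here is a minimal sketch that moves a single joint coordinate over time into the frequency domain with an orthonormal DCT-II and keeps only the lowest coefficients to obtain a coarse motion. The DCT is an assumption chosen for illustration (the summary does not say which transform the paper uses), and all function names are hypothetical; this is not the authors' module.

```python
import math

def _scale(k, n):
    # Orthonormal DCT-II scaling factor for coefficient k of an n-point signal.
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

def dct(signal):
    """Orthonormal DCT-II of a 1-D sequence (time domain -> frequency domain)."""
    n = len(signal)
    return [
        _scale(k, n) * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                           for i, x in enumerate(signal))
        for k in range(n)
    ]

def idct(coeffs):
    """Inverse of `dct` (frequency domain -> time domain)."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs))
        for i in range(n)
    ]

def low_pass(signal, keep):
    """Coarse version of `signal`: keep the first `keep` DCT coefficients, zero the rest."""
    coeffs = dct(signal)
    truncated = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    return idct(truncated)

# One coordinate of one joint across five frames (made-up values).
trajectory = [0.0, 1.0, 4.0, 9.0, 16.0]
coarse = low_pass(trajectory, keep=2)   # overall trend only
fine = low_pass(trajectory, keep=5)     # all coefficients, recovers the input
```

Raising `keep` step by step adds higher-frequency detail on top of the coarse trend, which is one simple way to picture progressing from overall context to fine-grained motion.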