PhyGenHOI: Physically-Aware 4D Generation of Dynamic Human-Object Interactions

📅 2026-05-28
🤖 AI Summary
This work proposes a method for generating physically plausible and visually realistic 4D human-object interaction scenes. To address the limited temporal consistency, momentum transfer, and contact fidelity of existing approaches, the method couples a semantic human agent with a physical object agent: human motion is driven by a Motion Diffusion Model (MDM), object dynamics are simulated with the Material Point Method (MPM), and both share a unified, differentiable 3D Gaussian Splatting representation. Text-driven interactions such as punching or kicking are kept temporally synchronized and physically consistent through three mechanisms: a windowed attraction loss, contact-driven re-simulation, and a masked video Score Distillation Sampling (SDS) objective. Experiments show that the approach outperforms baselines across diverse actions, humans, and objects.
📝 Abstract
We address the task of generating physically accurate and visually faithful 4D Human-Object Interaction (HOI). Given a static 3D human and target object represented as 3D Gaussian Splats (3DGS), our goal is to synthesize dynamic scenes where the human actively engages with the object through actions, such as punching or kicking, in accordance with a given input text. To this end, we introduce PhyGenHOI, a novel framework that couples generative human motion with an explicit physical object simulation. We model the human as a semantic agent driven by a Motion Diffusion Model (MDM) and the object as a physical agent simulated via the Material Point Method (MPM), utilizing 3D Gaussians as a unified, differentiable representation. We supervise their interaction through three coupled mechanisms: (1) A Windowed Attraction Loss that temporally synchronizes generative motion to intercept the object; (2) A Contact-Driven Re-simulation step that triggers physically consistent momentum transfer upon impact; and (3) A Masked Video-SDS objective that injects video-based priors to enhance contact fidelity. Experiments show PhyGenHOI generates physically consistent 4D HOI across diverse actions, humans, and objects, outperforming baselines. Project page and videos: https://omerbenishu.github.io/PhyGenHOI/
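To make the first mechanism concrete, here is a minimal sketch of what a windowed attraction term could look like. It is not the paper's formulation, which the abstract does not spell out: the function name `windowed_attraction_loss`, the mean-squared-distance form, the static target position, and the frame-window convention are all assumptions chosen for illustration.

```python
def windowed_attraction_loss(effector_traj, object_pos, window):
    """Mean squared distance between an end effector and a target over a frame window.

    All names and the loss form are hypothetical (assumed for illustration):
    effector_traj: list of (x, y, z) positions, one per frame (e.g. a fist or foot).
    object_pos:    (x, y, z) target position, assumed static before contact.
    window:        (start, end) frame indices, end exclusive.
    """
    start, end = window
    if not 0 <= start < end <= len(effector_traj):
        raise ValueError("window must lie inside the trajectory")

    total = 0.0
    for frame in effector_traj[start:end]:
        # Squared Euclidean distance from the effector to the target in this frame.
        total += sum((p - q) ** 2 for p, q in zip(frame, object_pos))

    # Averaging over the window only penalizes frames where contact is expected;
    # frames outside the window are left unconstrained.
    return total / (end - start)


# Usage: an effector moving along x toward an object at x = 2.
trajectory = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
loss = windowed_attraction_loss(trajectory, (2, 0, 0), window=(1, 3))
print(loss)  # 0.5
```

Restricting the penalty to a window is one plausible way to pull the generated motion toward the object around the intended contact time while leaving the rest of the motion unconstrained. In a differentiable pipeline the same quantity would be computed on tensors so that gradients can flow back to the motion parameters.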
Problem

Research questions and friction points this paper is trying to address.

4D Human-Object Interaction
Physically Accurate Generation
Dynamic Scene Synthesis
Text-to-Motion
Physical Simulation
Innovation

Methods, ideas, or system contributions that make the work stand out.

4D Human-Object Interaction
Physical Simulation
Motion Diffusion Model
Material Point Method
Differentiable Rendering