AI Summary
Existing 3D human motion generation methods are largely task-specific and generalize poorly across heterogeneous interaction scenarios. To address this, we propose Uni-Inter, a unified framework that supports human-human, human-object, and human-scene interactions within a single task-agnostic architecture. Its core is the Unified Interactive Volume (UIV), a volumetric representation that encodes diverse interactive entities (human bodies, objects, scenes) into a shared spatial field. Motion generation is formulated as joint-wise probabilistic prediction over the UIV, which captures fine-grained spatial dependencies and supports relation-consistent reasoning about compound interactions. Without relying on task-specific modules, Uni-Inter achieves competitive performance on three representative interaction tasks and generalizes well to novel combinations of entities. These results suggest that a unified volumetric representation is a promising direction for 3D interactive motion generation.
Abstract
We present Uni-Inter, a unified framework for human motion generation that supports a wide range of interaction scenarios, including human-human, human-object, and human-scene, within a single, task-agnostic architecture. In contrast to existing methods that rely on task-specific designs and exhibit limited generalization, Uni-Inter introduces the Unified Interactive Volume (UIV), a volumetric representation that encodes heterogeneous interactive entities into a shared spatial field. This enables consistent relational reasoning and compound interaction modeling. Motion generation is formulated as joint-wise probabilistic prediction over the UIV, allowing the model to capture fine-grained spatial dependencies and produce coherent, context-aware behaviors. Experiments across three representative interaction tasks demonstrate that Uni-Inter achieves competitive performance and generalizes well to novel combinations of entities. These results suggest that unified modeling of compound interactions offers a promising direction for scalable motion synthesis in complex environments.
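To make the idea of a shared spatial field concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each entity is available as a set of 3D points, that the volume is a small sparse occupancy grid with one channel per entity type, and that "joint-wise probabilistic prediction" can be read as a softmax over per-voxel scores for each joint. The channel names, grid size, voxel size, and all function names are hypothetical choices made for illustration.

```python
import math
from collections import defaultdict

# Hypothetical channel names; the abstract only says the entities are heterogeneous.
CHANNELS = ("human", "object", "scene")


def voxel_index(point, origin, voxel_size, grid_dim):
    """Map a 3D point to integer voxel coordinates, or None if it lies outside the grid."""
    idx = tuple(int(math.floor((p - o) / voxel_size)) for p, o in zip(point, origin))
    if all(0 <= i < grid_dim for i in idx):
        return idx
    return None


def build_volume(entities, origin=(0.0, 0.0, 0.0), voxel_size=0.5, grid_dim=4):
    """Rasterize point sets of several entity types into one sparse occupancy volume.

    Returns a dict mapping voxel index -> set of channel names occupying that voxel.
    """
    volume = defaultdict(set)
    for channel, points in entities.items():
        if channel not in CHANNELS:
            raise ValueError(f"unknown channel: {channel}")
        for point in points:
            idx = voxel_index(point, origin, voxel_size, grid_dim)
            if idx is not None:  # points outside the grid are dropped
                volume[idx].add(channel)
    return dict(volume)


def joint_distribution(logits):
    """Softmax over per-voxel scores for one joint (assumed reading of 'joint-wise' prediction)."""
    peak = max(logits.values())  # subtract the maximum for numerical stability
    exps = {voxel: math.exp(score - peak) for voxel, score in logits.items()}
    total = sum(exps.values())
    return {voxel: e / total for voxel, e in exps.items()}


def expected_position(probs, origin=(0.0, 0.0, 0.0), voxel_size=0.5):
    """Probability-weighted mean of voxel centers, one coordinate per axis."""
    return tuple(
        sum(p * (origin[axis] + (voxel[axis] + 0.5) * voxel_size) for voxel, p in probs.items())
        for axis in range(3)
    )


# Toy usage: one human point and one object point share a voxel, which is the
# kind of co-location a shared field exposes without a task-specific module.
entities = {
    "human": [(0.1, 0.1, 0.1)],
    "object": [(0.2, 0.2, 0.2), (1.1, 0.1, 0.1)],
    "scene": [(1.9, 1.9, 0.0), (5.0, 0.0, 0.0)],  # the last point is out of bounds
}
volume = build_volume(entities)
probs = joint_distribution({(0, 0, 0): 0.0, (2, 0, 0): 0.0})
position = expected_position(probs)
```

In this sketch every entity type is written into the same grid, so adding a new combination of entities changes only the input dictionary and not the code path. A learned model would replace the hand-written scores passed to `joint_distribution` with network outputs.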