InterAct: Advancing Large-Scale Versatile 3D Human-Object Interaction Generation

📅 2025-09-11
🤖 AI Summary
Current 3D human-object interaction (HOI) generation is held back by dataset limitations: existing datasets often lack extensive, high-quality motion and annotation, and exhibit artifacts such as contact penetration, floating, and incorrect hand motions. To address these issues, we introduce InterAct, a large-scale 3D HOI benchmark. First, we consolidate and standardize 21.81 hours of HOI data from diverse sources and enrich it with detailed textual annotations. Second, we propose a unified optimization framework that reduces artifacts and corrects hand motions; applying the principle of contact invariance, we preserve human-object relationships while introducing motion variations, which expands the dataset to 30.70 hours. Third, we define six benchmarking tasks and develop a unified HOI generative modeling perspective, achieving state-of-the-art performance. The dataset is publicly available and will be actively maintained.

📝 Abstract
While large-scale human motion capture datasets have advanced human motion generation, modeling and generating dynamic 3D human-object interactions (HOIs) remain challenging due to dataset limitations. Existing datasets often lack extensive, high-quality motion and annotation and exhibit artifacts such as contact penetration, floating, and incorrect hand motions. To address these issues, we introduce InterAct, a large-scale 3D HOI benchmark featuring dataset and methodological advancements. First, we consolidate and standardize 21.81 hours of HOI data from diverse sources, enriching it with detailed textual annotations. Second, we propose a unified optimization framework to enhance data quality by reducing artifacts and correcting hand motions. Leveraging the principle of contact invariance, we maintain human-object relationships while introducing motion variations, expanding the dataset to 30.70 hours. Third, we define six benchmarking tasks and develop a unified HOI generative modeling perspective, achieving state-of-the-art performance. Extensive experiments validate the utility of our dataset as a foundational resource for advancing 3D human-object interaction generation. To support continued research in this area, the dataset is publicly available at https://github.com/wzyabcas/InterAct, and will be actively maintained.
Problem

Research questions and friction points this paper is trying to address.

Addressing dataset limitations in 3D human-object interaction generation
Reducing artifacts like contact penetration and incorrect hand motions
Creating a large-scale benchmark with a unified optimization framework
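
To make the artifact terms above concrete, here is a minimal sketch of how penetration and floating could be flagged per frame. It is an illustration only, not the paper's optimization framework: the function name `classify_frames`, the input layout, and the 1 cm tolerance are all assumptions. It also assumes that signed distances from sampled body points to the object surface have already been computed, and that every frame passed in is one where contact is expected.

```python
from typing import List


def classify_frames(signed_dists: List[List[float]],
                    contact_tol: float = 0.01) -> List[str]:
    """Label each frame as 'penetration', 'floating', or 'contact'.

    signed_dists[t] holds hypothetical signed distances (in metres) from
    sampled body points to the object surface at frame t. Negative values
    mean the point lies inside the object. contact_tol is an assumed
    tolerance, not a value taken from the paper.
    """
    labels = []
    for dists in signed_dists:
        closest = min(dists)  # the body point nearest to (or deepest in) the object
        if closest < -contact_tol:
            labels.append("penetration")  # a body point is inside the object
        elif closest > contact_tol:
            labels.append("floating")     # nothing touches the object
        else:
            labels.append("contact")      # nearest point sits on the surface
    return labels


# Three frames: one interpenetrating, one with a gap, one in plausible contact.
frames = [[0.20, -0.05], [0.30, 0.08], [0.005, 0.40]]
print(classify_frames(frames))
```

A per-frame label like this only detects the artifacts; repairing them, as the abstract describes, would require optimizing the motion itself.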
Innovation

Methods, ideas, or system contributions that make the work stand out.

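Consolidated and standardized HOI data from diverse sources, enriched with detailed textual annotations
Unified optimization framework that reduces artifacts and corrects hand motions
Defined six benchmarking tasks and a unified HOI generative modeling perspective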
Authors

Sirui Xu
University of Illinois at Urbana-Champaign
Computer Vision, Machine Learning, Virtual Humans, Character Animation, Human-Object Interaction
Dongting Li
University of Illinois Urbana-Champaign
Yucheng Zhang
Purdue University
Knowledge Graph, Large Language Models
Xiyan Xu
University of Illinois Urbana-Champaign
Qi Long
Professor, University of Pennsylvania
Data Science, Biostatistics, Machine Learning, Artificial Intelligence
Ziyin Wang
University of Illinois Urbana-Champaign
Yunzhi Lu
University of Illinois Urbana-Champaign
Shuchang Dong
University of Illinois Urbana-Champaign
Hezi Jiang
University of Illinois Urbana-Champaign
Akshat Gupta
UC Berkeley
Knowledge Editing, Natural Language Processing, Spoken Language Modeling
Yu-Xiong Wang
University of Illinois Urbana-Champaign
Liang-Yan Gui
University of Illinois Urbana-Champaign