🤖 AI Summary
Existing interactive 3D scene datasets rely on labor-intensive manual annotation of part segmentation, kinematic types, and motion trajectories, limiting scale and incurring high costs. This paper introduces an end-to-end zero-shot framework that, without any training data, jointly performs movable-part detection and segmentation, joint-type classification and motion-parameter estimation, and hidden-geometry completion directly from static 3D scenes, yielding physically plausible, interactive simulation environments. The method supports export in standard formats (e.g., USD, glTF) for seamless cross-platform simulation integration. Evaluated on diverse indoor scenes, it achieves state-of-the-art performance across all core tasks: movable-part detection, part segmentation, and joint-parameter estimation. By eliminating annotation dependence and enabling fully automated, scalable construction of interactive 3D scenes, this work advances the automation of dynamic scene generation and provides foundational support for embodied AI and immersive virtual interaction.
📄 Abstract
Interactive 3D scenes are increasingly vital for embodied intelligence, yet existing datasets remain limited due to the labor-intensive process of annotating part segmentation, kinematic types, and motion trajectories. We present REACT3D, a scalable zero-shot framework that converts static 3D scenes into simulation-ready interactive replicas with consistent geometry, enabling direct use in diverse downstream tasks. Our contributions include: (i) openable-object detection and segmentation to extract candidate movable parts from static scenes, (ii) articulation estimation that infers joint types and motion parameters, (iii) hidden-geometry completion followed by interactive object assembly, and (iv) interactive scene integration in widely supported formats to ensure compatibility with standard simulation platforms. We achieve state-of-the-art performance on detection/segmentation and articulation metrics across diverse indoor scenes, demonstrating the effectiveness of our framework and providing a practical foundation for scalable interactive scene generation, thereby lowering the barrier to large-scale research on articulated scene understanding. Our project page is react3d.github.io (https://react3d.github.io/).
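To make the articulation-estimation output concrete, here is a minimal, hypothetical sketch of the kind of record such a pipeline might produce for one movable part: a joint type (revolute for hinged doors, prismatic for sliding drawers), a motion axis and pivot, and motion limits. The class and field names are illustrative assumptions, not the paper's actual data structures or API.

```python
from dataclasses import dataclass
from enum import Enum
import math

class JointType(Enum):
    REVOLUTE = "revolute"    # hinged motion, e.g. cabinet doors
    PRISMATIC = "prismatic"  # sliding motion, e.g. drawers

@dataclass
class ArticulatedPart:
    """Hypothetical per-part articulation record (illustrative only)."""
    name: str
    joint_type: JointType
    axis: tuple[float, float, float]    # unit direction of rotation/translation
    origin: tuple[float, float, float]  # pivot point in the scene frame
    limits: tuple[float, float]         # (lower, upper): radians or meters

    def joint_value(self, t: float) -> float:
        """Interpolate the joint coordinate between its limits, t in [0, 1]."""
        lo, hi = self.limits
        return lo + t * (hi - lo)

# Example: a cabinet door that swings 90 degrees about a vertical hinge.
door = ArticulatedPart(
    name="cabinet_door",
    joint_type=JointType.REVOLUTE,
    axis=(0.0, 0.0, 1.0),
    origin=(0.4, 0.0, 0.0),
    limits=(0.0, math.pi / 2),
)
print(door.joint_value(1.0))  # fully open: pi/2
```

A record like this maps naturally onto the joint schemas of standard interchange formats such as USD physics joints or glTF extensions, which is what makes export to common simulation platforms straightforward.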