AI Summary
This work addresses the scarcity of large-scale, diverse, simulation-ready 3D digital assets for robot learning by introducing ManiTwin, an end-to-end automated pipeline that generates physically plausible, semantically annotated, linguistically described, and manipulation-verified 3D object twins from a single input image. By integrating image-to-3D reconstruction, semantic and functional analysis, physics-based modeling, and manipulation feasibility verification, the pipeline enables the creation of ManiTwin-100K, a dataset of 100,000 high-quality assets. The dataset substantially advances the scale and efficiency of simulation training data generation, supporting downstream tasks such as robotic manipulation policy learning, random scene synthesis, and vision-language question answering.
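To make the four-stage pipeline concrete, here is a minimal sketch of how such stages might compose in code. Every function name, field, and return value below is an illustrative assumption; the summary does not specify ManiTwin's actual interfaces.

```python
from dataclasses import dataclass, field

# Minimal sketch of the four pipeline stages named in the summary.
# All names and placeholder values are assumptions, not ManiTwin's real API.

@dataclass
class ObjectTwin:
    mesh_path: str                                   # reconstructed 3D geometry
    semantics: dict = field(default_factory=dict)    # category, parts, description
    physics: dict = field(default_factory=dict)      # mass, friction, collision shape
    manipulation_proposals: list = field(default_factory=list)

def reconstruct_3d(image_path: str) -> str:
    """Stage 1: image-to-3D reconstruction (placeholder)."""
    return image_path.rsplit(".", 1)[0] + ".obj"

def annotate_semantics(mesh_path: str) -> dict:
    """Stage 2: semantic and functional analysis (placeholder)."""
    return {"category": "unknown", "description": "", "functions": []}

def estimate_physics(mesh_path: str, semantics: dict) -> dict:
    """Stage 3: physics-based modeling (placeholder)."""
    return {"mass_kg": 0.1, "friction": 0.5}

def verify_manipulation(twin: ObjectTwin) -> list:
    """Stage 4: keep only manipulation proposals that succeed in simulation (placeholder)."""
    return []

def build_twin(image_path: str) -> ObjectTwin:
    """Compose the stages: one input image in, one annotated object twin out."""
    mesh = reconstruct_3d(image_path)
    twin = ObjectTwin(mesh_path=mesh, semantics=annotate_semantics(mesh))
    twin.physics = estimate_physics(mesh, twin.semantics)
    twin.manipulation_proposals = verify_manipulation(twin)
    return twin
```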
Abstract
Learning in simulation provides a useful foundation for scaling robotic manipulation capabilities. However, this paradigm often suffers from a shortage of data-generation-ready digital assets, in both scale and diversity. In this work, we present ManiTwin, an automated and efficient pipeline for generating data-generation-ready digital object twins. Our pipeline transforms a single image into a simulation-ready, semantically annotated 3D asset, enabling large-scale robotic manipulation data generation. Using this pipeline, we construct ManiTwin-100K, a dataset containing 100K high-quality annotated 3D assets. Each asset is equipped with physical properties, language descriptions, functional annotations, and verified manipulation proposals. Experiments demonstrate that ManiTwin provides an efficient asset synthesis and annotation workflow, and that ManiTwin-100K offers high-quality and diverse assets for manipulation data generation, random scene synthesis, and VQA data generation, establishing a strong foundation for scalable simulation data synthesis and policy learning. Our webpage is available at https://manitwin.github.io/.
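As an illustration of the per-asset annotations the abstract lists, a record in such a dataset might look like the following. The field names and values are hypothetical, not the published ManiTwin-100K schema.

```python
# Hypothetical per-asset record covering the four annotation types listed
# above: physical properties, a language description, functional annotations,
# and verified manipulation proposals. All names and values are illustrative.
example_asset = {
    "asset_id": "mug_000123",
    "mesh": "assets/mug_000123/model.obj",
    "physics": {"mass_kg": 0.31, "friction": 0.6, "scale_m": 0.09},
    "description": "A white ceramic mug with a curved handle.",
    "functions": ["contain liquid", "grasp by handle"],
    "manipulation_proposals": [
        {
            "type": "grasp",
            # 7-DoF gripper pose: position (m) + quaternion (x, y, z, w)
            "pose": [0.0, 0.02, 0.05, 0.0, 0.0, 0.0, 1.0],
            "verified_in_sim": True,
        }
    ],
}
```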