AI Summary
This work addresses the challenge of transforming static 3D meshes into interactive, simulation-ready digital twins, a task often hindered by kinematic hallucinations and self-collisions because existing approaches neglect physical constraints. To overcome this, the authors propose MotionAnymesh, a zero-shot automated framework that integrates SP4D physical priors with vision-language models to achieve kinematics-aware part segmentation and joint-type recognition. Building on this semantic understanding, the method enforces geometric and physical constraints to initialize joint parameters and optimize motion trajectories, so that the resulting articulation is collision-free and dynamically feasible. Experimental evaluations show that MotionAnymesh significantly outperforms current methods in both geometric accuracy and physical plausibility, enabling efficient generation of high-fidelity, simulation-ready digital twin assets.
Abstract
Converting static 3D meshes into interactable articulated assets is crucial for embodied AI and robotic simulation. However, existing zero-shot pipelines struggle with complex assets due to a critical lack of physical grounding. Specifically, ungrounded Vision-Language Models (VLMs) frequently suffer from kinematic hallucinations, while unconstrained joint estimation inevitably leads to catastrophic mesh inter-penetration during physical simulation. To bridge this gap, we propose MotionAnymesh, an automated zero-shot framework that seamlessly transforms unstructured static meshes into simulation-ready digital twins. Our method features a kinematic-aware part segmentation module that grounds VLM reasoning with explicit SP4D physical priors, effectively eradicating kinematic hallucinations. Furthermore, we introduce a geometry-physics joint estimation pipeline that combines robust type-aware initialization with physics-constrained trajectory optimization to rigorously guarantee collision-free articulation. Extensive experiments demonstrate that MotionAnymesh significantly outperforms state-of-the-art baselines in both geometric precision and dynamic physical executability, providing highly reliable assets for downstream applications.
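To make the notion of collision-free articulation concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a single revolute joint, approximates the static body by an axis-aligned bounding box, and tests only the moving part's vertices at sampled angles. All function names and parameters here (`rotate_about_axis`, `inside_box`, `max_collision_free_angle`, `steps`) are hypothetical; a real pipeline would use mesh-level collision queries and an actual optimizer rather than a linear sweep.

```python
import math


def rotate_about_axis(point, origin, axis, angle):
    """Rotate `point` by `angle` radians about the line through `origin`
    along the unit vector `axis`, using Rodrigues' rotation formula."""
    v = tuple(point[i] - origin[i] for i in range(3))
    k = axis
    c, s = math.cos(angle), math.sin(angle)
    dot = sum(k[i] * v[i] for i in range(3))
    cross = (
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    )
    return tuple(
        origin[i] + v[i] * c + cross[i] * s + k[i] * dot * (1.0 - c)
        for i in range(3)
    )


def inside_box(p, box_min, box_max, eps=1e-9):
    """True if `p` lies strictly inside the box (touching a face does not count)."""
    return all(box_min[i] + eps < p[i] < box_max[i] - eps for i in range(3))


def max_collision_free_angle(vertices, origin, axis, box_min, box_max,
                             limit, steps=90):
    """Sweep the joint from 0 to `limit` radians in `steps` increments and
    return the largest sampled angle reached before any vertex enters the box.
    This is a coarse stand-in for a physics-constrained motion-range search."""
    safe = 0.0
    for i in range(1, steps + 1):
        angle = limit * i / steps
        moved = (rotate_about_axis(v, origin, axis, angle) for v in vertices)
        if any(inside_box(p, box_min, box_max) for p in moved):
            break
        safe = angle
    return safe


# Hypothetical example: a door edge hinged on the z-axis, swinging toward
# a box-shaped obstacle that stands in for the static body.
door = [(1.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
limit = max_collision_free_angle(
    door,
    origin=(0.0, 0.0, 0.0),
    axis=(0.0, 0.0, 1.0),
    box_min=(-0.5, 0.5, -1.0),
    box_max=(0.5, 1.5, 2.0),
    limit=math.pi,
    steps=180,
)
print(math.degrees(limit))
```

In this toy setup the sweep stops at the last sampled angle before a door vertex enters the box, which is the kind of joint limit an unconstrained estimator would miss. Vertex sampling can overlook penetrations by edges or faces, so it should be read as an illustration of the constraint rather than a reliable collision test.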