🤖 AI Summary
Current autonomous driving simulators predominantly rely on rigid-body vehicle models, which fail to capture component-level dynamic behaviors such as wheel steering and door articulation. This work proposes a method to generate animatable 3D Gaussian vehicle models from either a single image or sparse multi-view inputs. By introducing a part-specific Gaussian assignment mechanism coupled with a kinematic reasoning module, the approach achieves, for the first time within a generative 3D Gaussian framework, automatic estimation of hinge axes and joint locations. Integrating part segmentation, edge-aware optimization, and motion constraints, the method effectively prevents boundary distortions during animation, significantly enhancing the realism of component-level vehicle motion and overall simulation fidelity.
📝 Abstract
Simulation is essential for autonomous driving, yet current frameworks often model vehicles as rigid assets and fail to capture part-level articulation. As perception algorithms increasingly exploit dynamic cues such as wheel steering or door opening, realistic simulation requires animatable vehicle representations. Existing CAD-based pipelines are limited by library coverage and fixed templates, preventing faithful reconstruction of in-the-wild instances.
We propose a generative framework that, from a single image or sparse multi-view input, synthesizes an animatable 3D Gaussian vehicle. Our method addresses two challenges: (i) large 3D asset generators are optimized for static quality but not articulation, leading to distortions at part boundaries when animated; and (ii) segmentation alone cannot provide the kinematic parameters required for motion. To address these challenges, we introduce a part-edge refinement module that enforces exclusive Gaussian ownership and a kinematic reasoning head that predicts joint positions and hinge axes of movable parts. Together, these components enable faithful part-aware simulation, bridging the gap between static generation and animatable vehicle models.
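To make the two mechanisms concrete, the following is a minimal sketch (not the paper's implementation) of how exclusive part ownership plus a predicted hinge can drive animation: each Gaussian belongs to exactly one part, and the owned subset is rigidly rotated about the predicted joint position and hinge axis, so Gaussians of neighboring parts are untouched and boundaries cannot smear. The function names, the hard-assignment array `part_ids`, and the fact that only Gaussian means are transformed (a full pipeline would also rotate each Gaussian's orientation and covariance) are simplifying assumptions.

```python
import numpy as np

def rotation_about_axis(axis, angle):
    """Rodrigues' formula: 3x3 rotation matrix about a unit axis."""
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def animate_part(means, part_ids, part, joint, axis, angle):
    """Rigidly rotate only the Gaussians exclusively owned by `part`
    about a hinge given by a joint position and axis direction.

    means:    (N, 3) Gaussian centers
    part_ids: (N,)   hard (exclusive) part assignment per Gaussian
    joint:    (3,)   predicted joint position on the hinge
    axis:     (3,)   predicted hinge axis direction
    """
    R = rotation_about_axis(np.asarray(axis, dtype=float), angle)
    out = means.copy()
    mask = part_ids == part
    # Translate to the joint, rotate about the axis, translate back;
    # Gaussians owned by other parts are left untouched.
    out[mask] = (means[mask] - joint) @ R.T + joint
    return out
```

Because ownership is exclusive (a hard `argmax` over part probabilities rather than soft blending), no Gaussian is partially dragged by two parts, which is the intuition behind the boundary-distortion fix; the hedged caveat is that the actual refinement module may realize this differently.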