🤖 AI Summary
This work addresses the challenge of generating consistent multi-view SVGs from a single input view, specifically tackling cross-view geometric distortion and color inconsistency. The proposed method introduces an end-to-end framework grounded in 3D reconstruction, incorporating a cross-view spatial memory mechanism to establish part-level multi-view correspondences. It integrates an extended SAM2-based spatial memory bank, rasterization-assisted supervision, path merging, and structural optimization, which together yield consistent paths and colors without retraining. Its key contribution is the first application of spatial memory mechanisms to multi-view vector graphics generation, removing redundant paths while preserving structure. Experiments demonstrate significant improvements over state-of-the-art methods in geometric fidelity, color consistency, and fine-detail retention, enabling high-quality vector asset creation and semantics-aware editing.
📝 Abstract
Scalable Vector Graphics (SVGs) are central to modern design workflows, offering scaling without distortion and precise editability. However, generating multi-view-consistent SVGs of a single object from a single-view input remains underexplored. We present a three-stage framework that produces multi-view SVGs with geometric and color consistency from a single SVG input. First, the rasterized input is lifted to a 3D representation and rendered under target camera poses, producing multi-view images of the object. Next, we extend the temporal memory mechanism of Segment Anything 2 (SAM2) to the spatial domain, constructing a spatial memory bank that establishes part-level correspondences across neighboring views, yielding cleaner and more consistent vector paths and color assignments without retraining. Finally, during the raster-to-vector conversion, we perform path consolidation and structural optimization to reduce redundancy while preserving boundaries and semantics. The resulting SVGs exhibit strong geometric and color consistency across views, significantly reduce redundant paths, and retain fine structural details. This work bridges generative modeling and structured vector representation, providing a scalable route to single-input, object-level multi-view SVG generation and supporting applications such as asset creation and semantic vector editing.
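To make the final stage concrete, the sketch below illustrates one way part-level correspondences could drive color harmonization and path consolidation. It is an illustration under stated assumptions, not the authors' implementation: the `VecPath` type, the `part_id` label (standing in for the output of the spatial memory bank), the majority-vote color rule, and the merge-by-concatenation strategy are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class VecPath:
    """Hypothetical container for one vector path in one view."""
    part_id: int  # assumed part label shared across views
    fill: str     # hex color such as "#ff0000"
    d: str        # SVG path data, assumed to start with an absolute "M"


def canonical_fills(views):
    """Pick the most frequent fill for each part across all views."""
    votes = defaultdict(Counter)
    for paths in views.values():
        for p in paths:
            votes[p.part_id][p.fill] += 1
    return {part: c.most_common(1)[0][0] for part, c in votes.items()}


def consolidate(views):
    """Give each part one color in every view and merge its paths.

    Paths of the same part are merged by concatenating their path data
    into a single compound path. This assumes the subpaths do not
    overlap; overlapping subpaths would need a real boolean union.
    """
    fills = canonical_fills(views)
    result = {}
    for name, paths in views.items():
        grouped = defaultdict(list)
        for p in paths:
            grouped[p.part_id].append(p.d.strip())
        result[name] = [
            VecPath(part, fills[part], " ".join(ds))
            for part, ds in grouped.items()
        ]
    return result
```

Calling `consolidate` on a mapping from view name to a list of `VecPath` objects returns the same mapping with one path per part per view, all sharing the voted color. Majority voting is only one plausible rule; an area-weighted average or a reference-view color would fit the same interface.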