🤖 AI Summary
Existing image vectorization methods process only visible pixels and disregard occlusion relationships, resulting in SVG outputs with ambiguous semantics, incomplete geometry, and limited editability. This work proposes a two-stage framework: first, a vision-language model–guided Semantic Layer Peeling (SLP) decouples and completes occluded objects in the raster domain; second, an error-budget–driven Adaptive Layered Vectorization (ALV) converts each semantic layer independently into vector graphics. The approach achieves, for the first time, complete geometric reconstruction and semantically layered vectorization of occluded objects in natural images. It produces SVGs with high visual fidelity, full structural integrity, and distinct semantic layers, enabling object-level vector editing and overcoming fundamental limitations of conventional vectorization techniques.
📝 Abstract
We introduce AmodalSVG, a new framework for amodal image vectorization that produces semantically organized and geometrically complete SVG representations from natural images. Existing vectorization methods operate under a modal paradigm, tracing only visible pixels and disregarding occlusion. Consequently, the resulting SVGs are semantically entangled and geometrically incomplete, which limits their structural editability. In contrast, AmodalSVG reconstructs full object geometries, including occluded regions, as independent, editable vector layers. To achieve this, AmodalSVG reformulates image vectorization as a two-stage process: semantic decoupling and completion are performed in the raster domain to produce amodally complete semantic layers, which are then vectorized independently. In the first stage, we introduce Semantic Layer Peeling (SLP), a VLM-guided strategy that progressively decomposes an image into semantically coherent layers. Using hybrid inpainting, SLP recovers complete object appearances under occlusion, enabling explicit semantic decoupling. To vectorize these layers efficiently, we propose Adaptive Layered Vectorization (ALV), which dynamically modulates the primitive budget via an error-budget-driven adjustment mechanism. Extensive experiments demonstrate that AmodalSVG significantly outperforms prior methods in visual fidelity. Moreover, the resulting amodal layers enable object-level editing directly in the vector domain, a capability not supported by existing vectorization approaches. Code will be released upon acceptance.
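The abstract does not specify how ALV's error-budget-driven adjustment works. As a purely illustrative sketch, one plausible reading is that a fixed total number of vector primitives is shared across the semantic layers in proportion to each layer's current reconstruction error. The function name `allocate_primitives`, its parameters, and the proportional rule are all assumptions made here for illustration, not the authors' method.

```python
def allocate_primitives(layer_errors, total_budget, min_per_layer=1):
    """Split total_budget primitives across layers in proportion to error.

    Hypothetical helper: layer_errors holds one non-negative reconstruction
    error per semantic layer. The proportional rule is an assumption.
    """
    n = len(layer_errors)
    if n == 0:
        return []
    if total_budget < n * min_per_layer:
        raise ValueError("budget too small to give every layer its minimum")

    # Reserve the per-layer minimum, then share the rest by error.
    remaining = total_budget - n * min_per_layer
    total_error = sum(layer_errors)
    if total_error == 0:
        shares = [remaining / n] * n
    else:
        shares = [remaining * e / total_error for e in layer_errors]

    alloc = [min_per_layer + int(s) for s in shares]

    # Hand out primitives lost to rounding down, largest fractional part first.
    leftover = total_budget - sum(alloc)
    order = sorted(range(n), key=lambda i: shares[i] - int(shares[i]), reverse=True)
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc


# Three layers with errors 6, 3, 1 sharing 100 primitives.
budget = allocate_primitives([6, 3, 1], total_budget=100)
```

Under this reading, layers that are still poorly reconstructed receive more primitives, while every layer keeps at least `min_per_layer`. An iterative variant could recompute the errors after each fitting round and call the function again.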