VectorArk: Learning Practical Image Vectorization with Rounded Polygon Representation

📅 2026-05-23
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing vision-language model–based image vectorization methods perform well on synthetic data but exhibit limited generalization in real-world scenarios, such as images produced by unknown rasterization processes or text-to-image generation. To address this, this work proposes VectorArk, a practical vectorization approach that introduces a rounded polygon parameterization to naturally generate smooth visual primitives, thereby simplifying the learning process. Additionally, it incorporates an image degradation model to enhance robustness against imperfect and diverse inputs. Experiments demonstrate that VectorArk significantly outperforms current methods across multiple datasets, excelling particularly in geometric completeness and artifact suppression. Ablation studies further confirm the effectiveness of each proposed component.
πŸ“ Abstract
Recent vision-language model (VLM)-based approaches have achieved impressive results on image vectorization tasks. However, they are typically evaluated on synthetic benchmarks, where clean SVGs are rasterized at high resolution and then re-vectorized. As a result, these methods generalize poorly to real-world scenarios, such as images with unknown rasterization methods or those generated by text-to-image models. We introduce VectorArk, a new VLM-based model designed for robust and practical image vectorization. VectorArk employs a novel rounded polygon representation that simplifies the learning process while naturally producing smooth, visually appealing primitives. We also propose a degradation model that enhances robustness across diverse and imperfect inputs. Our experiments show that, in contrast to previous methods, VectorArk achieves superior geometric completeness and artifact suppression across multiple datasets, with comprehensive ablations validating the contribution of each component.
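To make the rounded polygon idea concrete, here is a minimal sketch of how such a primitive could be turned into an SVG path. The abstract does not specify VectorArk's actual parameterization, so everything here is an assumption made for illustration: the function name `rounded_polygon_path`, the per-vertex `roundings` values (the distance from each vertex at which the curve begins), and the choice of a quadratic Bézier to smooth each corner.

```python
import math


def rounded_polygon_path(vertices, roundings):
    """Build an SVG path string for a polygon with smoothed corners.

    Hypothetical parameterization (not taken from the paper):
    vertices  -- list of (x, y) corner points, in order
    roundings -- per-vertex distance along each adjacent edge at which
                 the corner curve starts and ends
    """
    n = len(vertices)
    if n < 3 or len(roundings) != n:
        raise ValueError("need at least 3 vertices and one rounding per vertex")

    def fmt(point):
        return f"{point[0]:.2f},{point[1]:.2f}"

    parts = []
    for i in range(n):
        px, py = vertices[i - 1]          # previous vertex (wraps around)
        cx, cy = vertices[i]              # current corner
        nx, ny = vertices[(i + 1) % n]    # next vertex (wraps around)

        len_prev = math.hypot(px - cx, py - cy)
        len_next = math.hypot(nx - cx, ny - cy)
        if len_prev == 0 or len_next == 0:
            raise ValueError("consecutive vertices must be distinct")

        # Clamp to half of each adjacent edge so neighbouring corners
        # never overlap.
        d = min(roundings[i], len_prev / 2, len_next / 2)

        # Points where the straight edges hand over to the corner curve.
        start = (cx + (px - cx) * d / len_prev, cy + (py - cy) * d / len_prev)
        end = (cx + (nx - cx) * d / len_next, cy + (ny - cy) * d / len_next)

        # Move to the first curve start, draw lines to the later ones.
        parts.append(f"{'M' if i == 0 else 'L'} {fmt(start)}")
        # Quadratic Bezier with the original corner as control point.
        parts.append(f"Q {fmt((cx, cy))} {fmt(end)}")

    parts.append("Z")
    return " ".join(parts)


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(rounded_polygon_path(square, [2, 2, 2, 2]))
```

Under this assumed encoding, a shape is described by a handful of vertices plus one scalar per corner, and smooth outlines come from the conversion step rather than from explicitly predicted curve control points. This may be the kind of simplification the abstract alludes to.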
Problem

Research questions and friction points this paper is trying to address.

image vectorization
vision-language model
real-world generalization
rasterization artifacts
SVG reconstruction
Innovation

Methods, ideas, or system contributions that make the work stand out.

rounded polygon representation
image vectorization
vision-language model
degradation model
robustness