SVGen: Interpretable Vector Graphics Generation with Large Language Models

📅 2025-08-06
🤖 AI Summary
This work addresses three challenges in natural-language-to-SVG generation: poor semantic alignment, low structural integrity, and weak interpretability. The authors propose SVGen, an end-to-end generative model, and introduce SVG-1M, a large-scale dataset of high-quality SVGs paired with natural language descriptions. The method combines large language model fine-tuning with chain-of-thought (CoT) annotations, data augmentation, curriculum learning, and reinforcement learning optimization, so that the model learns to interpret the text and emit well-formed SVG code together. Experiments show that SVGen outperforms general-purpose large language models and traditional rendering-based methods in both effectiveness and efficiency. The model, source code, and dataset are publicly released to support reproducible research, with UI design automation and interpretable vector graphics generation as the target applications.

📝 Abstract
Scalable Vector Graphics (SVG) is widely used in front-end development and UI/UX design due to its scalability, editability, and rendering efficiency. However, turning creative ideas into precise vector graphics remains a time-consuming challenge. To address this, we introduce SVG-1M, a large-scale dataset of high-quality SVGs paired with natural language descriptions. Through advanced data augmentation and annotation, we create well-aligned Text-to-SVG training pairs, including a subset with Chain-of-Thought annotations for enhanced semantic guidance. Based on this dataset, we propose SVGen, an end-to-end model that generates SVG code from natural language inputs. Our approach ensures semantic accuracy and structural completeness, supported by curriculum learning and reinforcement learning optimization. Experiments show that SVGen outperforms general-purpose large models and traditional rendering methods in both effectiveness and efficiency. Code, model, and dataset are available on GitHub.
Problem

Research questions and friction points this paper is trying to address.

Turning creative ideas into precise vector graphics efficiently
Generating SVG code from natural language inputs accurately
Ensuring semantic accuracy and structural completeness in SVG generation
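
As a rough illustration of what a "structural completeness" check on generated output could look like, the sketch below tests whether a string parses as XML and has an `<svg>` root. This is an assumption for illustration only: the function name `is_well_formed_svg` is hypothetical, and the summary does not say how the paper itself measures structural validity.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def is_well_formed_svg(svg_text: str) -> bool:
    """Return True if svg_text parses as XML and its root element is <svg>.

    Hypothetical helper: a minimal well-formedness check, not the
    paper's evaluation metric.
    """
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError:
        # Unclosed or mismatched tags, bad attributes, and so on.
        return False
    # Accept both namespaced and un-namespaced root tags.
    return root.tag in ("svg", f"{{{SVG_NS}}}svg")
```

A check like this only catches syntax-level breakage such as unclosed tags; it says nothing about whether the drawing matches the prompt, which is the semantic-accuracy side of the problem.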
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale SVG dataset with text descriptions
End-to-end SVG generation from natural language
Curriculum and reinforcement learning optimization
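
Curriculum learning generally means presenting easier training examples before harder ones. The sketch below orders text/SVG pairs by a simple difficulty proxy, the number of XML elements in the SVG. Both the proxy and the names `svg_complexity` and `curriculum_order` are assumptions made for illustration; the summary does not describe how SVGen actually stages its curriculum.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def svg_complexity(svg_text: str) -> int:
    """Hypothetical difficulty score: count of elements in the SVG tree."""
    return sum(1 for _ in ET.fromstring(svg_text).iter())


def curriculum_order(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Sort (description, svg) pairs from simplest to most complex SVG.

    sorted() is stable, so pairs with equal scores keep their input order.
    """
    return sorted(pairs, key=lambda pair: svg_complexity(pair[1]))
```

For example, a pair whose SVG holds a single `<rect>` would be scheduled before one whose SVG nests several shapes inside a `<g>` group.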
Feiyu Wang
Fudan University
computer vision
Zhiyuan Zhao
Institute of Artificial Intelligence (TeleAI), China Telecom, Beijing, China
Yuandong Liu
Northwestern Polytechnical University, Xi’an, China
Da Zhang
Northwestern Polytechnical University, Xi’an, China; Institute of Artificial Intelligence (TeleAI), China Telecom, Beijing, China
Junyu Gao
Northwestern Polytechnical University, Xi’an, China; Institute of Artificial Intelligence (TeleAI), China Telecom, Beijing, China
Hao Sun
Institute of Artificial Intelligence (TeleAI), China Telecom, Beijing, China
Xuelong Li
Institute of Artificial Intelligence (TeleAI), China Telecom, Beijing, China