SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation

📅 2025-06-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Existing SVG benchmarks suffer from narrow real-world coverage, no complexity stratification, and fragmented evaluation dimensions, which hinders systematic assessment of LLMs and MLLMs on SVG understanding, editing, and generation. Method: We introduce SVGenius, the first comprehensive SVG-processing benchmark, comprising 2,377 queries built on real-world data from 24 application domains, with eight task categories, 18 metrics, and explicit complexity stratification. Models are evaluated along three progressive dimensions: understanding, editing, and generation. Contribution/Results: An evaluation of 22 models spanning different scales, architectures, training paradigms, and accessibility levels shows that (i) proprietary models lead overall, yet every model degrades systematically as SVG complexity increases; (ii) reasoning-enhanced training is more effective than pure parameter scaling; and (iii) style transfer remains the most challenging capability across all model types.

📝 Abstract
Large Language Models (LLMs) and Multimodal LLMs have shown promising capabilities for SVG processing, yet existing benchmarks suffer from limited real-world coverage, lack of complexity stratification, and fragmented evaluation paradigms. We introduce SVGenius, a comprehensive benchmark comprising 2,377 queries across three progressive dimensions: understanding, editing, and generation. Built on real-world data from 24 application domains with systematic complexity stratification, SVGenius evaluates models through 8 task categories and 18 metrics. We assess 22 mainstream models spanning different scales, architectures, training paradigms, and accessibility levels. Our analysis reveals that while proprietary models significantly outperform open-source counterparts, all models exhibit systematic performance degradation with increasing complexity, indicating fundamental limitations in current approaches; however, reasoning-enhanced training proves more effective than pure scaling for overcoming these limitations, though style transfer remains the most challenging capability across all model types. SVGenius establishes the first systematic evaluation framework for SVG processing, providing crucial insights for developing more capable vector graphics models and advancing automated graphic design applications. Appendix and supplementary materials (including all data and code) are available at https://zju-real.github.io/SVGenius.
Problem

Research questions and friction points this paper is trying to address.

Assessing LLMs' SVG understanding, editing, and generation capabilities
Addressing limited real-world coverage and complexity stratification in benchmarks
Evaluating model performance degradation with increasing task complexity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Comprehensive benchmark with 2,377 queries
Evaluates models through 8 task categories
Systematic complexity stratification for assessment
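The idea of complexity-stratified assessment can be illustrated with a minimal sketch. It assumes each query yields a single score in [0, 1] and carries a complexity tier label; the tier names ("easy", "hard"), the record layout, the helper names, and every score below are hypothetical placeholders for illustration, not SVGenius data or the authors' code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-query records: (task dimension, complexity tier, score).
# Tier names and scores are illustrative placeholders, not benchmark results.
RECORDS = [
    ("understanding", "easy", 0.90),
    ("understanding", "hard", 0.50),
    ("editing", "easy", 0.80),
    ("editing", "hard", 0.40),
    ("generation", "easy", 0.70),
    ("generation", "hard", 0.30),
]


def mean_score_by_tier(records):
    """Group scores by complexity tier and average each group."""
    buckets = defaultdict(list)
    for _task, tier, score in records:
        buckets[tier].append(score)
    return {tier: mean(scores) for tier, scores in buckets.items()}


def degradation(by_tier, low="easy", high="hard"):
    """Absolute drop in mean score from the low tier to the high tier."""
    return by_tier[low] - by_tier[high]


by_tier = mean_score_by_tier(RECORDS)
print(by_tier)
print(round(degradation(by_tier), 2))
```

Reporting one mean per tier, rather than a single pooled average, is what makes a drop on complex inputs visible; a pooled score would hide it whenever simple queries dominate the query set.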
Authors

Siqi Chen, Zhejiang University
Xinyu Dong, Selfiie Corporation (Machine Learning, LLM, Bioinformatics)
Haolei Xu, Zhejiang University
Xingyu Wu, Hong Kong Polytechnic University (Automated machine learning, Causality-based machine learning, Large foundation model, AutoML)
Fei Tang, Zhejiang University
Hang Zhang, Zhejiang University
Yuchen Yan, Zhejiang University
Linjuan Wu, Zhejiang University
Wenqi Zhang, Zhejiang University (Language Model, Multimodal Learning, Embodied Agents)
Guiyang Hou, Zhejiang University
Yongliang Shen, Zhejiang University
Weiming Lu, Zhejiang University (Natural Language Processing, Large Language Models, AGI)
Yueting Zhuang, Zhejiang University