UIBenchKit: A unified toolkit for design-to-code model evaluation

📅 2026-05-13
📈 Citations: 0
Influential: 0

🤖 AI Summary
Design-to-code generation methods are hard to compare fairly because no unified evaluation protocol exists, which hinders both research progress and practical adoption. This work addresses the gap with UIBenchKit, an open-source toolkit that unifies environment setup, model inference, code rendering, and comparison across multiple metrics in a plug-and-play architecture. Using the toolkit, the authors benchmark existing design-to-code tools under consistent settings and derive findings that point to directions for future improvement. The toolkit is intended as a streamlined environment for both experimentation and evaluation.
📝 Abstract
Recent years have seen substantial progress in automated design-to-code generation, with many methods proposed for generating HTML and CSS from webpage screenshots. However, the absence of a standardized evaluation platform makes it difficult to compare these methods fairly, limiting both practical adoption and systematic research progress. To bridge this gap, we introduce UIBenchKit, an open-source, integrated toolkit designed to unify the evaluation of design-to-code tasks. UIBenchKit abstracts the complexities of environment setup, model inference, and code rendering, offering researchers a plug-and-play architecture to compare various methods under consistent settings. In addition, it offers an analytical interface for comparison across multiple metrics. Using UIBenchKit, we conduct a benchmarking study of existing tools and derive several findings that highlight directions for future improvement. By providing a streamlined environment for both experimentation and evaluation, UIBenchKit aims to accelerate future benchmarking and innovations in web engineering. The evaluation platform and toolkit are available at the project page https://www.uibenchkit.com/.
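To make the plug-and-play idea concrete, here is a minimal sketch of an evaluation harness in which methods and metrics are interchangeable callables run under the same settings. It is not UIBenchKit's API: the names `Sample`, `evaluate`, `text_similarity`, and `length_ratio` are hypothetical, and the two metrics are simple text-based stand-ins for the rendering-based metrics a real design-to-code benchmark would use.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Dict, List

# Hypothetical interfaces (assumptions, not UIBenchKit's API):
# a method maps a screenshot path to generated HTML,
# a metric scores generated HTML against reference HTML.
Method = Callable[[str], str]
Metric = Callable[[str, str], float]


@dataclass
class Sample:
    screenshot: str      # path to the input screenshot (never opened in this sketch)
    reference_html: str  # ground-truth HTML for that screenshot


def text_similarity(generated: str, reference: str) -> float:
    """Stand-in metric: character-level similarity ratio in [0, 1]."""
    return SequenceMatcher(None, generated, reference).ratio()


def length_ratio(generated: str, reference: str) -> float:
    """Stand-in metric: shorter length divided by longer length, in [0, 1]."""
    longer = max(len(generated), len(reference))
    if longer == 0:
        return 1.0  # two empty documents are treated as identical
    return min(len(generated), len(reference)) / longer


def evaluate(
    methods: Dict[str, Method],
    metrics: Dict[str, Metric],
    samples: List[Sample],
) -> Dict[str, Dict[str, float]]:
    """Run every method on every sample and average each metric per method."""
    if not samples:
        raise ValueError("at least one sample is required")
    results: Dict[str, Dict[str, float]] = {}
    for method_name, method in methods.items():
        # Inference step: every method sees exactly the same inputs.
        outputs = [method(sample.screenshot) for sample in samples]
        results[method_name] = {
            metric_name: sum(
                metric(output, sample.reference_html)
                for output, sample in zip(outputs, samples)
            ) / len(samples)
            for metric_name, metric in metrics.items()
        }
    return results


if __name__ == "__main__":
    samples = [Sample("page.png", "<h1>Title</h1><p>Body</p>")]
    methods = {
        "copy_reference": lambda path: "<h1>Title</h1><p>Body</p>",
        "heading_only": lambda path: "<h1>Title</h1>",
    }
    metrics = {"text_similarity": text_similarity, "length_ratio": length_ratio}
    for name, scores in evaluate(methods, metrics, samples).items():
        print(name, scores)
```

Adding a method or metric is a matter of adding one dictionary entry, so every comparison uses the same samples and the same scoring code. A real pipeline would also render the generated HTML and compare it visually with the screenshot, a step this sketch omits.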
Problem

Research questions and friction points this paper is trying to address.

design-to-code
evaluation platform
benchmarking
UI generation
code generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

design-to-code
evaluation benchmark
UIBenchKit
HTML/CSS generation
standardized evaluation