🤖 AI Summary
This work addresses the challenge of improving UI design efficiency, diversity, and creative quality through AI augmentation. It proposes a human-AI collaborative, "AI-inspired" design paradigm that integrates large language models (LLMs), vision-language models (VLMs), and diffusion models (DMs) adapted for UI generation. The approach comprises three technical pathways: (1) LLM-driven UI generation and iterative refinement; (2) VLM-enabled cross-modal semantic retrieval over application screenshots; and (3) UI image synthesis via a domain-adapted DM. The paper presents an end-to-end, AI-inspired design process that combines these three model classes while preserving human designers' creative agency, showing how they can stimulate inspiration, accelerate iteration, and diversify solutions. It delineates the technical boundaries of each approach and identifies critical ethical challenges, including attribution, bias, and design autonomy.
📝 Abstract
A Graphical User Interface (or simply UI) is a primary means of interaction between users and their devices. In this paper, we discuss three complementary Artificial Intelligence (AI) approaches for triggering the creativity of app designers and inspiring them to create better and more diverse UI designs. First, designers can prompt a Large Language Model (LLM) to directly generate and adjust UIs. Second, a Vision-Language Model (VLM) enables designers to effectively search a large screenshot dataset, e.g., one built from apps published in app stores. Third, a Diffusion Model (DM) can be trained specifically to generate UIs as inspirational images. We present an AI-inspired design process and discuss the implications and limitations of these approaches.
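The second pathway, VLM-based screenshot search, reduces at query time to ranking precomputed image embeddings against a text-query embedding in a shared space. The sketch below shows only that ranking step; it is a minimal illustration, assuming the vectors were produced beforehand by a CLIP-style text and image encoder pair (the toy 2-D embeddings here stand in for real VLM outputs).

```python
import numpy as np

def rank_screenshots(query_emb: np.ndarray, screenshot_embs: np.ndarray):
    """Rank screenshots by cosine similarity to a text-query embedding.

    In a real system both arguments would come from a shared VLM
    embedding space (e.g. CLIP-style encoders); here they are plain
    vectors so the ranking logic stands on its own.
    """
    q = query_emb / np.linalg.norm(query_emb)
    s = screenshot_embs / np.linalg.norm(screenshot_embs, axis=1, keepdims=True)
    scores = s @ q                      # cosine similarity per screenshot
    order = np.argsort(-scores)         # best match first
    return order, scores

# Toy example: three "screenshot" embeddings; the query vector lies
# closest in direction to the screenshot at index 1.
embs = np.array([[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]])
query = np.array([0.5, 0.9])
order, scores = rank_screenshots(query, embs)
print(order[0])  # → 1
```

The same ranking loop scales to a large app-store screenshot corpus by storing the normalized image embeddings in an approximate-nearest-neighbor index instead of a dense matrix.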