Macaron-A2UI: A Model for Generative UI in Personal Agents

📅 2026-05-23
📈 Citations: 0 (influential: 0)
🤖 AI Summary
Static plain-text chat interfaces limit personal agents on complex tasks because they offer no dynamic, context-aware interaction. This work presents Macaron-A2UI, an end-to-end generative user interface (Generative UI) model that produces natural language responses together with lightweight, executable UI actions in real time, without relying on explicit schema hints, to support information collection, preference refinement, confirmation, and multi-goal organization. To advance research in this direction, the authors build a large-scale Generative UI corpus from heterogeneous dialogue sources and introduce the A2UI-Bench evaluation benchmark. They train models at 30B, 235B, and 754B parameters using parameter-efficient LoRA-based supervised fine-tuning followed by reward-driven reinforcement learning. The best-performing model scores 75.6 overall on A2UI-Bench, surpassing the strongest full-schema frontier baseline. The models, benchmark, and evaluation protocol are publicly released.
📝 Abstract
As personal agents evolve to handle complex, user-centric tasks, static plain-text chat is rapidly becoming a bottleneck. Generative UI emerges as the necessary new interface layer, dynamically synthesizing the right controls, options, and state from the interaction context in real time. We present Macaron-A2UI, a model for Generative UI in personal agents. Our goal is to move beyond text-only interaction by enabling agents to generate natural language together with lightweight, executable UI actions for information collection, preference refinement, confirmation, and multi-goal organization. We build a large-scale Generative UI corpus from heterogeneous dialogue sources, introduce A2UI-Bench for controlled evaluation, and train 30B, 235B and 754B models with parameter-efficient LoRA-based supervised fine-tuning followed by reward-driven reinforcement learning. The best Macaron-A2UI model reaches 75.6 overall on A2UI-Bench without explicit schema hints, surpassing the strongest full-schema frontier baseline. We release the models, benchmark, and evaluation protocol to support future work on Generative UI for personal agents.
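To make the idea of "natural language together with lightweight, executable UI actions" concrete, here is a minimal Python sketch of what one agent turn might look like as a data structure. The abstract does not specify the A2UI output schema, so everything here is an assumption for illustration: the class names `UIAction` and `AgentTurn`, the field names, the four action kinds (named after the four uses listed in the abstract), and the `render_plain` helper are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical action kinds, named after the four uses listed in the abstract.
# The real Macaron-A2UI schema may differ.
ACTION_KINDS = {"collect_info", "refine_preference", "confirm", "organize_goals"}


@dataclass
class UIAction:
    """One lightweight UI element the agent asks the client to show."""
    kind: str
    label: str
    options: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject kinds outside the assumed vocabulary.
        if self.kind not in ACTION_KINDS:
            raise ValueError(f"unknown action kind: {self.kind}")


@dataclass
class AgentTurn:
    """A single response: free text plus zero or more UI actions."""
    text: str
    actions: list[UIAction] = field(default_factory=list)


def render_plain(turn: AgentTurn) -> str:
    """Fallback rendering for a text-only client."""
    lines = [turn.text]
    for i, action in enumerate(turn.actions, start=1):
        opts = f" [{' | '.join(action.options)}]" if action.options else ""
        lines.append(f"({i}) {action.kind}: {action.label}{opts}")
    return "\n".join(lines)


turn = AgentTurn(
    text="I found three flights. Which matters more to you?",
    actions=[
        UIAction("refine_preference", "Priority", ["Price", "Duration"]),
        UIAction("confirm", "Book the cheapest option"),
    ],
)
print(render_plain(turn))
```

A structure like this keeps the text and the UI actions in one response, so a client that cannot draw controls can still fall back to plain text.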
Problem

Research questions and friction points this paper is trying to address.

Generative UI
personal agents
human-computer interaction
dynamic interface
UI generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative UI
Personal Agents
LoRA-based Fine-tuning
Reinforcement Learning
A2UI-Bench