SkillDroid: Compile Once, Reuse Forever

📅 2026-04-16

🤖 AI Summary
This work addresses the inefficiency and unreliability of existing large language model (LLM)-driven mobile GUI agents, which stem from the lack of any mechanism for reusing experience. The authors propose a three-layer skill-based agent framework that, for the first time, compiles successful task trajectories into parameterized skill templates and reuses them with zero LLM calls, routing instructions through a matching cascade of regular-expression patterns, embedding similarity, and app filtering. A failure-driven learning mechanism recompiles skills when their reliability degrades, so performance keeps improving. In a longitudinal evaluation over 150 rounds, the system achieves an 85.3% task success rate (23 percentage points higher than the baseline) while reducing LLM invocations by 49%. In skill-replay rounds it attains 100% success at 2.4× the speed of full LLM execution, and its overall success rate improves from 87% to 91%.
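
The matching cascade can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Skill` fields, the `match` function, the 0.6 threshold, and the ordering of the stages are all assumptions made for illustration, and the bag-of-words `embed` function is only a toy stand-in for a real sentence-embedding model. The sketch also leaves parameters empty on the embedding path, because the summary does not say how slots are filled there.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Skill:
    """Hypothetical stored skill; field names are illustrative only."""
    name: str
    app: str
    pattern: str   # regex whose named groups act as parameter slots
    example: str   # representative instruction used for similarity


def embed(text: str) -> Counter:
    # Toy stand-in for a sentence embedding: a bag-of-words count vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def match(instruction: str, app: str, skills: list, threshold: float = 0.6):
    """Return (skill, params, route); route is 'regex', 'embedding', or 'llm'."""
    # App filtering: only skills compiled for the target app are candidates.
    candidates = [s for s in skills if s.app == app]

    # Stage 1: exact regex match, which also extracts the slot values.
    for skill in candidates:
        m = re.fullmatch(skill.pattern, instruction, flags=re.IGNORECASE)
        if m:
            return skill, m.groupdict(), "regex"

    # Stage 2: nearest stored skill by (toy) embedding similarity.
    query = embed(instruction)
    scored = [(cosine(query, embed(s.example)), s) for s in candidates]
    if scored:
        score, best = max(scored, key=lambda pair: pair[0])
        if score >= threshold:
            return best, {}, "embedding"

    # No stored skill is close enough: fall back to full LLM execution.
    return None, {}, "llm"
```

Only an instruction that misses both stages reaches the `"llm"` route, which is how a cascade of this kind would cut LLM invocations as the skill library grows.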

📝 Abstract
LLM-based mobile GUI agents treat every task invocation as an independent reasoning episode, requiring a full LLM inference call at each action step. This per-step dependence makes them stateless: a task completed successfully yesterday is re-derived from scratch today, with no improvement in reliability or speed. We present SkillDroid, a three-layer skill agent that compiles successful LLM-guided GUI trajectories into parameterized skill templates (sequences of UI actions with weighted element locators and typed parameter slots) and replays them on future invocations without any LLM calls. A matching cascade (regex patterns, embedding similarity, and app filtering) routes incoming instructions to stored skills, while a failure-learning layer triggers recompilation when skill reliability degrades. Over a 150-round longitudinal evaluation with systematic instruction variation and controlled perturbations, SkillDroid achieves an 85.3% success rate (23 percentage points above a stateless LLM baseline) while using 49% fewer LLM calls. The skill replay mechanism achieves a perfect 100% success rate across 79 replay rounds at 2.4 times the speed of full LLM execution. Most critically, the system improves with use: its success rate converges upward from 87% to 91%, while the baseline degrades from 80% to 44%.
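
To make the template idea concrete, here is a rough Python sketch of a parameterized skill with weighted element locators, typed parameter slots, LLM-free replay, and a reliability check that flags the skill for recompilation. Every name in it (`Locator`, `Step`, `SkillTemplate`, `locate`, `replay`), the scoring rule, and the window and threshold values are hypothetical; the abstract does not specify them. Screens are passed in as pre-captured lists of attribute dictionaries so the example stays self-contained, whereas a real agent would read each screen from the device after every action.

```python
from dataclasses import dataclass, field


@dataclass
class Locator:
    attribute: str   # e.g. "resource_id" or "text" (illustrative names)
    value: str
    weight: float    # how much a match on this attribute counts


@dataclass
class Step:
    action: str      # e.g. "tap" or "type"
    locators: list[Locator]
    text: str = ""   # may contain "{slot}" placeholders


@dataclass
class SkillTemplate:
    name: str
    slots: dict[str, type]   # typed parameter slots
    steps: list[Step]
    history: list[bool] = field(default_factory=list)

    def bind(self, raw: dict[str, str]) -> dict:
        # Convert raw strings from the instruction into typed slot values.
        return {slot: typ(raw[slot]) for slot, typ in self.slots.items()}

    def needs_recompile(self, window: int = 5, min_rate: float = 0.6) -> bool:
        # Assumed policy: recompile once the recent success rate drops too low.
        recent = self.history[-window:]
        return len(recent) == window and sum(recent) / window < min_rate


def locate(step: Step, screen: list[dict], min_score: float = 0.5):
    """Return the index of the best-scoring element on screen, or None."""
    total = sum(loc.weight for loc in step.locators) or 1.0
    best_idx, best_score = None, 0.0
    for i, element in enumerate(screen):
        matched = sum(
            loc.weight for loc in step.locators
            if element.get(loc.attribute) == loc.value
        )
        if matched / total > best_score:
            best_idx, best_score = i, matched / total
    return best_idx if best_score >= min_score else None


def replay(skill: SkillTemplate, raw_params: dict[str, str], screens: list):
    """Replay the stored steps without any LLM call; return (success, log)."""
    params = skill.bind(raw_params)
    log = []
    for step, screen in zip(skill.steps, screens):
        idx = locate(step, screen)
        if idx is None:   # element missing: record the failure and stop
            skill.history.append(False)
            return False, log
        log.append((step.action, idx, step.text.format(**params)))
    skill.history.append(True)
    return True, log
```

Weighted locators let a step survive small UI changes: in this sketch an element whose text label changed can still be found through its higher-weight resource identifier. When replays keep failing, `needs_recompile` returns `True`, which is where a failure-learning layer would hand the task back to the LLM and compile a fresh template.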

Problem

Research questions and friction points this paper is trying to address.

LLM-based GUI agents
stateless execution
skill reuse
mobile automation
trajectory compilation

Innovation

Methods, ideas, or system contributions that make the work stand out.

skill compilation
parameterized skill templates
LLM-free replay
matching cascade
failure-learning