SkillGrad: Optimizing Agent Skills Like Gradient Descent

📅 2026-05-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing agent skills are often unreliable, incomplete, or outdated, and current skill-evolution methods lack an explicit optimization formulation. This work proposes SkillGrad, a gradient-descent-inspired framework for optimizing agent skills. SkillGrad treats the skill package as a structured, optimizable parameter and derives loss evidence from task execution trajectories. Using large language models, it performs automatic diagnosis, including contrastive diagnosis, to produce text-based gradients that indicate how the skill should be corrected. A momentum memory accumulates recurring diagnostic patterns to stabilize optimization across iterations, and an LLM-based patcher applies layer-aware edits to the skill package. Evaluated on SpreadsheetBench Verified and WikiTableQuestions across two backbone LLMs, SkillGrad outperforms the strongest training-based baseline by 6.7 percentage points on average. Ablation studies show that momentum and contrastive diagnosis both contribute to the final skill quality.

📝 Abstract

Agent skills provide a lightweight way to adapt LLM agents to specialized domains by storing reusable procedural knowledge in structured files. However, whether downloaded from third parties or self-generated, these skills are often unreliable, incomplete, or outdated. Existing skill-evolution methods often address these deficiencies through heuristic reflections without an explicit optimization formulation. In this paper, we propose SkillGrad, a gradient-descent-inspired framework for optimizing agent skills. SkillGrad treats the skill package as a structured parameter to optimize in a gradient descent fashion: task executions provide trajectory-level loss evidence, and automatic diagnoses then provide text-based gradients that indicate the correction directions. To stabilize optimization across iterations, a momentum agent accumulates recurring diagnostic patterns into a persistent memory overlay. Finally, an LLM-based patcher executes the parameter update by applying layer-aware edits to the skill package. Evaluated on SpreadsheetBench Verified and WikiTableQuestions, SkillGrad consistently outperforms training-based skill evolution baselines across two backbone LLMs, improving over the strongest training-based baseline by 6.7 percentage points on average. Ablations further show that momentum and contrastive diagnosis both contribute to the final skill quality.
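
The loop described in the abstract (execute tasks, diagnose failures, accumulate momentum, patch the skill) can be illustrated with a toy sketch. This is not the authors' implementation: every name here (`SkillPackage`, `run_task`, `diagnose`, `optimize`, `max_edits`) is hypothetical, the LLM agent and diagnoser are replaced by deterministic stand-ins, and ranking edits by how often a diagnosis recurs is only one assumed way to use a momentum memory.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class SkillPackage:
    """Hypothetical skill package: layer name -> list of textual rules."""
    layers: dict = field(default_factory=dict)


def run_task(skill, task):
    """Stand-in for an agent rollout (assumption): the task succeeds only
    if the skill already contains the rule the task depends on."""
    rules = {rule for layer in skill.layers.values() for rule in layer}
    return {"task": task, "success": task["needs"] in rules}


def diagnose(trajectory):
    """Stand-in for LLM diagnosis (assumption): map a failed trajectory to
    a 'text-based gradient', i.e. a suggested rule and its target layer."""
    task = trajectory["task"]
    return {"layer": task["layer"], "suggestion": task["needs"]}


def optimize(skill, tasks, steps=5, max_edits=1):
    """Toy optimization loop; max_edits loosely plays the role of a step size."""
    momentum = Counter()  # persistent memory of recurring diagnoses
    history = []          # fraction of tasks solved at each step
    for _ in range(steps):
        trajectories = [run_task(skill, t) for t in tasks]
        failures = [tr for tr in trajectories if not tr["success"]]
        history.append(1 - len(failures) / len(tasks))
        if not failures:
            break
        # Accumulate this step's diagnoses into the momentum memory.
        for tr in failures:
            g = diagnose(tr)
            momentum[(g["layer"], g["suggestion"])] += 1
        # Patcher stand-in: apply the most frequently recurring unapplied
        # suggestions to their target layers, at most max_edits per step.
        applied = 0
        for (layer, suggestion), _count in momentum.most_common():
            if applied >= max_edits:
                break
            if suggestion not in skill.layers[layer]:
                skill.layers[layer].append(suggestion)
                applied += 1
    return skill, momentum, history


# Hypothetical tasks: two depend on one procedure rule, one on an overview rule.
demo_tasks = [
    {"needs": "trim header whitespace", "layer": "procedures"},
    {"needs": "trim header whitespace", "layer": "procedures"},
    {"needs": "state the sheet name first", "layer": "overview"},
]
demo_skill = SkillPackage(layers={"overview": [], "procedures": []})
demo_skill, demo_momentum, demo_history = optimize(demo_skill, demo_tasks)
print(demo_skill.layers, demo_history)
```

In this toy setting the diagnosis that recurs across more failed trajectories is patched first, so the skill gains the procedure rule before the overview rule. A real system would replace `run_task` and `diagnose` with agent rollouts and LLM calls.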

Problem

Research questions and friction points this paper is trying to address.

agent skills

skill optimization

LLM agents

gradient descent

skill evolution

Innovation

Methods, ideas, or system contributions that make the work stand out.

Skill Optimization

Gradient Descent Analogy

LLM Agents