🤖 AI Summary
The development of general-purpose agents is hindered by the scarcity of high-quality, long-horizon, cross-domain interactive data. To address this challenge, this work proposes AgentSkiller, a framework built on a directed acyclic graph (DAG)-based state-transition architecture and a cross-domain service integration mechanism. By combining domain ontologies, person-centric entity graphs, service blueprints that define MCP tool interfaces, and persona-driven simulators, AgentSkiller automatically generates multi-turn, verifiable, state-explicit, high-fidelity interaction data. The approach ensures recoverability and diversity, yielding approximately 11K high-quality synthetic samples. Empirical evaluation demonstrates significant gains on function-calling tasks, with the most pronounced improvements at larger model scales.
📝 Abstract
Large Language Model agents demonstrate potential in solving real-world problems via tools, yet generalist intelligence is bottlenecked by scarce high-quality, long-horizon data. Existing methods either collect privacy-constrained API logs or generate scripted interactions that lack diversity, and so struggle to produce the data required to scale capabilities. We propose AgentSkiller, a fully automated framework that synthesizes multi-turn interaction data across realistic, semantically linked domains. It employs a DAG-based architecture with explicit state transitions to ensure determinism and recoverability. The pipeline builds a domain ontology and a Person-Centric Entity Graph, defines tool interfaces via Service Blueprints for Model Context Protocol (MCP) servers, and populates environments with consistent databases and strict Domain Policies. A cross-domain fusion mechanism links services to simulate complex tasks. Finally, the pipeline creates user tasks by verifying solution paths, filters them via execution-based validation, and generates queries with a Persona-based Simulator for automated rollout. This produces reliable environments with clear state changes. To demonstrate effectiveness, we synthesized $\approx$ 11K interaction samples; experimental results indicate that models trained on this dataset achieve significant improvements in function calling over baselines, particularly at larger parameter scales.
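The core idea of a DAG-based environment with explicit state transitions can be sketched in a few lines: each tool call is a deterministic edge between states, so any trajectory can be replayed from the initial state (recoverability) and its final state checked against an expected outcome (execution-based validation). The sketch below is illustrative only; the class, tool, and state names are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Transition:
    # A single deterministic edge in the state DAG: calling `tool`
    # with `args` while in state `src` always yields state `dst`.
    tool: str
    args: tuple
    src: str
    dst: str

class DAGEnvironment:
    """Toy environment whose state changes form a DAG. Determinism of
    the edges makes trajectories replayable and hence verifiable."""

    def __init__(self, initial_state: str):
        self.initial_state = initial_state
        # (src_state, tool, args) -> dst_state
        self.edges: dict[tuple[str, str, tuple], str] = {}

    def register(self, t: Transition) -> None:
        self.edges[(t.src, t.tool, t.args)] = t.dst

    def replay(self, trajectory: list[tuple[str, tuple]]) -> str:
        # Re-execute the tool calls from the initial state; a KeyError
        # means a step is not a registered (valid) transition.
        state = self.initial_state
        for tool, args in trajectory:
            state = self.edges[(state, tool, args)]
        return state

    def verify(self, trajectory, expected_final: str) -> bool:
        # Execution-based validation: accept a trajectory only if it
        # replays cleanly and reaches the expected final state.
        try:
            return self.replay(trajectory) == expected_final
        except KeyError:
            return False

# Hypothetical cross-domain task linking two services (flights + hotels).
env = DAGEnvironment("start")
env.register(Transition("search_flights", ("NYC", "SFO"), "start", "flights_found"))
env.register(Transition("book_flight", ("UA100",), "flights_found", "flight_booked"))
env.register(Transition("book_hotel", ("Hilton",), "flight_booked", "trip_complete"))

traj = [("search_flights", ("NYC", "SFO")),
        ("book_flight", ("UA100",)),
        ("book_hotel", ("Hilton",))]
print(env.verify(traj, "trip_complete"))  # True: full path reaches the goal
```

In this framing, execution-based filtering of synthesized tasks amounts to discarding any trajectory for which `verify` fails, so every retained sample has a recoverable, state-explicit solution path.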