AI Summary
Federated learning (FL) systems are complex to design and often brittle because of data heterogeneity and system constraints, which motivates automated construction methods. Method: We propose Helmsman, the first end-to-end FL system auto-synthesis framework, built on a multi-agent collaborative architecture that integrates human-in-the-loop planning, modular supervised code generation, and sandboxed closed-loop simulation and refinement. We introduce a three-stage collaborative development pipeline and release AgentFL-Bench, the first benchmark platform dedicated to FL system generation. Results: Extensive experiments across 16 heterogeneous tasks show that Helmsman-generated systems match or surpass manually designed baselines in performance, while significantly improving development efficiency, generalizability, and robustness of FL systems.
Abstract
Federated Learning (FL) offers a powerful paradigm for training models on decentralized data, but its promise is often undermined by the immense complexity of designing and deploying robust systems. The need to select, combine, and tune strategies for multifaceted challenges like data heterogeneity and system constraints has become a critical bottleneck, resulting in brittle, bespoke solutions. To address this, we introduce Helmsman, a novel multi-agent system that automates the end-to-end synthesis of federated learning systems from high-level user specifications. It emulates a principled research and development workflow through three collaborative phases: (1) interactive human-in-the-loop planning to formulate a sound research plan, (2) modular code generation by supervised agent teams, and (3) a closed loop of autonomous evaluation and refinement in a sandboxed simulation environment. To facilitate rigorous evaluation, we also introduce AgentFL-Bench, a new benchmark comprising 16 diverse tasks designed to assess the system-level generation capabilities of agentic systems in FL. Extensive experiments demonstrate that our approach generates solutions competitive with, and often superior to, established hand-crafted baselines. Our work represents a significant step towards the automated engineering of complex decentralized AI systems.