🤖 AI Summary
Existing federated learning approaches suffer from gradient conflicts and unstable global optimization in open-ended, privacy-constrained scenarios where large language model (LLM) agents must self-evolve across heterogeneous environments.
Method: We propose a decentralized federated evolution framework featuring a “local evolution–global aggregation” paradigm: (i) lightweight local policy evolution via parameter-efficient fine-tuning (PEFT); (ii) trajectory-level high-return sample filtering to enhance policy quality; and (iii) a novel low-rank subspace gradient aggregation mechanism that decouples environment-specific dynamics and mitigates negative transfer.
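The trajectory-level filtering step (ii) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `Trajectory` container and the quantile-based cutoff are assumptions, since the summary does not specify how "high-return" samples are selected.

```python
from dataclasses import dataclass

@dataclass
class Trajectory:
    steps: list   # (state, action) pairs from one agent episode
    ret: float    # trajectory-level return (e.g., sparse rewards summed)

def filter_high_return(trajectories, quantile=0.8):
    """Keep only trajectories whose return reaches the given quantile.

    The quantile cutoff is a hypothetical choice for illustration; any
    trajectory-level selection rule (top-k, threshold) fits the same slot.
    """
    if not trajectories:
        return []
    rets = sorted(t.ret for t in trajectories)
    idx = min(int(quantile * len(rets)), len(rets) - 1)
    threshold = rets[idx]
    return [t for t in trajectories if t.ret >= threshold]
```

Only the surviving trajectories would then feed the local PEFT update, which is what keeps the per-client gradients stable.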
Contribution/Results: Experiments across five heterogeneous environments demonstrate an average improvement of approximately 18% in task success rate over state-of-the-art federated baselines. To our knowledge, this is the first work to achieve robust, privacy-preserving cross-environment self-evolution of LLM agents.
📝 Abstract
LLM agents are widely deployed in complex interactive tasks, yet privacy constraints often preclude centralized optimization and co-evolution across dynamic environments. While Federated Learning (FL) has proven effective on static datasets, its extension to the open-ended self-evolution of agents remains underexplored. Directly applying standard FL is challenging: heterogeneous tasks and sparse, trajectory-level rewards introduce severe gradient conflicts, destabilizing the global optimization process. To bridge this gap, we propose Fed-SE, a Federated Self-Evolution framework for LLM agents. Fed-SE establishes a “local evolution–global aggregation” paradigm. Locally, agents employ parameter-efficient fine-tuning on filtered, high-return trajectories to achieve stable gradient updates. Globally, Fed-SE aggregates updates within a low-rank subspace that disentangles environment-specific dynamics, effectively reducing negative transfer across clients. Experiments across five heterogeneous environments demonstrate that Fed-SE improves average task success rates by approximately 18% over federated baselines, validating its effectiveness for robust cross-environment knowledge transfer in privacy-constrained deployments.
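One plausible reading of the global step is that each client's PEFT update arrives as a low-rank (LoRA-style) factor pair, and the server averages the full-rank deltas before re-projecting the mean onto a rank-r subspace, so low-energy, environment-specific directions are discarded. The sketch below assumes exactly that; the paper's actual aggregation mechanism may differ.

```python
import numpy as np

def aggregate_low_rank(client_updates, rank):
    """Hypothetical subspace aggregation of clients' LoRA-style updates.

    client_updates: list of (B, A) factor pairs, each delta_W = B @ A.
    Averages the reconstructed deltas, then truncates via SVD to keep
    only the dominant shared directions (an assumption, not the paper's
    stated algorithm).
    """
    mean_delta = sum(B @ A for B, A in client_updates) / len(client_updates)
    U, s, Vt = np.linalg.svd(mean_delta, full_matrices=False)
    B_glob = U[:, :rank] * s[:rank]   # absorb singular values into B
    A_glob = Vt[:rank]
    return B_glob, A_glob             # global delta = B_glob @ A_glob
```

The returned global factors would then be broadcast back to clients as the starting point for the next local-evolution round.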