🤖 AI Summary
Deploying and migrating LLM-driven AI agents in edge intelligence systems is difficult: resources are constrained, environments are heterogeneous, and real-time requirements are stringent, particularly in dynamic edge environments. Method: This paper proposes the first adaptive deployment and migration framework tailored for such scenarios. It jointly models resource constraints and end-to-end latency costs, introduces a lightweight state migration mechanism, and combines ant colony optimization with large language models for collaborative, distributed decision-making, enabling efficient agent placement and dynamic rescheduling. Contribution/Results: Implemented as a prototype on AgentScope and empirically evaluated across globally distributed edge servers, the framework significantly reduces deployment latency and migration overhead. Under multimodal workloads, it improves resource utilization by 32.7% and ensures QoS compliance for 99.2% of tasks.
📝 Abstract
The rise of LLMs such as ChatGPT and Claude fuels the need for AI agents capable of real-time task handling. Agents have traditionally been deployed in cloud data centers, but moving data-intensive, multi-modal edge workloads to the cloud introduces significant latency. Deploying AI agents at the edge reduces this latency and improves efficiency, yet edge resources are limited and heterogeneous. Maintaining QoS for mobile users also requires agent migration, which is difficult because an AI agent coordinates LLMs, task planning, memory, and external tools. This paper presents the first systematic deployment and management solution for LLM-based AI agents in dynamic edge environments. We propose a novel adaptive framework for AI agent placement and migration in edge intelligence systems. Our approach models resource constraints together with latency and cost, and leverages ant colony algorithms and LLM-based optimization for efficient decision-making. It autonomously places agents to optimize resource utilization and QoS, and it enables lightweight agent migration by transferring only essential state. Implemented on a distributed system using AgentScope and validated across globally distributed edge servers, our solution significantly reduces deployment latency and migration costs.
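
To make the placement idea concrete, the following is a minimal sketch of ant colony optimization applied to a capacity-constrained assignment of agents to edge servers. It is not the paper's algorithm: the abstract does not give the cost model, the pheromone update, or how the LLM takes part, so the function name `aco_placement`, its parameters, the single-resource capacity model, and the additive latency cost are all assumptions made for illustration. The LLM-based component is omitted entirely.

```python
import random


def aco_placement(demand, capacity, latency, n_ants=20, n_iters=50,
                  evaporation=0.5, seed=0):
    """Assign each agent to one server, minimizing total latency under capacity limits.

    Hypothetical model (not taken from the paper):
      demand[i]     : resource units agent i needs
      capacity[j]   : resource units server j offers
      latency[i][j] : latency cost of serving agent i from server j
    Returns (plan, cost), or (None, inf) if no feasible plan was found.
    """
    rng = random.Random(seed)
    n_agents, n_servers = len(demand), len(capacity)
    # One pheromone value per (agent, server) pair, initially uniform.
    pheromone = [[1.0] * n_servers for _ in range(n_agents)]
    best_plan, best_cost = None, float("inf")

    for _ in range(n_iters):
        found = []
        for _ in range(n_ants):
            free = list(capacity)
            plan, cost = [], 0.0
            for i in range(n_agents):
                # Only servers with enough remaining capacity are candidates.
                feasible = [j for j in range(n_servers) if free[j] >= demand[i]]
                if not feasible:
                    plan = None
                    break
                # Favor strong pheromone trails and low latency.
                weights = [pheromone[i][j] / (1.0 + latency[i][j]) for j in feasible]
                j = rng.choices(feasible, weights=weights)[0]
                free[j] -= demand[i]
                plan.append(j)
                cost += latency[i][j]
            if plan is not None:
                found.append((plan, cost))
                if cost < best_cost:
                    best_plan, best_cost = plan, cost
        # Evaporate, keeping a small floor so sampling weights stay positive.
        for row in pheromone:
            for j in range(n_servers):
                row[j] = max(row[j] * (1.0 - evaporation), 1e-6)
        # Each feasible ant reinforces its choices; cheaper plans deposit more.
        for plan, cost in found:
            for i, j in enumerate(plan):
                pheromone[i][j] += 1.0 / (1.0 + cost)
    return best_plan, best_cost


if __name__ == "__main__":
    plan, cost = aco_placement(
        demand=[2, 2, 1],
        capacity=[3, 3],
        latency=[[1, 5], [1, 5], [5, 1]],
    )
    print(plan, cost)
```

In the toy instance, both large agents prefer server 0 but cannot share it, so the search has to trade one agent's latency against the capacity limit. A real deployment would replace the scalar demand and latency table with measured, multi-dimensional resource and network profiles.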