🤖 AI Summary
To address the model update latency and staleness caused by network congestion in asynchronous distributed reinforcement learning (DRL), this paper proposes an in-network acceleration architecture built on programmable data planes. The method introduces three key innovations: (1) a dynamic queueing mechanism that supports in-network aggregation of model updates; (2) the Age-of-Model (AoM) metric, which quantifies update freshness and is combined with in-network feedback to preserve global fairness and responsiveness; and (3) a lightweight transmission control mechanism at the worker nodes that reduces redundant traffic, with the fairness and responsiveness properties checked via formal verification. Experimental evaluation demonstrates that the architecture significantly reduces model staleness and network congestion, achieving up to a 2.3× speedup in convergence rate and substantially outperforming baseline approaches in training efficiency.
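The AoM idea can be made concrete with a minimal sketch. Everything below is an illustrative assumption, not the paper's actual implementation: `Update`, `age_of_model`, `should_forward`, and the `aom_limit` threshold are hypothetical names, and AoM is modeled simply as the number of global model versions an update lags behind.

```python
# Hypothetical sketch of an Age-of-Model (AoM) style staleness check.
# All names and the threshold value are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class Update:
    worker_id: int
    base_version: int  # global model version the gradient was computed against


def age_of_model(update: Update, current_version: int) -> int:
    """AoM as a proxy for staleness: how many global versions old the update is."""
    return current_version - update.base_version


def should_forward(update: Update, current_version: int, aom_limit: int = 4) -> bool:
    """Gate transmission on a feedback-driven staleness limit (assumed policy)."""
    return age_of_model(update, current_version) <= aom_limit
```

Under this sketch, a worker holding an update computed against version 10 while the global model is at version 12 would see an AoM of 2 and forward the update; an update lagging by more than the limit would be held or dropped.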
📝 Abstract
Asynchronous Distributed Reinforcement Learning (DRL) can suffer from degraded convergence when model updates become stale, often as a result of network congestion and packet loss during large-scale training. This work introduces a network data-plane acceleration architecture that mitigates such staleness by enabling inline processing of DRL model updates as they traverse the accelerator engine. To this end, we design and prototype a novel queueing mechanism that opportunistically combines compatible updates sharing a network element, reducing redundant traffic while preserving update utility. Complementing this, we provide a lightweight transmission control mechanism at the worker nodes that is guided by feedback from the in-network accelerator. To assess model utility at line rate, we introduce the Age-of-Model (AoM) metric as a proxy for staleness and verify global fairness and responsiveness properties using a formal verification method. Our evaluations demonstrate that this architecture significantly reduces update staleness and congestion, ultimately improving the convergence rate in asynchronous DRL workloads.
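The abstract's notion of "opportunistically combining compatible updates" can be sketched as follows. This is a toy model under stated assumptions, not the paper's data-plane design: "compatible" is taken to mean updates computed against the same base model version, and combining is taken to be element-wise averaging; `aggregate_queue` is a hypothetical name.

```python
# Illustrative sketch (not the paper's implementation) of combining
# compatible model updates that share a queue. Assumption: updates with
# the same base model version are compatible and can be averaged into
# a single aggregate, so fewer packets leave the queue.
from collections import defaultdict


def aggregate_queue(queue):
    """queue: list of (base_version, gradient_vector) pairs.

    Returns one averaged gradient per base version, shrinking the
    outbound traffic while preserving the updates' aggregate content.
    """
    sums = defaultdict(lambda: None)
    counts = defaultdict(int)
    for version, grad in queue:
        if sums[version] is None:
            sums[version] = list(grad)
        else:
            sums[version] = [a + b for a, b in zip(sums[version], grad)]
        counts[version] += 1
    return {v: [x / counts[v] for x in sums[v]] for v in sums}
```

For example, two updates against version 3 with gradients `[1.0, 2.0]` and `[3.0, 4.0]` would collapse into a single averaged update `[2.0, 3.0]`, halving the traffic for that version.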