LCMP: Distributed Long-Haul Cost-Aware Multi-Path Routing for Inter-Datacenter RDMA Networks

📅 2026-04-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the performance limitations of existing routing schemes in inter-datacenter RDMA networks, which suffer from path asymmetry, delayed congestion signals, and routing collisions among flows that make decisions simultaneously. The paper proposes LCMP, a distributed framework that combines a unified path-quality score computed in the control plane with compact on-switch congestion signals, enabling low-cost, low-latency, and congestion-responsive multipath placement of RDMA flows. To resolve decision collisions among concurrent flows, LCMP filters out high-cost candidate paths and then applies a diversity-preserving hash within the reduced set. On an 8-datacenter testbed, LCMP reduces median and tail flow completion time (FCT) slowdown by up to 76% and 64%, respectively, compared to state-of-the-art datacenter network routing strategies. Large-scale NS-3 simulations of a 2000 km inter-datacenter scenario confirm similar improvements.
📝 Abstract
RDMA-empowered cloud services are increasingly deployed across datacenters (DCs) connected by multiple paths. These deployments exhibit new properties (path asymmetry, delayed congestion signals, and simultaneous flow routing collisions) that cause existing routing methods to fail. We present LCMP, a distributed long-haul cost-aware multi-path routing framework that places RDMA flows on multiple inter-DC paths to achieve low-cost, low-latency, and congestion-responsive transmission. LCMP combines a control-plane path-quality score with compact on-switch congestion signals: the former unifies quality assessment across asymmetric paths, and the latter enables a responsive reaction to path congestion. LCMP further resolves the simultaneous flow decision collision problem by filtering out high-cost candidates and performing a diversity-preserving hash within the reduced set. On an 8-DC testbed, LCMP reduces median and tail FCT slowdown by up to 76% and 64%, respectively, compared to state-of-the-art (SOTA) DCN routing strategies. Large-scale NS-3 simulations under a 2000 km inter-DC scenario confirm similar improvements.
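The abstract describes a two-step decision: filter out high-cost candidate paths, then hash flows across the remaining set. The following is a minimal Python sketch of that general idea, not the paper's algorithm. The `Path` fields, the additive cost formula, the `slack` threshold, and the use of SHA-256 are all assumptions made for illustration; the abstract does not specify how the score, the congestion signal, or the hash are defined.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str
    quality_cost: float  # hypothetical control-plane score; lower is better
    congestion: float    # hypothetical on-switch congestion signal in [0, 1]


def combined_cost(path, congestion_weight=1.0):
    # Assumed combination rule: static cost plus weighted congestion.
    return path.quality_cost + congestion_weight * path.congestion


def filter_candidates(paths, slack=0.2, congestion_weight=1.0):
    # Keep only paths whose combined cost is within `slack` of the best one.
    best = min(combined_cost(p, congestion_weight) for p in paths)
    kept = [p for p in paths
            if combined_cost(p, congestion_weight) <= best + slack]
    # Sort by name so every node indexes the reduced set identically.
    return sorted(kept, key=lambda p: p.name)


def pick_path(flow_id, paths, slack=0.2, congestion_weight=1.0):
    # Hash the flow identifier into the reduced set so that flows deciding
    # at the same time spread out instead of all choosing the single best path.
    candidates = filter_candidates(paths, slack, congestion_weight)
    digest = hashlib.sha256(flow_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(candidates)
    return candidates[index]


if __name__ == "__main__":
    paths = [
        Path("a", 1.0, 0.0),
        Path("b", 1.1, 0.0),
        Path("c", 5.0, 0.0),  # high-cost path, filtered out
    ]
    for i in range(4):
        flow = f"flow-{i}"
        print(flow, "->", pick_path(flow, paths).name)
```

Hashing only within the filtered set is what keeps the choice both cost-aware and diverse in this sketch: picking the single cheapest path would send every simultaneous flow to the same link, while hashing over all paths would ignore cost entirely.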
Problem

Research questions and friction points this paper is trying to address.

RDMA
inter-datacenter
multi-path routing
path asymmetry
congestion signals
Innovation

Methods, ideas, or system contributions that make the work stand out.

LCMP
RDMA
multi-path routing
inter-datacenter networks
congestion-aware