Reinforcement Learning with Graph Attention for Routing and Wavelength Assignment with Lightpath Reuse

📅 2025-02-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses routing and wavelength assignment with lightpath reuse (RWA-LR) in fixed-grid optical networks, where flex-rate transponders allow new traffic to be multiplexed onto existing lightpaths. RWA-LR is a long-horizon resource allocation task, a setting in which reinforcement learning (RL) policies are difficult to train effectively. Method: The authors use graph attention networks (GATs) for both the policy and value functions of an actor-critic RL agent, so that the agent can exploit the graph structure of the network. They also benchmark heuristic algorithms for RWA-LR and show that ordering candidate paths by number of hops, rather than by total length, increases heuristic throughput by 6%. Contribution/Results: With all code open-sourced for reproduction, the approach outperforms the previous state-of-the-art RL approach by 2.5% (17.4 Tbps mean additional throughput) and the best heuristic by 1.2% (8.5 Tbps mean additional throughput). The authors describe this gain as marginal and take it as evidence of how hard it is to learn effective RL policies for long-horizon resource allocation.

📝 Abstract

Many works have investigated reinforcement learning (RL) for routing and spectrum assignment on flex-grid networks, but only one work to date has examined RL for fixed-grid networks with flex-rate transponders, despite production systems using this paradigm. Flex-rate transponders allow existing lightpaths to accommodate new services, a task we term routing and wavelength assignment with lightpath reuse (RWA-LR). We re-examine this problem and present a thorough benchmarking of heuristic algorithms for RWA-LR, which achieve 6% higher throughput when candidate paths are ordered by number of hops rather than by total length. We train an RL agent for RWA-LR with graph attention networks for the policy and value functions to exploit the graph-structured data. We provide details of our methodology and open source all of our code for reproduction. We outperform the previous state-of-the-art RL approach by 2.5% (17.4 Tbps mean additional throughput) and the best heuristic by 1.2% (8.5 Tbps mean additional throughput). This marginal gain highlights the difficulty of learning effective RL policies on long-horizon resource allocation tasks.
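
The difference between ordering candidate paths by number of hops and by total length can be illustrated with a small sketch. This is not the authors' code: the toy topology, the link lengths, the function names (`build_adjacency`, `simple_paths`, `candidate_paths`) and the tie-breaking rule are all assumptions made for illustration, and the exhaustive path enumeration is only practical on very small graphs.

```python
from typing import Dict, List, Tuple

# Hypothetical toy topology: undirected links with lengths in km.
# The values are illustrative only and do not come from the paper.
LINKS: Dict[Tuple[str, str], float] = {
    ("A", "B"): 100.0,
    ("B", "C"): 100.0,
    ("C", "D"): 100.0,
    ("A", "E"): 400.0,
    ("E", "D"): 400.0,
}


def build_adjacency(links: Dict[Tuple[str, str], float]) -> Dict[str, Dict[str, float]]:
    """Turn an undirected link list into a nested adjacency dict."""
    adj: Dict[str, Dict[str, float]] = {}
    for (u, v), km in links.items():
        adj.setdefault(u, {})[v] = km
        adj.setdefault(v, {})[u] = km
    return adj


def simple_paths(adj: Dict[str, Dict[str, float]], src: str, dst: str) -> List[List[str]]:
    """Enumerate all loop-free paths from src to dst with an iterative DFS."""
    paths: List[List[str]] = []
    stack: List[List[str]] = [[src]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == dst:
            paths.append(path)
            continue
        for nxt in adj[node]:
            if nxt not in path:  # never revisit a node on the current path
                stack.append(path + [nxt])
    return paths


def path_length(adj: Dict[str, Dict[str, float]], path: List[str]) -> float:
    """Total physical length of a path in km."""
    return sum(adj[u][v] for u, v in zip(path, path[1:]))


def candidate_paths(adj, src: str, dst: str, k: int, order: str) -> List[List[str]]:
    """Return the first k candidate paths under the chosen ordering.

    order="hops"   sorts by hop count, breaking ties by length (assumed rule).
    order="length" sorts by length, breaking ties by hop count (assumed rule).
    """
    paths = simple_paths(adj, src, dst)
    if order == "hops":
        key = lambda p: (len(p) - 1, path_length(adj, p))
    elif order == "length":
        key = lambda p: (path_length(adj, p), len(p) - 1)
    else:
        raise ValueError(f"unknown order: {order!r}")
    return sorted(paths, key=key)[:k]


ADJ = build_adjacency(LINKS)
BY_HOPS = candidate_paths(ADJ, "A", "D", k=2, order="hops")
BY_LENGTH = candidate_paths(ADJ, "A", "D", k=2, order="length")

if __name__ == "__main__":
    print("ordered by hops:  ", BY_HOPS)
    print("ordered by length:", BY_LENGTH)
```

On this toy graph the two orderings disagree about the first candidate: hop ordering prefers the 2-hop, 800 km path A-E-D, while length ordering prefers the 3-hop, 300 km path A-B-C-D. A path with fewer hops occupies a wavelength on fewer links, which is one plausible reason hop ordering helps throughput, though the abstract itself reports only the 6% figure and not the mechanism.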

Problem

Research questions and friction points this paper is trying to address.

Routing and wavelength assignment with lightpath reuse

Reinforcement learning with graph attention networks

Benchmarking heuristic algorithms for increased throughput

Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement Learning with Graph Attention

Routing and Wavelength Assignment

Lightpath Reuse Optimization

🔎 Similar Papers

Graph Attention Reinforcement Learning for Multicast Routing and Age-Optimal Scheduling