Reinforcement Learning with Graph Attention for Routing and Wavelength Assignment with Lightpath Reuse


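Flex-rate transponders allow existing lightpaths to accommodate new services, a task termed routing and wavelength assignment with lightpath reuse (RWA-LR).
The proposed RL agent outperforms the previous state-of-the-art RL approach by 2.5% (17.4 Tbps mean additional throughput) and the best heuristic by 1.2% (8.5 Tbps mean additional throughput).
This marginal gain highlights the difficulty of learning effective RL policies on long-horizon resource allocation tasks.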


Many works have investigated reinforcement learning (RL) for routing and spectrum assignment on flex-grid networks, but only one work to date has examined RL for fixed-grid networks with flex-rate transponders, despite production systems using this paradigm. Flex-rate transponders allow existing lightpaths to accommodate new services, a task we term routing and wavelength assignment with lightpath reuse (RWA-LR). We re-examine this problem and present a thorough benchmarking of heuristic algorithms for RWA-LR, which achieve 6% higher throughput when candidate paths are ordered by number of hops rather than by total length. We train an RL agent for RWA-LR that uses graph attention networks for the policy and value functions to exploit the graph-structured data. We provide details of our methodology and open-source all of our code for reproduction. Our agent outperforms the previous state-of-the-art RL approach by 2.5% (17.4 Tbps mean additional throughput) and the best heuristic by 1.2% (8.5 Tbps mean additional throughput). This marginal gain highlights the difficulty of learning effective RL policies on long-horizon resource allocation tasks.
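
The difference between ordering candidate paths by number of hops and by total length can be illustrated with a minimal sketch. The toy topology, the link lengths, and the function names (`simple_paths`, `path_length`, `order_candidates`) are assumptions made for illustration only; they are not taken from the authors' code, and the sketch does not reproduce their heuristics or results.

```python
from typing import Dict, List

# node -> {neighbour: link length in km}; a hypothetical 4-node topology
Graph = Dict[str, Dict[str, float]]

TOY_GRAPH: Graph = {
    "A": {"B": 100.0, "C": 400.0},
    "B": {"A": 100.0, "C": 100.0, "D": 100.0},
    "C": {"A": 400.0, "B": 100.0, "D": 500.0},
    "D": {"B": 100.0, "C": 500.0},
}


def simple_paths(graph: Graph, src: str, dst: str) -> List[List[str]]:
    """Enumerate all loop-free paths from src to dst (fine for toy graphs only)."""
    paths: List[List[str]] = []
    stack: List[List[str]] = [[src]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == dst:
            paths.append(path)
            continue
        for nxt in graph[node]:
            if nxt not in path:  # never revisit a node
                stack.append(path + [nxt])
    return paths


def path_length(graph: Graph, path: List[str]) -> float:
    """Total length of a path, summed over its links."""
    return sum(graph[u][v] for u, v in zip(path, path[1:]))


def order_candidates(graph: Graph, src: str, dst: str, k: int, by: str = "hops") -> List[List[str]]:
    """Return the first k candidate paths, ordered by hop count or by length."""
    paths = simple_paths(graph, src, dst)
    if by == "hops":
        # Fewest links first; total length only breaks ties.
        key = lambda p: (len(p) - 1, path_length(graph, p))
    elif by == "length":
        # Shortest total length first; hop count only breaks ties.
        key = lambda p: (path_length(graph, p), len(p) - 1)
    else:
        raise ValueError(f"unknown ordering: {by!r}")
    return sorted(paths, key=key)[:k]


if __name__ == "__main__":
    for ordering in ("hops", "length"):
        print(ordering, order_candidates(TOY_GRAPH, "A", "C", k=3, by=ordering))
```

In this toy case the two orderings disagree on the first candidate: the direct link A-C uses one hop but is 400 km long, while A-B-C is only 200 km but occupies a wavelength on two links. Hop-based ordering prefers the path that consumes fewer link resources, which is one plausible reading of why it could help throughput; the abstract itself does not state the mechanism.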


Authors: Michael Doherty, Alejandra Beghelli
Published: 2025-02-20 17:10:11+00:00
