Stable-MoE: Lyapunov-based Token Routing for Distributed Mixture-of-Experts Training over Edge Networks

📅 2025-12-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In edge networks, distributed Mixture-of-Experts (MoE) training suffers from heterogeneous computing resources and stochastic token arrivals, which lead to workload backlog, inefficient resource use, and degraded performance. To address this, we propose Stable-MoE, an online token routing framework grounded in Lyapunov optimization. The method jointly optimizes token routing decisions and computation frequency allocation without requiring knowledge of future system states, while keeping the system stable over the long term. Two elements are central: (i) a gating consistency term that is maximized together with system throughput, and (ii) long-term stability constraints on both the token queues and the energy queues at the edge devices. Lyapunov optimization turns the resulting long-term stochastic problem into per-slot subproblems that can be solved online. Experiments on SVHN and CIFAR-100 show that Stable-MoE achieves at least 40% higher system throughput and at least 5% higher test accuracy than the baseline routing mechanisms.

📝 Abstract

The sparse activation mechanism of the mixture-of-experts (MoE) model empowers edge intelligence with enhanced training efficiency and reduced computational resource consumption. However, traditional token routing in distributed MoE training faces significant challenges in resource-constrained edge networks, which are characterized by heterogeneous computing capabilities and stochastic token arrivals; these inevitably lead to workload backlog, resource inefficiency, and performance degradation. To address this issue, we propose Stable-MoE, a novel Lyapunov-based token routing framework for distributed MoE training over resource-heterogeneous edge networks. Specifically, we formulate a stochastic optimization problem that maximizes both system throughput and gating consistency by optimizing the token routing strategy and the computational resource allocation, while ensuring long-term stability of both the token queues and the energy queues at the edge devices. Using Lyapunov optimization, we transform the intractable long-term optimization problem into tractable per-slot subproblems, enabling online decisions on token routing and computation frequency utilization without knowledge of future system states. Experimental results on the SVHN and CIFAR-100 datasets demonstrate that Stable-MoE outperforms the baselines with gains of at least 40% in system throughput and at least 5% in test accuracy.
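
The long-term-to-per-slot transformation mentioned in the abstract typically follows the standard drift-plus-penalty recipe from Lyapunov optimization. The sketch below is a generic illustration under assumed notation, not the paper's formulation: Q_k(t) is a token queue at device k, a_k(t) the tokens routed to it in slot t, b_k(t) the tokens it processes, U(t) a per-slot utility (for example throughput plus gating consistency), V ≥ 0 a trade-off weight, and B a constant that bounds the second moments of arrivals and service. The energy queues are omitted for brevity.

```latex
\begin{align}
Q_k(t+1) &= \max\{Q_k(t) - b_k(t),\, 0\} + a_k(t), \\
L(t) &= \tfrac{1}{2} \sum_{k} Q_k(t)^2, \\
\Delta(t) &= \mathbb{E}\left[ L(t+1) - L(t) \mid \mathbf{Q}(t) \right], \\
\Delta(t) - V\,\mathbb{E}\left[ U(t) \mid \mathbf{Q}(t) \right]
  &\le B + \sum_{k} Q_k(t)\,\mathbb{E}\left[ a_k(t) - b_k(t) \mid \mathbf{Q}(t) \right]
     - V\,\mathbb{E}\left[ U(t) \mid \mathbf{Q}(t) \right].
\end{align}
```

Minimizing the right-hand side in each slot requires only the current backlogs Q_k(t) and the current system state, which is why no knowledge of future states is needed. A larger V favors utility at the cost of longer queues.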

Problem

Research questions and friction points this paper is trying to address.

Optimizes token routing for distributed MoE training in edge networks

Addresses workload backlog and resource inefficiency from heterogeneous computing

Ensures long-term stability of token and energy queues at devices

Innovation

Methods, ideas, or system contributions that make the work stand out.

Lyapunov-based token routing for distributed MoE training

Online optimization of routing and resource allocation per time slot

Ensures queue stability and improves throughput and accuracy
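
To make the per-slot idea concrete, here is a minimal Python sketch of backlog-aware top-1 routing. It is an illustration under simplifying assumptions, not the Stable-MoE algorithm: the function names `route_tokens` and `update_queues`, the weight `v`, and the scoring rule `v * gate_score - backlog` are all hypothetical, and the sketch ignores the energy queues and the computation frequency allocation described in the paper.

```python
def route_tokens(gate_scores, backlog, v):
    """Assign each token to one expert (top-1) using a drift-plus-penalty style score.

    gate_scores: list of per-token lists, one gating score per expert.
    backlog:     current token-queue length of each expert (assumed known).
    v:           hypothetical trade-off weight; larger v trusts the gate more.
    """
    assignment = []
    for scores in gate_scores:
        # Reward the gate's preference, penalize experts with long queues.
        weights = [v * s - q for s, q in zip(scores, backlog)]
        assignment.append(max(range(len(weights)), key=weights.__getitem__))
    return assignment


def update_queues(backlog, assignment, service):
    """Queue recursion: Q(t+1) = max(Q(t) - service, 0) + arrivals."""
    arrivals = [0] * len(backlog)
    for expert in assignment:
        arrivals[expert] += 1
    return [max(q - b, 0.0) + a for q, b, a in zip(backlog, service, arrivals)]


# Toy slot: three tokens, two experts, expert 0 is already backlogged.
gate_scores = [[0.7, 0.3], [0.6, 0.4], [0.2, 0.8]]
backlog = [5.0, 0.0]
service = [2.0, 2.0]  # tokens each expert can process in this slot

assignment = route_tokens(gate_scores, backlog, v=10.0)
new_backlog = update_queues(backlog, assignment, service)
print(assignment, new_backlog)
```

With a small `v`, the backlog term dominates and tokens are steered away from the congested expert even when the gate prefers it; raising `v` lets the gating scores win. Each token is scored against the backlog at the start of the slot, so the sketch does not account for tokens already assigned within the same slot.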

🔎 Similar Papers

Expert-Token Resonance MoE: Bidirectional Routing with Efficiency Affinity-Driven Active Selection