🤖 AI Summary
This work addresses the challenges of deploying large-scale Mixture-of-Experts (MoE) language models in low Earth orbit (LEO) satellite networks, where limited onboard resources and high communication latency hinder performance. The authors propose Space-XNet, a framework that co-designs MoE model placement with the satellite network topology. Space-XNet uses ring-based subnet partitioning and a two-stage placement strategy: it first assigns MoE layers to subnets arranged along the orbital direction, then optimizes the expert-to-satellite mapping within each subnet using expert activation probabilities and expected routing latency. Large-scale simulations on a thousand-satellite constellation show that Space-XNet reduces end-to-end inference latency by a factor of at least three compared with random and ablation baselines.
📝 Abstract
Leveraging continuous solar energy harvesting at high efficiency, space data centers are envisioned as a promising platform for executing energy-intensive large language models (LLMs). Recognizing this advantage, space and AI conglomerates (e.g., SpaceX, Google) are actively investing in this vision. One key challenge, however, is deploying a large-scale LLM efficiently across a satellite network with limited onboard computing and communication resources. This gives rise to a placement problem that involves partitioning and mapping model components to satellites such that the fundamentally different model architecture and network topology can be reconciled to ensure low-latency token generation. To address this problem, we present the Space Network of Experts (Space-XNet) framework targeting the distributed execution of a popular mixture-of-experts (MoE) model in space. The proposed placement strategies are two-level: (1) layer placement, which assigns MoE layers to satellite subnets; and (2) intra-layer expert placement, which assigns individual experts to satellites associated with the same layer/subnet. For layer placement, we exploit the ring-like communication pattern of autoregressive inference to partition the satellite constellation along the orbital direction into subnets arranged on a ring, each hosting one MoE layer. Based on this architecture, we formulate and solve an optimization problem for intra-layer expert placement to map experts with heterogeneous activation probabilities onto satellites. The derived strategy reveals an intuitive principle: a frequently activated expert should be mapped to a satellite on a routing path with low expected latency. Experiments over a thousand-satellite constellation show that Space-XNet achieves at least a threefold latency reduction compared with conventional random and ablation-based placement strategies.
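The placement principle stated in the abstract (frequently activated experts go to satellites on low-latency routing paths) can be illustrated with a minimal Python sketch. This is not the authors' optimization formulation, which the abstract does not spell out. It assumes a simplified setting with one expert per satellite and a single known expected path latency per satellite, in which pairing the highest activation probabilities with the lowest latencies minimizes the probability-weighted latency (by the rearrangement inequality). The function names, probabilities, and latency values are hypothetical.

```python
from typing import Dict, Sequence


def place_experts(activation_probs: Sequence[float],
                  path_latencies: Sequence[float]) -> Dict[int, int]:
    """Return a mapping expert index -> satellite index (one expert per satellite).

    Simplifying assumption for illustration: each satellite has one fixed
    expected routing latency, independent of the other placements.
    """
    if len(activation_probs) > len(path_latencies):
        raise ValueError("need at least one satellite per expert")
    # Experts ordered from most to least frequently activated.
    experts = sorted(range(len(activation_probs)),
                     key=lambda e: activation_probs[e], reverse=True)
    # Satellites ordered from lowest to highest expected path latency.
    satellites = sorted(range(len(path_latencies)),
                        key=lambda s: path_latencies[s])
    # Pair the k-th most active expert with the k-th fastest satellite.
    return dict(zip(experts, satellites))


def expected_latency(placement: Dict[int, int],
                     activation_probs: Sequence[float],
                     path_latencies: Sequence[float]) -> float:
    """Probability-weighted latency of a placement."""
    return sum(activation_probs[e] * path_latencies[s]
               for e, s in placement.items())


# Hypothetical values for one subnet hosting a 4-expert MoE layer.
probs = [0.10, 0.50, 0.15, 0.25]        # expert activation probabilities
latencies_ms = [4.0, 12.0, 6.0, 9.0]    # expected path latency per satellite

placement = place_experts(probs, latencies_ms)
naive = {e: e for e in range(len(probs))}  # expert i on satellite i

sorted_cost = expected_latency(placement, probs, latencies_ms)
naive_cost = expected_latency(naive, probs, latencies_ms)
print(placement, sorted_cost, naive_cost)
```

In this toy instance the most active expert (index 1) lands on the fastest satellite (index 0), and the sorted pairing yields a lower weighted latency than the index-aligned placement. A realistic setting would also have to account for effects the sketch ignores, such as several experts per satellite or latencies that depend on the constellation's routing paths.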