Routing Sensitivity Without Controllability: A Diagnostic Study of Fairness in MoE Language Models

📅 2026-03-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work investigates how sensitive the routing mechanisms of Mixture-of-Experts (MoE) language models are to demographic content, and whether that sensitivity can be used to improve generation fairness. The authors propose Fairness-Aware Routing Equilibrium (FARE), a diagnostic framework that systematically assesses the feasibility and limits of mitigating stereotypes through routing interventions across multiple MoE architectures. Using routing-preference analysis, expert masking, log-likelihood-based interventions, and multidimensional generation evaluations on Mixtral, Qwen, DeepSeekMoE, and OLMoE, the study finds that routing sensitivity to demographic cues is widespread, but that bias and core knowledge are deeply entangled within expert groups, which hinders effective fairness control. For most models, routing-level preference shifts are either unachievable or statistically non-robust; where a shift is achieved, it comes with a substantial utility cost and does not transfer to decoded generation.
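One way to make "routing sensitivity to demographic cues" concrete is to compare how often each expert is selected for prompts mentioning two different groups. The sketch below is illustrative only and is not the FARE metric: the function names (`expert_usage`, `js_divergence`), the toy selection lists, and the choice of Jensen-Shannon divergence are all assumptions made for this example.

```python
import math
from collections import Counter

def expert_usage(selections, num_experts):
    """Turn a list of selected expert indices into a usage distribution."""
    if not selections:
        raise ValueError("selections must not be empty")
    counts = Counter(selections)
    total = len(selections)
    return [counts.get(e, 0) / total for e in range(num_experts)]

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 means identical expert usage."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical per-token expert choices for prompts about two groups.
group_a = [0, 0, 1, 2, 0, 1]
group_b = [3, 3, 2, 2, 3, 1]

p = expert_usage(group_a, num_experts=4)
q = expert_usage(group_b, num_experts=4)
sensitivity = js_divergence(p, q)
print(f"routing divergence between groups: {sensitivity:.3f} bits")
```

A value near 0 would suggest the router treats the two prompt sets alike, while a value near 1 bit would suggest almost disjoint expert usage. A large divergence only shows sensitivity; it says nothing about whether intervening on those experts changes the generated text.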

📝 Abstract

Mixture-of-Experts (MoE) language models are universally sensitive to demographic content at the routing level, yet exploiting this sensitivity for fairness control is structurally limited. We introduce Fairness-Aware Routing Equilibrium (FARE), a diagnostic framework designed to probe the limits of routing-level stereotype intervention across diverse MoE architectures. FARE reveals that routing-level preference shifts are either unachievable (Mixtral, Qwen1.5, Qwen3), statistically non-robust (DeepSeekMoE), or accompanied by substantial utility cost (OLMoE: -4.4%p on CrowS-Pairs at a cost of -6.3%p on TQA). Critically, even where log-likelihood preference shifts are robust, they do not transfer to decoded generation: expanded evaluations on both non-null models yield null results across all generation metrics. Group-level expert masking reveals why: bias and core knowledge are deeply entangled within expert groups. These findings indicate that routing sensitivity is necessary but insufficient for stereotype control, and identify specific architectural conditions that can inform the design of more controllable future MoE systems.
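To illustrate what expert masking means at the routing level, the following minimal sketch excludes a set of experts before top-k selection and renormalizes the weights over the experts that remain. It is an assumption-laden toy, not the paper's implementation: `top_k_route`, the logit values, and the choice to apply softmax over only the selected logits are hypothetical and differ across real MoE architectures.

```python
import math

def top_k_route(logits, k, masked=frozenset()):
    """Return {expert_index: weight} for one token under top-k routing.

    Experts listed in `masked` are removed before selection, so the router
    falls back to the next-best unmasked experts.
    """
    candidates = [(score, idx) for idx, score in enumerate(logits)
                  if idx not in masked]
    if len(candidates) < k:
        raise ValueError("not enough unmasked experts for top-k routing")
    chosen = sorted(candidates, reverse=True)[:k]
    # Softmax over the selected logits only, so the weights sum to 1.
    peak = max(score for score, _ in chosen)
    exps = [(math.exp(score - peak), idx) for score, idx in chosen]
    total = sum(e for e, _ in exps)
    return {idx: e / total for e, idx in exps}

# Hypothetical router logits for a single token over four experts.
logits = [2.0, 0.5, 1.0, -1.0]

baseline = top_k_route(logits, k=2)
with_mask = top_k_route(logits, k=2, masked={0})

print("baseline experts:", sorted(baseline))
print("experts after masking 0:", sorted(with_mask))
```

Masking the top-scoring expert reroutes the token to experts that the router originally ranked lower. If the masked expert also carries general knowledge, every token that relied on it is affected, which is consistent with the entanglement between bias and core knowledge described above.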

Problem

Research questions and friction points this paper is trying to address.

Mixture-of-Experts

fairness

routing sensitivity

stereotype control

language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-of-Experts

fairness control

routing sensitivity