🤖 AI Summary
Existing approaches to concept drift in non-stationary data streams struggle to balance efficiency and specialization due to reliance on coarse-grained adaptation or simplistic ensembles.
Method: This paper proposes an online Mixture-of-Experts (MoE) architecture featuring a symbiotic learning loop between a compact neural router and an incrementally updated pool of Hoeffding tree experts. A multi-hot correctness mask provides precise supervision to the router, enabling rapid expert specialization; a dynamic expert selection strategy further improves predictive accuracy.
Results: Evaluated on nine benchmark data streams—including abrupt and gradual drifts as well as real-world scenarios—the method matches state-of-the-art adaptive ensemble methods in predictive performance while achieving significantly higher resource efficiency and faster adaptation speed.
📝 Abstract
Learning from non-stationary data streams subject to concept drift requires models that can adapt on the fly while remaining resource-efficient. Existing adaptive ensemble methods often rely on coarse-grained adaptation mechanisms or simple voting schemes that fail to optimally leverage specialized knowledge. This paper introduces DriftMoE, an online Mixture-of-Experts (MoE) architecture that addresses these limitations through a novel co-training framework. DriftMoE features a compact neural router that is co-trained alongside a pool of incremental Hoeffding tree experts. The key innovation lies in a symbiotic learning loop that enables expert specialization: the router selects the most suitable expert for prediction, the relevant experts update incrementally with the true label, and the router refines its parameters using a multi-hot correctness mask that reinforces every accurate expert. This feedback loop provides the router with a clear training signal while accelerating expert specialization. We evaluate DriftMoE's performance across nine state-of-the-art data stream learning benchmarks spanning abrupt, gradual, and real-world drifts, testing two distinct configurations: one where experts specialize on data regimes (multi-class variant), and another where they focus on single-class specialization (task-based variant). Our results demonstrate that DriftMoE achieves results competitive with state-of-the-art adaptive stream learning ensembles, offering a principled and efficient approach to concept drift adaptation. All code, data pipelines, and reproducibility scripts are available in our public GitHub repository: https://github.com/miguel-ceadar/drift-moe.
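The symbiotic loop described above (router selects an expert, experts learn from the true label, router learns from a multi-hot correctness mask) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the linear softmax router, the per-expert binary cross-entropy update, the stand-in majority-class experts (replacing real Hoeffding trees), and all names and hyperparameters here are assumptions for illustration.

```python
import numpy as np

class MajorityClassExpert:
    """Stand-in for a Hoeffding tree: a trivial incremental majority-class learner."""
    def __init__(self, n_classes):
        self.counts = np.zeros(n_classes)

    def predict(self, x):
        return int(np.argmax(self.counts))

    def learn_one(self, x, y):
        self.counts[y] += 1

class DriftMoESketch:
    """Illustrative sketch of the symbiotic loop: a linear router co-trained
    with a pool of incremental experts via a multi-hot correctness mask."""
    def __init__(self, n_features, n_experts, n_classes, lr=0.1):
        self.W = np.zeros((n_experts, n_features))  # router weights
        self.b = np.zeros(n_experts)                # router bias
        self.experts = [MajorityClassExpert(n_classes) for _ in range(n_experts)]
        self.lr = lr

    def _gate_logits(self, x):
        return self.W @ x + self.b

    def predict(self, x):
        # Route the instance to the expert the router scores highest.
        k = int(np.argmax(self._gate_logits(x)))
        return self.experts[k].predict(x)

    def learn_one(self, x, y):
        # 1) Multi-hot correctness mask: 1 for every expert that predicts y.
        mask = np.array([1.0 if e.predict(x) == y else 0.0
                         for e in self.experts])
        # 2) Router update: per-expert sigmoid + binary cross-entropy,
        #    reinforcing every accurate expert (gradient is p - mask).
        p = 1.0 / (1.0 + np.exp(-self._gate_logits(x)))
        grad = p - mask
        self.W -= self.lr * np.outer(grad, x)
        self.b -= self.lr * grad
        # 3) Expert update: here only the top-routed expert trains on the
        #    true label; which experts update is a design choice in the paper.
        k = int(np.argmax(self._gate_logits(x)))
        self.experts[k].learn_one(x, y)
```

In an actual streaming setting the stand-in experts would be replaced by incremental Hoeffding trees (e.g. from an online learning library) and the router by a small neural network, but the supervision signal, a multi-hot mask marking every expert that was correct on the current instance, remains the core idea.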