DriftMoE: A Mixture of Experts Approach to Handle Concept Drifts

📅 2025-07-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing approaches to concept drift in non-stationary data streams struggle to balance efficiency and specialization due to reliance on coarse-grained adaptation or simplistic ensembles. Method: This paper proposes an online Mixture-of-Experts (MoE) architecture featuring a symbiotic learning loop between a compact neural router and an incrementally updated pool of Hoeffding Tree experts. A multi-hot correctness mask provides precise supervision to the router, enabling rapid expert specialization; a dynamic expert selection strategy further enhances predictive accuracy. Results: Evaluated on nine benchmark data streams—including abrupt and gradual drifts as well as real-world scenarios—the method matches state-of-the-art adaptive ensemble methods in predictive performance while achieving significantly higher resource efficiency and faster adaptation.
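The symbiotic loop described above (router selects an expert, experts learn from the true label, router is reinforced by a multi-hot correctness mask) can be sketched as follows. This is a minimal illustration with toy stand-ins, not the paper's implementation: `ToyExpert` replaces a Hoeffding tree, `ToyRouter` replaces the compact neural router, and the choice to update only the router-selected expert is an assumption.

```python
import numpy as np

class ToyExpert:
    """Stand-in for an incremental Hoeffding tree: tracks per-class counts."""
    def __init__(self, n_classes):
        self.counts = np.ones(n_classes)  # Laplace-smoothed class counts
    def predict(self, x):
        return int(np.argmax(self.counts))
    def learn_one(self, x, y):
        self.counts[y] += 1

class ToyRouter:
    """Stand-in for the neural router: one independent sigmoid score per
    expert, trained with a binary cross-entropy gradient step."""
    def __init__(self, n_features, n_experts, lr=0.1):
        self.W = np.zeros((n_experts, n_features))
        self.lr = lr
    def scores(self, x):
        return 1.0 / (1.0 + np.exp(-self.W @ x))
    def update(self, x, mask):
        # Multi-hot correctness mask as the target: every accurate expert
        # is pushed toward score 1, every inaccurate one toward 0.
        self.W += self.lr * np.outer(mask - self.scores(x), x)

def drift_moe_step(router, experts, x, y):
    """One round of the symbiotic learning loop (simplified)."""
    chosen = int(np.argmax(router.scores(x)))     # router selects an expert
    y_hat = experts[chosen].predict(x)            # selected expert predicts
    mask = np.array([float(e.predict(x) == y) for e in experts])
    experts[chosen].learn_one(x, y)               # expert updates with true label
    router.update(x, mask)                        # router refined by the mask
    return y_hat

# Demo on a synthetic binary stream (assumed data, not from the paper)
rng = np.random.default_rng(0)
router = ToyRouter(n_features=3, n_experts=4)
experts = [ToyExpert(n_classes=2) for _ in range(4)]
preds = [drift_moe_step(router, experts, rng.normal(size=3),
                        int(rng.integers(0, 2))) for _ in range(20)]
```

The key point the sketch preserves is the feedback direction: experts learn from the stream's true labels, while the router learns only from which experts were correct.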

📝 Abstract
Learning from non-stationary data streams subject to concept drift requires models that can adapt on-the-fly while remaining resource-efficient. Existing adaptive ensemble methods often rely on coarse-grained adaptation mechanisms or simple voting schemes that fail to optimally leverage specialized knowledge. This paper introduces DriftMoE, an online Mixture-of-Experts (MoE) architecture that addresses these limitations through a novel co-training framework. DriftMoE features a compact neural router that is co-trained alongside a pool of incremental Hoeffding tree experts. The key innovation lies in a symbiotic learning loop that enables expert specialization: the router selects the most suitable expert for prediction, the relevant experts update incrementally with the true label, and the router refines its parameters using a multi-hot correctness mask that reinforces every accurate expert. This feedback loop provides the router with a clear training signal while accelerating expert specialization. We evaluate DriftMoE's performance across nine state-of-the-art data stream learning benchmarks spanning abrupt, gradual, and real-world drifts, testing two distinct configurations: one where experts specialize on data regimes (multi-class variant), and another where they focus on single-class specialization (task-based variant). Our results demonstrate that DriftMoE achieves competitive results with state-of-the-art stream learning adaptive ensembles, offering a principled and efficient approach to concept drift adaptation. All code, data pipelines, and reproducibility scripts are available in our public GitHub repository: https://github.com/miguel-ceadar/drift-moe.
Problem

Research questions and friction points this paper is trying to address.

Adapting models to non-stationary data streams efficiently
Overcoming coarse-grained adaptation in ensemble methods
Enabling expert specialization for concept drift handling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Online Mixture-of-Experts with neural router
Co-training framework for expert specialization
Multi-hot correctness mask for router refinement
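The multi-hot correctness mask listed above is the part that distinguishes this router supervision from a standard one-hot gating target: every expert that predicted correctly receives credit, not just a single "winner". A minimal illustration with hypothetical expert predictions:

```python
import numpy as np

# Hypothetical predictions from a pool of four experts for one sample
expert_preds = np.array([2, 0, 2, 1])
true_label = 2

# Multi-hot correctness mask: 1 for every expert that got it right.
# A one-hot target would instead credit only one expert, discarding
# the signal that expert 0 and expert 2 were both accurate.
mask = (expert_preds == true_label).astype(float)
```

Here `mask` evaluates to `[1., 0., 1., 0.]`, so both correct experts are reinforced when the router trains against it.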
Miguel Aspis
Ireland’s National Centre for Artificial Intelligence (CeADAR), University College Dublin, Belfield, Dublin, D04 V2N9
Sebastián A. Cajas Ordónez
Ireland’s National Centre for Artificial Intelligence (CeADAR), University College Dublin, Belfield, Dublin, D04 V2N9
Andrés L. Suárez-Cetrulo
CeADAR: Ireland's Centre for Artificial Intelligence – University College Dublin
Keywords: machine learning · data streams · concept drift · stock trend prediction · big data analytics
Ricardo Simón Carbajo
Ireland’s National Centre for Artificial Intelligence (CeADAR), University College Dublin, Belfield, Dublin, D04 V2N9