Domain-Specialized Object Detection via Model-Level Mixtures of Experts

📅 2026-04-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited cross-domain generalization of existing object detectors and the lack of interpretability and principled fusion in conventional ensembles. It proposes a model-level Mixture-of-Experts (MoE) architecture for object detection built on YOLO: multiple expert detectors are trained on semantically disjoint data subsets, and a learned gating network dynamically weights their contributions. Balancing losses are studied as a way to prevent expert collapse while training the gate. Experiments on the BDD100K dataset show that the proposed MoE consistently outperforms standard ensemble approaches and gives insight into how each expert specializes across domains.
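As a rough illustration of what gate-weighted fusion of detector outputs can look like, the sketch below scales each expert's detection confidences by a softmax gate weight and then applies greedy class-wise non-maximum suppression to the pooled boxes. This is only one plausible fusion strategy under stated assumptions, not the authors' implementation: the function names, the `(box, score, class_id)` tuple layout, the choice of NMS, and the IoU threshold are all hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of gate logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def fuse_detections(expert_outputs, gate_logits, iou_thr=0.5):
    """Hypothetical fusion: expert_outputs holds one list per expert of
    (box, score, class_id) tuples; gate_logits holds one logit per expert."""
    weights = softmax(gate_logits)
    # Scale each detection's confidence by its expert's gate weight.
    pooled = [
        (box, score * w, cls)
        for dets, w in zip(expert_outputs, weights)
        for box, score, cls in dets
    ]
    # Greedy class-wise NMS over the pooled, re-weighted detections.
    pooled.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, score, cls in pooled:
        if all(k[2] != cls or iou(box, k[0]) < iou_thr for k in kept):
            kept.append((box, score, cls))
    return kept

# Two experts see the same object; the gate favours the first expert.
expert_a = [((10, 10, 50, 50), 0.9, 0)]
expert_b = [((12, 11, 52, 49), 0.8, 0), ((100, 100, 140, 150), 0.7, 1)]
fused = fuse_detections([expert_a, expert_b], gate_logits=[2.0, 0.0])
```

Here the overlapping class-0 box from the down-weighted expert is suppressed, while its class-1 box survives with a reduced score. In this sketch the gate weights also show which expert was trusted for a given image, which is the kind of interpretability the summary refers to.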

📝 Abstract

Mixture-of-Experts (MoE) models provide a structured approach to combining specialized neural networks and offer greater interpretability than conventional ensembles. While MoEs have been successfully applied to image classification and semantic segmentation, their use in object detection remains limited due to challenges in merging dense and structured predictions. In this work, we investigate model-level mixtures of object detectors and analyze their suitability for improving performance and interpretability in object detection. We propose an MoE architecture that combines YOLO-based detectors trained on semantically disjoint data subsets, with a learned gating network that dynamically weights expert contributions. We study different strategies for fusing detection outputs and for training the gating mechanism, including balancing losses to prevent expert collapse. Experiments on the BDD100K dataset demonstrate that the proposed MoE consistently outperforms standard ensemble approaches and provides insights into expert specialization across domains, highlighting model-level MoEs as a viable alternative to traditional ensembling for object detection. Our code is available at https://github.com/KASTEL-MobilityLab/mixtures-of-experts/.
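The abstract does not specify the form of the balancing losses. As a minimal sketch of the general idea, the snippet below computes a squared coefficient-of-variation penalty on per-expert importance, a regularizer commonly used in the MoE literature to discourage the gate from routing everything to a single expert. The function name and the choice of this particular penalty are assumptions for illustration, not the paper's loss.

```python
def importance_balance_loss(gate_weights, eps=1e-10):
    """Hypothetical balancing term. gate_weights is a list of per-sample
    gate distributions, each a list with one weight per expert."""
    num_experts = len(gate_weights[0])
    # Importance of an expert = total gate weight it receives over the batch.
    importance = [sum(row[e] for row in gate_weights) for e in range(num_experts)]
    mean = sum(importance) / num_experts
    variance = sum((v - mean) ** 2 for v in importance) / num_experts
    # Squared coefficient of variation: zero when all experts are used equally.
    return variance / (mean ** 2 + eps)

balanced = [[0.5, 0.5], [0.5, 0.5]]    # both experts share the load
collapsed = [[1.0, 0.0], [1.0, 0.0]]   # the gate always picks expert 0

loss_balanced = importance_balance_loss(balanced)
loss_collapsed = importance_balance_loss(collapsed)
```

The penalty is zero for the balanced batch and close to one for the collapsed batch. In training, a term like this would typically be added to the detection loss with a small weight so that it nudges the gate toward using all experts without overriding the main objective.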

Problem

Research questions and friction points this paper is trying to address.

Object Detection

Mixture of Experts

Model-Level Fusion

Domain Specialization

Expert Collapse

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-of-Experts

Object Detection

Model-Level Fusion