Generalizable and Efficient Automated Scoring with a Knowledge-Distilled Multi-Task Mixture-of-Experts

📅 2025-11-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
In real-world educational settings, automated scoring systems typically require task-specific model training, incurring high computational, storage, and maintenance overhead. To address this, we propose a knowledge distillation–based Mixture-of-Experts (MoE) framework for multi-task scoring. It employs a shared encoder, a learnable gating mechanism, and lightweight task-specific heads to jointly capture both task-invariant and task-specific representations. Expert modules encapsulate reusable scoring competencies, substantially improving cross-task generalization and enabling zero-shot or few-shot adaptation to new tasks. Evaluated on nine scientific reasoning tasks, our unified model matches the performance of dedicated single-task baselines while reducing model size by 6× compared to an ensemble of independent models and achieving 87× compression relative to a 20B-parameter teacher model. This yields significant gains in efficiency and deployment feasibility.
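The gated mixture described above (shared encoder output, learnable gating over expert modules, lightweight task head) can be sketched in plain Python. This is an illustrative sketch only: the dense softmax gating, linear experts, and tiny dimensions are assumptions for clarity, not the paper's actual implementation.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def linear(x, weights, bias):
    """y = Wx + b with W as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def moe_forward(h, expert_params, gate_params, head_params):
    """One gated MoE block followed by a task-specific head.

    h            : shared-encoder representation (list of floats)
    expert_params: list of (W, b) pairs, one per expert
    gate_params  : (W, b) producing one logit per expert
    head_params  : (W, b) for the lightweight scoring head
    """
    # Gating network: soft weights over experts, computed from the
    # shared representation (dense gating; top-k routing is a common variant).
    g = softmax(linear(h, *gate_params))
    # Each expert transforms the shared representation independently.
    expert_outs = [linear(h, W, b) for W, b in expert_params]
    # Gate-weighted mixture of expert outputs.
    mixed = [sum(gi * out[d] for gi, out in zip(g, expert_outs))
             for d in range(len(expert_outs[0]))]
    # Task head maps the mixture to score logits for this task.
    return linear(mixed, *head_params)
```

With a uniform gate and identity experts/head, the block reduces to a pass-through, which makes the routing behavior easy to verify before plugging in learned parameters.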

📝 Abstract
Automated scoring of written constructed responses typically relies on separate models per task, straining computational resources, storage, and maintenance in real-world education settings. We propose UniMoE-Guided, a knowledge-distilled multi-task Mixture-of-Experts (MoE) approach that transfers expertise from multiple task-specific large models (teachers) into a single compact, deployable model (student). The student combines (i) a shared encoder for cross-task representations, (ii) a gated MoE block that balances shared and task-specific processing, and (iii) lightweight task heads. Trained with both ground-truth labels and teacher guidance, the student matches strong task-specific models while being far more efficient to train, store, and deploy. Beyond efficiency, the MoE layer improves transfer and generalization: experts develop reusable skills that boost cross-task performance and enable rapid adaptation to new tasks with minimal additions and tuning. On nine NGSS-aligned science-reasoning tasks (seven for training/evaluation and two held out for adaptation), UniMoE-Guided attains performance comparable to per-task models while using ~6× less storage than maintaining separate students, and 87× less than the 20B-parameter teacher. The method offers a practical path toward scalable, reliable, and resource-efficient automated scoring for classroom and large-scale assessment systems.
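Training "with both ground-truth labels and teacher guidance" is standard knowledge distillation: the student's loss combines hard-label cross-entropy with a temperature-scaled KL term against the teacher's soft predictions. A minimal sketch follows; the specific weighting `alpha`, temperature value, and T² rescaling are conventional choices (Hinton-style distillation), not details taken from this paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Combine hard-label cross-entropy with soft-label KL to the teacher.

    alpha balances the two terms; the T^2 factor keeps the soft-target
    gradient magnitude comparable across temperatures.
    """
    # Hard-label term: cross-entropy against the ground-truth score class.
    p_student = softmax(student_logits)
    ce = -math.log(p_student[true_label])

    # Soft-label term: KL(teacher || student) at temperature T.
    p_teacher_T = softmax(teacher_logits, temperature)
    p_student_T = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher_T, p_student_T))

    return alpha * ce + (1 - alpha) * (temperature ** 2) * kl
```

When the student already matches the teacher, the KL term vanishes and only the hard-label cross-entropy remains, which is a useful sanity check for the implementation.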
Problem

Research questions and friction points this paper is trying to address.

Automated scoring requires separate models per task, straining computational resources
Individual scoring models demand excessive storage and maintenance in education settings
Existing approaches lack generalization and efficient adaptation to new assessment tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge-distilled multi-task Mixture-of-Experts model
Shared encoder with gated MoE block
Lightweight task heads for efficient deployment