EDM-ARS: A Domain-Specific Multi-Agent System for Automated Educational Data Mining Research

2026-03-18
AI Summary
This work addresses the lack of domain knowledge integration and end-to-end cohesion in current automated educational data mining pipelines. To bridge this gap, we propose a multi-agent system tailored for educational research, in which five specialized large language model (LLM) agents collaboratively orchestrate the entire scientific workflow, from problem formulation and data analysis to manuscript generation. The framework embeds educational domain knowledge through a state-machine coordinator, a three-tier educational data registry, and a structured agent communication protocol. For reliability, it supports iterative revision cycles, checkpoint-based recovery, and sandboxed code execution. The system autonomously produces LaTeX-formatted academic papers with verifiable machine learning analyses and authentic citations, and is released as open source to support the educational research community.

๐Ÿ“ Abstract
In this technical report, we present the Educational Data Mining Automated Research System (EDM-ARS), a domain-specific multi-agent pipeline that automates end-to-end educational data mining (EDM) research. We conceptualize EDM-ARS as a general framework for domain-aware automated research pipelines, in which educational expertise is embedded into each stage of the research lifecycle. As a first instantiation of this framework, we focus on predictive modeling tasks. Within this scope, EDM-ARS orchestrates five specialized LLM-powered agents (ProblemFormulator, DataEngineer, Analyst, Critic, and Writer) through a state-machine coordinator that supports revision loops, checkpoint-based recovery, and sandboxed code execution. Given a research prompt and a dataset, EDM-ARS produces a complete LaTeX manuscript with real Semantic Scholar citations, validated machine learning analyses, and automated methodological peer review. We also provide a detailed description of the system architecture, the three-tier data registry design that encodes educational domain expertise, the specification of each agent, the inter-agent communication protocol, and the mechanisms for error handling and self-correction. Finally, we discuss current limitations, including single-dataset scope and formulaic paper output, and outline a phased roadmap toward causal inference, transfer learning, psychometrics, and multi-dataset generalization. EDM-ARS is released as an open-source project to support the educational research community.
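The state-machine coordination described in the abstract can be illustrated with a small sketch. This is not the EDM-ARS implementation: the agent names come from the abstract, but the transition order, the rule that a Critic verdict of "revise" sends the run back to the Analyst, the `max_revisions` cap, the JSON checkpoint format, and every identifier below (`Stage`, `RunState`, `run_pipeline`, `make_stub_agents`) are assumptions made for illustration. The agents are stubs that return strings rather than calling an LLM, and sandboxed code execution is left out.

```python
import json
from dataclasses import dataclass, field, asdict
from enum import Enum
from typing import Callable, Dict, List, Optional


class Stage(str, Enum):
    # Agent names are taken from the abstract; DONE is an added terminal state.
    FORMULATE = "ProblemFormulator"
    ENGINEER = "DataEngineer"
    ANALYZE = "Analyst"
    CRITIQUE = "Critic"
    WRITE = "Writer"
    DONE = "Done"


@dataclass
class RunState:
    stage: Stage = Stage.FORMULATE
    revisions: int = 0
    artifacts: Dict[str, str] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        data = asdict(self)
        data["stage"] = self.stage.value
        return json.dumps(data)

    @classmethod
    def from_json(cls, text: str) -> "RunState":
        data = json.loads(text)
        data["stage"] = Stage(data["stage"])
        return cls(**data)


# Assumed linear transitions; only the Critic stage branches.
NEXT = {
    Stage.FORMULATE: Stage.ENGINEER,
    Stage.ENGINEER: Stage.ANALYZE,
    Stage.ANALYZE: Stage.CRITIQUE,
    Stage.WRITE: Stage.DONE,
}

Agent = Callable[[RunState], str]


def run_pipeline(
    agents: Dict[Stage, Agent],
    state: Optional[RunState] = None,
    max_revisions: int = 2,
    checkpoint: Optional[Callable[[str], None]] = None,
) -> RunState:
    """Run agents until DONE, checkpointing the state after every step."""
    state = state if state is not None else RunState()
    while state.stage is not Stage.DONE:
        current = state.stage
        output = agents[current](state)
        state.artifacts[current.value] = output
        state.history.append(current.value)
        if current is Stage.CRITIQUE:
            # Assumed revision loop: "revise" returns to the Analyst,
            # up to max_revisions times; anything else moves on to the Writer.
            if output == "revise" and state.revisions < max_revisions:
                state.revisions += 1
                state.stage = Stage.ANALYZE
            else:
                state.stage = Stage.WRITE
        else:
            state.stage = NEXT[current]
        if checkpoint is not None:
            checkpoint(state.to_json())
    return state


def make_stub_agents() -> Dict[Stage, Agent]:
    """Stub agents: the Critic asks for one revision, then accepts."""
    verdicts = iter(["revise", "accept"])
    agents: Dict[Stage, Agent] = {
        s: (lambda st, s=s: f"{s.value} output") for s in NEXT
    }
    agents[Stage.CRITIQUE] = lambda st: next(verdicts)
    return agents


if __name__ == "__main__":
    saved: List[str] = []
    final = run_pipeline(make_stub_agents(), checkpoint=saved.append)
    print(final.history)
    # Recovery: restore the checkpoint written after DataEngineer and continue.
    resumed = run_pipeline(make_stub_agents(), state=RunState.from_json(saved[1]))
    print(resumed.history == final.history)
```

Serializing the whole run state after each transition is what makes recovery simple in this sketch: a restarted run only needs the last checkpoint string to pick up at the pending stage.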
Problem

Research questions and friction points this paper is trying to address.

Educational Data Mining
Automated Research
Multi-Agent System
Domain-Specific Automation
Predictive Modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Agent System
Educational Data Mining
Automated Research
LLM-Powered Agents
Domain-Specific Framework