🤖 AI Summary
Scarce data on interactive legal scenarios is a critical bottleneck for developing the legal intelligence of large language models. To address it, this paper proposes MASER, a multi-agent simulation framework that generates legally consistent synthetic data involving multiple collaborating roles. MASER incorporates a supervision module that calibrates agent behavior against real case sources, so that the synthetic data is high-fidelity, verifiable, and scalable to produce. We also design MILE, a multi-stage dynamic evaluation benchmark, to systematically assess model capabilities in legal question answering, role reasoning, and procedural reasoning within interactive settings. Our key innovations include the first agent-based role modeling infused with domain-specific legal knowledge and a case-constrained behavioral alignment mechanism. Experimental results show that models trained on MASER-generated data achieve a 37.2% improvement in interaction plausibility and attain 91.4% logical consistency, significantly advancing interactive legal AI performance.
📝 Abstract
Large Language Models (LLMs) have significantly advanced legal intelligence, but the scarcity of scenario data impedes progress toward interactive legal scenarios. This paper introduces a Multi-agent Legal Simulation Driver (MASER) that scalably generates synthetic data by simulating interactive legal scenarios. Drawing on real legal case sources, MASER ensures that legal attributes are consistent across participants and introduces a supervisory mechanism to align participants' characters and behaviors and to address distractions. A Multi-stage Interactive Legal Evaluation (MILE) benchmark is further constructed to evaluate LLMs' performance in dynamic legal scenarios. Extensive experiments confirm the effectiveness of our framework.