AppellateGen: A Benchmark for Appellate Legal Judgment Generation

📅 2026-01-04

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Existing research on legal judgment generation has predominantly focused on first-instance trials, overlooking the dialectical reasoning required in appellate (second-instance) proceedings, where the initial judgment must be evaluated alongside evidentiary updates. This work addresses the gap by introducing AppellateGen, a benchmark for appellate judgment generation comprising 7,351 case pairs. It further proposes a judicial Standard Operating Procedure (SOP)-based Legal Multi-Agent System (SLMAS) that simulates the judicial workflow by decomposing judgment generation into discrete stages: issue identification, retrieval, and drafting. Experimental results indicate that SLMAS improves the logical consistency of generated judgments, though appellate reasoning remains a substantial challenge for current LLMs. The dataset and code are publicly released.

📝 Abstract

Legal judgment generation is a critical task in legal intelligence. However, existing research has predominantly focused on first-instance trials, relying on static fact-to-verdict mappings while neglecting the dialectical nature of appellate (second-instance) review. To address this, we introduce AppellateGen, a benchmark for second-instance legal judgment generation comprising 7,351 case pairs. The task requires models to draft legally binding judgments by reasoning over the initial verdict and evidentiary updates, thereby modeling the causal dependency between trial stages. We further propose a judicial Standard Operating Procedure (SOP)-based Legal Multi-Agent System (SLMAS) to simulate judicial workflows, which decomposes the generation process into discrete stages of issue identification, retrieval, and drafting. Experimental results indicate that while SLMAS improves logical consistency, the complexity of appellate reasoning remains a substantial challenge for current LLMs. The dataset and code are publicly available at: https://anonymous.4open.science/r/AppellateGen-5763.
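
The staged decomposition described in the abstract (issue identification, then retrieval, then drafting) can be pictured as a simple sequential pipeline in which each stage consumes the previous stage's output. The sketch below is illustrative only and is not the authors' SLMAS implementation: every name in it (`AppellateCase`, `identify_issues`, `retrieve`, `draft`, `run_pipeline`) is hypothetical, and each stage is a deterministic placeholder standing in for what would presumably be an LLM agent.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AppellateCase:
    """Hypothetical input record: the two inputs named in the abstract."""
    initial_verdict: str            # first-instance outcome under review
    evidentiary_updates: List[str]  # material newly raised on appeal


def identify_issues(case: AppellateCase) -> List[str]:
    # Stage 1 (placeholder): treat each non-empty evidentiary update as one issue.
    return [u.strip() for u in case.evidentiary_updates if u.strip()]


def retrieve(issues: List[str], corpus: List[str]) -> Dict[str, List[str]]:
    # Stage 2 (placeholder): keep corpus passages that share a word with the issue.
    hits: Dict[str, List[str]] = {}
    for issue in issues:
        words = set(issue.lower().split())
        hits[issue] = [p for p in corpus if words & set(p.lower().split())]
    return hits


def draft(case: AppellateCase, hits: Dict[str, List[str]]) -> str:
    # Stage 3 (placeholder): assemble a text that refers back to the first verdict.
    lines = [f"First-instance verdict under review: {case.initial_verdict}"]
    for i, (issue, passages) in enumerate(hits.items(), start=1):
        lines.append(f"Issue {i}: {issue} (supporting passages: {len(passages)})")
    return "\n".join(lines)


def run_pipeline(case: AppellateCase, corpus: List[str]) -> str:
    # Stages run in a fixed order; each one depends on the output of the last.
    issues = identify_issues(case)
    hits = retrieve(issues, corpus)
    return draft(case, hits)


if __name__ == "__main__":
    demo_case = AppellateCase("claim dismissed", ["new contract invoice"])
    demo_corpus = ["Rules on contract formation", "Rules on tort liability"]
    print(run_pipeline(demo_case, demo_corpus))
```

Fixing the stage order in `run_pipeline` is the point of the sketch: drafting cannot start before issues are identified and material is retrieved for them, which mirrors the procedural ordering the abstract attributes to a judicial SOP.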

Problem

Research questions and friction points this paper is trying to address.

legal judgment generation

appellate review

second-instance trials

dialectical reasoning

judicial workflow

Innovation

Methods, ideas, or system contributions that make the work stand out.

Appellate Judgment Generation

Legal Multi-Agent System

Judicial SOP