🤖 AI Summary
Spanish legal texts—particularly BOE (Boletín Oficial del Estado) decrees and notices—lack accessible, concise summaries, exacerbating information overload for non-expert readers.
Method: We introduce BOE-XSUM, the first extreme summarization dataset for Spanish official legal documents, comprising 3,648 human-written plain-language summaries, each paired with its source text and document type label. We fine-tune medium-sized models, including BERTIN GPT-J 6B, in a supervised setting and compare them against general-purpose generative models used zero-shot. Summary quality is measured by accuracy.
Contribution/Results: BOE-XSUM fills a critical gap in Spanish legal extreme summarization. Fine-tuned models substantially outperform zero-shot generation: the best model reaches 41.6% accuracy versus 33.5% for the strongest zero-shot baseline (DeepSeek-R1), a 24% relative improvement, demonstrating that domain-specific data coupled with lightweight fine-tuning significantly enhances the generation of comprehensible, legally faithful summaries.
📝 Abstract
The ability to summarize long documents succinctly is increasingly important in daily life due to information overload, yet there is a notable lack of such summaries for Spanish documents in general, and in the legal domain in particular. In this work, we present BOE-XSUM, a curated dataset comprising 3,648 concise, plain-language summaries of documents sourced from Spain's "Boletín Oficial del Estado" (BOE), the State Official Gazette. Each entry in the dataset includes a short summary, the original text, and its document type label. We evaluate the performance of medium-sized large language models (LLMs) fine-tuned on BOE-XSUM, comparing them to general-purpose generative models in a zero-shot setting. Results show that fine-tuned models significantly outperform their non-specialized counterparts. Notably, the best-performing model, BERTIN GPT-J 6B (32-bit precision), achieves a 24% performance gain over the top zero-shot model, DeepSeek-R1 (accuracies of 41.6% vs. 33.5%).