🤖 AI Summary
Legal practitioners currently lack effective tools for evaluating and improving the factual accuracy of AI-generated deposition summaries. To bridge this gap, the study extends the fact-based "nugget" evaluation framework, previously used only for automatic assessment, to user-facing support, presenting the first interactive prototype system designed for this purpose. The system supports two core tasks: comparing the factual fidelity of alternative summaries, and helping users manually refine automatically generated content. Grounded in the practical demands of legal workflows, the research aims to improve human-AI collaboration and demonstrates the feasibility and real-world utility of the nugget-based approach in professional legal settings.
📝 Abstract
While large language models (LLMs) are increasingly used to summarize long documents, this trend poses significant challenges in the legal domain, where the factual accuracy of deposition summaries is crucial. Nugget-based methods have proven effective for the automated evaluation of summarization systems, yet their potential to directly support end users remains underexplored. In this work, we translate these methods to the user side. Focusing on the legal domain, we present a prototype that leverages a factual nugget-based approach to support legal professionals in two concrete scenarios: (1) determining which of two summaries is better, and (2) manually improving an automatically generated summary.
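
To make the two scenarios concrete, here is a minimal sketch of how nugget-based comparison and revision support might work. All names here are illustrative assumptions, not the paper's implementation, and the word-overlap coverage check is a toy stand-in: a real system would use an NLI model or an LLM judge to decide whether a summary entails each nugget.

```python
# Sketch only: assumes nuggets (atomic factual claims extracted from the
# source deposition) are already available as plain strings.

STOPWORDS = {"the", "a", "an", "of", "to", "in", "on", "at", "is", "was", "were", "that"}

def content_words(text: str) -> set[str]:
    """Lowercase the text and keep only non-stopword tokens."""
    return {w.strip(".,;:").lower() for w in text.split()} - STOPWORDS - {""}

def nugget_covered(nugget: str, summary: str) -> bool:
    """Toy coverage check: a nugget counts as covered if most of its
    content words appear in the summary (stand-in for entailment)."""
    words = content_words(nugget)
    return bool(words) and len(words & content_words(summary)) / len(words) >= 0.6

def nugget_recall(nuggets: list[str], summary: str) -> float:
    """Scenario 1: score a summary by the fraction of source nuggets it covers,
    so two candidate summaries can be compared on factual fidelity."""
    return sum(nugget_covered(n, summary) for n in nuggets) / len(nuggets) if nuggets else 0.0

def missing_nuggets(nuggets: list[str], summary: str) -> list[str]:
    """Scenario 2: surface uncovered facts so a user can revise the summary."""
    return [n for n in nuggets if not nugget_covered(n, summary)]

if __name__ == "__main__":
    nuggets = [
        "The witness signed the contract on 2021-03-14.",
        "The witness never met the plaintiff in person.",
    ]
    summary_a = "The witness stated she signed the contract on 2021-03-14."
    summary_b = "The witness discussed the contract and the plaintiff."
    print(nugget_recall(nuggets, summary_a), nugget_recall(nuggets, summary_b))
    print(missing_nuggets(nuggets, summary_b))
```

In this framing, scenario (1) reduces to comparing `nugget_recall` scores across candidate summaries, while scenario (2) presents the `missing_nuggets` list to the user as a revision checklist, mirroring how an interactive prototype might surface nugget-level evidence.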