🤖 AI Summary
Problem: Existing legal large language models (LLMs) are weak at mathematical reasoning in legal contexts, and real-world tasks such as compensation calculation lack task-specific modeling and evaluation.
Method: We introduce LexNum, the first Chinese benchmark for legal mathematical reasoning, covering economic compensation, work injury compensation, and traffic accident compensation. We propose LexPam, a reinforcement learning algorithm guided by legal procedural awareness that integrates legal process modeling, multi-stage reasoning supervision, and domain-adaptive fine-tuning, optimized end-to-end via the PPO framework.
Contribution/Results: Experiments demonstrate that LexPam significantly improves mathematical reasoning accuracy across all three compensation tasks, consistently outperforming state-of-the-art legal LLMs and general-purpose reasoning models. This work fills critical gaps in both legal-domain mathematical reasoning benchmarks and methodology.
📝 Abstract
The legal mathematical reasoning ability of LLMs is crucial when applying them to real-world scenarios, as it directly affects their credibility. While existing legal LLMs can perform general judicial question answering, they have not been trained specifically for legal mathematical reasoning. Open-domain reasoning models, though able to generate detailed calculation steps, do not follow the reasoning logic required in legal scenarios. In addition, there is a lack of legal mathematical reasoning datasets for validating and improving LLMs' reasoning abilities in legal contexts. To address these issues, we propose LexNum, the first Chinese legal mathematical reasoning dataset, which covers three common scenarios: economic compensation, work injury compensation, and traffic accident compensation. Using LexNum, we evaluate existing legal LLMs and reasoning LLMs, and we introduce LexPam, a reinforcement learning algorithm guided by legal procedural awareness that trains LLMs to reason mathematically in legal scenarios. Experiments on the three scenarios show that existing legal LLMs and reasoning models perform unsatisfactorily on legal mathematical reasoning tasks, and that LexPam improves LLM performance on these tasks.
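
To make "legal mathematical reasoning" concrete, the sketch below works through a toy economic compensation calculation. It is not taken from LexNum or LexPam. The rule it encodes is a simplified assumption, loosely modeled on severance rules in Chinese labor law: one month's wage per full year of service, a remainder of six months or more counts as a full year, and a shorter remainder counts as half a month. Statutory caps and other conditions are deliberately omitted, and the function names are hypothetical.

```python
from fractions import Fraction


def compensation_units(months_of_service: int) -> Fraction:
    """Convert months of service into wage-month units.

    Assumed simplified rule (illustrative only):
    - each full year of service counts as 1 unit
    - a remainder of 6 to 11 months counts as 1 unit
    - a remainder of 1 to 5 months counts as 1/2 unit
    """
    if months_of_service < 0:
        raise ValueError("months_of_service must be non-negative")
    years, remainder = divmod(months_of_service, 12)
    if remainder == 0:
        extra = Fraction(0)
    elif remainder < 6:
        extra = Fraction(1, 2)
    else:
        extra = Fraction(1)
    return years + extra


def economic_compensation(monthly_wage: int, months_of_service: int) -> Fraction:
    """Compensation = monthly wage x wage-month units (caps omitted)."""
    return monthly_wage * compensation_units(months_of_service)


if __name__ == "__main__":
    # 3 years and 4 months of service at a monthly wage of 8000:
    # 3 units for the full years plus 1/2 unit for the 4-month remainder.
    print(economic_compensation(8000, 40))
```

The point of the example is that the arithmetic is easy, while the legally prescribed procedure (how to round the service period, in what order to apply the rules) determines the answer. A model that multiplies 8000 by 40/12 produces a well-formed calculation with the wrong result.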