Domaino1s: Guiding LLM Reasoning for Explainable Answers in High-Stakes Domains

📅 2025-01-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
In high-stakes domains such as financial investment and legal Q&A, large language models (LLMs) suffer from opaque reasoning processes and lack of self-verification mechanisms, undermining interpretability and trustworthiness. To address this, we propose Selective Tree Exploration—a dynamic tree-search inference method—and introduce PROOF-Score, a novel interpretable metric for evaluating reasoning quality. We further construct domain-specific reasoning fine-tuning datasets: CoT-stock-2k and CoT-legal-2k. Our approach integrates supervised fine-tuning (SFT), domain-adapted chain-of-thought (CoT) activation, and tree-based reasoning. Experiments demonstrate significant improvements in both reasoning accuracy and explanation quality on stock recommendation and legal question-answering tasks, outperforming mainstream baselines on both metrics. The implementation is publicly available.

📝 Abstract
Large Language Models (LLMs) are widely applied to downstream domains. However, current LLMs for high-stakes domain tasks, such as financial investment and legal QA, typically generate brief answers without reasoning processes and explanations. This limits users' confidence in making decisions based on their responses. While original CoT shows promise, it lacks self-correction mechanisms during reasoning. This work introduces Domaino1s, which enhances LLMs' reasoning capabilities on domain tasks through supervised fine-tuning and tree search. We construct CoT-stock-2k and CoT-legal-2k datasets for fine-tuning models that activate domain-specific reasoning steps based on their judgment. Additionally, we propose Selective Tree Exploration to spontaneously explore solution spaces and sample optimal reasoning paths to improve performance. We also introduce PROOF-Score, a new metric for evaluating domain models' explainability, complementing traditional accuracy metrics with richer assessment dimensions. Extensive experiments on stock investment recommendation and legal reasoning QA tasks demonstrate Domaino1s's leading performance and explainability. Our code is available at https://anonymous.4open.science/r/Domaino1s-006F/.
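The abstract describes Selective Tree Exploration only at a high level. A minimal sketch of the general idea — branch the reasoning tree only when the model's confidence in the next step is low, and follow the single best continuation otherwise — might look like the following. All names (`selective_tree_explore`, `score_step`, `propose_steps`), the confidence threshold, and the one-step lookahead are illustrative assumptions, not the paper's exact algorithm.

```python
def selective_tree_explore(score_step, propose_steps, max_depth=4,
                           branch_k=3, conf_threshold=0.8):
    """Sketch of a selective tree search over reasoning steps: commit
    greedily while confidence is high, and only branch (with a one-step
    lookahead) when the best candidate scores below the threshold."""
    path, total = [], 0.0
    for _ in range(max_depth):
        candidates = propose_steps(path, n=branch_k)
        if not candidates:
            break  # no further reasoning steps proposed
        scored = sorted(((score_step(path, c), c) for c in candidates),
                        reverse=True)
        score, step = scored[0]
        if score < conf_threshold:
            # Low confidence: re-rank candidates by their own score plus
            # the best score reachable one step deeper (lookahead).
            def lookahead(s):
                kids = propose_steps(path + [s], n=branch_k)
                return max((score_step(path + [s], k) for k in kids),
                           default=0.0)
            score, step = max(scored,
                              key=lambda sc: sc[0] + lookahead(sc[1]))
        path.append(step)
        total += score
    return path, total
```

The appeal of this shape is cost control: expensive branching happens only at uncertain steps, so easy questions stay near greedy-decoding cost while hard ones get extra exploration.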
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Domain-Specific Reasoning
Explainability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Domaino1s
Selective Tree Exploration
PROOF-Score
Xu Chu
School of Software and Microelectronics, Peking University, Beijing, China
Zhijie Tan
School of Software and Microelectronics, Peking University, Beijing, China
Hanlin Xue
School of Software and Microelectronics, Peking University, Beijing, China
Guanyu Wang
School of Software and Microelectronics, Peking University, Beijing, China
Tong Mo
AI Research Engineer at Huawei Canada
Reinforcement Learning · Keyword Spotting
Weiping Li
School of Software and Microelectronics, Peking University, Beijing, China