Trading-R1: Financial Trading with LLM Reasoning via Reinforcement Learning

📅 2025-09-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
The core challenge in financial AI lies in developing structured reasoning capabilities that are domain-specialized, interpretable, and risk-aware. To address this, we propose Trading-R1, a framework featuring: (1) a three-stage easy-to-hard curriculum that systematically integrates market principles and strategic thinking into model training; (2) a hybrid design that combines large language models' semantic understanding with reinforcement learning's decision optimization, trained via supervised fine-tuning and RL on our proprietary multi-source heterogeneous corpus, Tauric-TR1-DB, to map natural-language inputs to robust trading actions; and (3) fine-grained volatility-aware decision-making and evidence-grounded investment rationale generation. Evaluated across six major equities and ETFs, Trading-R1 outperforms state-of-the-art open- and closed-source instruction-tuned models, achieving a 23.6% improvement in Sharpe ratio and a 31.4% reduction in maximum drawdown, while producing auditable, disciplined, and interpretable trading strategies.
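
The two evaluation metrics named above can be made concrete with a short sketch. This is an illustration of the standard textbook definitions, not the paper's evaluation code: the function names, the per-period risk-free rate, the sample standard deviation, and the 252-trading-day annualization factor are all assumptions made here.

```python
import math


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from a list of per-period simple returns.

    Assumptions (not from the paper): risk_free is a per-period rate,
    volatility is the sample standard deviation, and there are 252
    trading periods per year.
    """
    excess = [r - risk_free for r in returns]
    n = len(excess)
    if n < 2:
        raise ValueError("need at least two returns")
    mean = sum(excess) / n
    variance = sum((x - mean) ** 2 for x in excess) / (n - 1)
    std = math.sqrt(variance)
    if std == 0:
        return float("nan")  # undefined when returns never vary
    return mean / std * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded equity curve, as a fraction."""
    equity = 1.0
    peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


if __name__ == "__main__":
    daily = [0.01, -0.02, 0.015, 0.003, -0.007]  # made-up returns for illustration
    print("Sharpe:", sharpe_ratio(daily))
    print("Max drawdown:", max_drawdown(daily))
```

A higher Sharpe ratio means more return per unit of volatility, and a smaller maximum drawdown means a shallower worst-case loss from a prior peak, which is why the summary reports an increase in the former and a reduction in the latter.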

📝 Abstract
Developing professional, structured reasoning on par with human financial analysts and traders remains a central challenge in AI for finance, where markets demand interpretability and trust. Traditional time-series models lack explainability, while LLMs face challenges in turning natural-language analysis into disciplined, executable trades. Although reasoning LLMs have advanced in step-by-step planning and verification, their application to risk-sensitive financial decisions is underexplored. We present Trading-R1, a financially-aware model that incorporates strategic thinking and planning for comprehensive thesis composition, facts-grounded analysis, and volatility-adjusted decision making. Trading-R1 aligns reasoning with trading principles through supervised fine-tuning and reinforcement learning with a three-stage easy-to-hard curriculum. Training uses Tauric-TR1-DB, a 100k-sample corpus spanning 18 months, 14 equities, and five heterogeneous financial data sources. Evaluated on six major equities and ETFs, Trading-R1 demonstrates improved risk-adjusted returns and lower drawdowns compared to both open-source and proprietary instruction-following models as well as reasoning models. The system generates structured, evidence-based investment theses that support disciplined and interpretable trading decisions. Trading-R1 Terminal will be released at https://github.com/TauricResearch/Trading-R1.
Problem

Research questions and friction points this paper is trying to address.

Develops AI for disciplined financial trading decisions
Enhances interpretability in risk-sensitive market analysis
Bridges natural language reasoning to executable trades
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning for financial trading decisions
Three-stage curriculum training for reasoning alignment
Structured evidence-based investment thesis generation
Yijia Xiao
University of California, Los Angeles
AI for Finance, Agents, AI for Science, Multimodal LLM
Edward Sun
University of California, Los Angeles
AI for Science, Agents, Robotics
Tong Chen
University of Washington
Fang Wu
Stanford University
Di Luo
University of California, Los Angeles
Wei Wang
University of California, Los Angeles