Can LLM-based Financial Investing Strategies Outperform the Market in Long Run?

📅 2025-05-11

🤖 AI Summary

Existing LLM-driven financial timing strategies are mostly evaluated with short-horizon, narrow-sample backtests, which leaves them vulnerable to survivorship bias and data snooping and overstates their out-of-sample efficacy. To address this, the authors propose FINSABER, a framework that systematically evaluates the long-term robustness and generalizability of LLM-based trading strategies over a 20-year timespan and across a cross-section of more than 100 equities. FINSABER integrates financial text parsing, dynamic signal generation, a multi-horizon, multi-market-regime backtesting engine, and a survivorship-bias correction mechanism. The empirical analysis reveals a systematic behavioral bias in LLM strategies: they are too conservative in bull markets, where they underperform passive benchmarks, and too aggressive in bear markets, where they incur heavy losses. Their apparent short-term edge decays substantially once the evaluation is extended both temporally and cross-sectionally. These findings argue for LLM strategies that prioritize trend detection and market-regime-aware risk controls over added framework complexity, and they offer both a methodological foundation and an empirical caution for deploying LLMs in investment practice.

📝 Abstract

Large Language Models (LLMs) have recently been leveraged for asset pricing tasks and stock trading applications, enabling AI agents to generate investment decisions from unstructured financial data. However, most evaluations of LLM timing-based investing strategies are conducted on narrow timeframes and limited stock universes, overstating effectiveness due to survivorship and data-snooping biases. We critically assess their generalizability and robustness by proposing FINSABER, a backtesting framework evaluating timing-based strategies across longer periods and a larger universe of symbols. Systematic backtests over two decades and 100+ symbols reveal that previously reported LLM advantages deteriorate significantly under a broader cross-section and a longer evaluation period. Our market regime analysis further demonstrates that LLM strategies are overly conservative in bull markets, underperforming passive benchmarks, and overly aggressive in bear markets, incurring heavy losses. These findings highlight the need to develop LLM strategies that prioritise trend detection and regime-aware risk controls over mere scaling of framework complexity.
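To make the evaluation idea in the abstract concrete, here is a minimal sketch of a rolling-window backtest that compares a timing strategy against buy-and-hold on each window. It is not FINSABER's implementation: the function names, the window scheme, and the `momentum_signal` rule (a hypothetical stand-in for an LLM-generated signal) are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence


def simple_returns(prices: Sequence[float]) -> List[float]:
    """Period-over-period simple returns; one fewer element than prices."""
    return [prices[i] / prices[i - 1] - 1.0 for i in range(1, len(prices))]


def cumulative_return(returns: Sequence[float], positions: Sequence[float]) -> float:
    """Compound the returns, scaled by the position held in each period."""
    growth = 1.0
    for r, p in zip(returns, positions):
        growth *= 1.0 + p * r
    return growth - 1.0


def momentum_signal(prices: Sequence[float], lookback: int = 3) -> List[float]:
    """Hypothetical stand-in for an LLM signal (an assumption, not the paper's).

    The position for the move from price i to price i+1 uses only prices up
    to index i, so the rule has no look-ahead.
    """
    positions = []
    for i in range(len(prices) - 1):
        invested = i >= lookback and prices[i] > prices[i - lookback]
        positions.append(1.0 if invested else 0.0)
    return positions


def rolling_windows(n: int, window: int, step: int) -> List[tuple]:
    """Half-open (start, end) index pairs covering n observations."""
    return [(s, s + window) for s in range(0, n - window + 1, step)]


def evaluate(
    prices: Sequence[float],
    window: int,
    step: int,
    signal: Callable[[Sequence[float]], List[float]] = momentum_signal,
) -> List[Dict[str, float]]:
    """Compare the timing strategy with buy-and-hold on every window."""
    results = []
    for start, end in rolling_windows(len(prices), window, step):
        segment = prices[start:end]
        rets = simple_returns(segment)
        strat = cumulative_return(rets, signal(segment))
        hold = cumulative_return(rets, [1.0] * len(rets))
        results.append({
            "start": start,
            "end": end,
            "strategy": strat,
            "buy_and_hold": hold,
            "excess": strat - hold,
        })
    return results
```

Reporting the excess return per window, rather than one headline figure for a hand-picked period, is what exposes a strategy whose edge exists only in a narrow sample. A fuller study would also run this across many symbols, including delisted ones, to limit survivorship bias.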

Problem

Research questions and friction points this paper is trying to address.

Assessing long-term market outperformance of LLM-based investing strategies

Evaluating generalizability of LLM strategies across diverse market conditions

Addressing biases in narrow timeframe and limited stock evaluations

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based investment decisions from unstructured data

FINSABER framework for long-term backtesting

Trend detection and regime-aware risk controls
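As a rough illustration of what a market regime analysis involves, the sketch below labels each evaluation period by the benchmark's return and averages the strategy's excess return within each regime. The ±10% thresholds, the three regime names, and the function names are assumptions chosen for this example; the paper's actual regime definition may differ.

```python
from typing import Dict, List, Sequence, Tuple


def label_regime(benchmark_return: float, bull: float = 0.10, bear: float = -0.10) -> str:
    """Classify a period by benchmark return (the thresholds are assumptions)."""
    if benchmark_return >= bull:
        return "bull"
    if benchmark_return <= bear:
        return "bear"
    return "sideways"


def excess_by_regime(periods: Sequence[Tuple[float, float]]) -> Dict[str, float]:
    """Average strategy-minus-benchmark return within each regime.

    Each element of periods is (benchmark_return, strategy_return).
    """
    buckets: Dict[str, List[float]] = {}
    for bench, strat in periods:
        buckets.setdefault(label_regime(bench), []).append(strat - bench)
    return {regime: sum(vals) / len(vals) for regime, vals in buckets.items()}
```

Called with a list of per-period `(benchmark_return, strategy_return)` pairs, `excess_by_regime` returns one average per regime. A negative average in both the bull and the bear bucket would correspond to the pattern the abstract describes.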

🔎 Similar Papers

Revolutionizing Finance with LLMs: An Overview of Applications and Insights