Time Travel is Cheating: Going Live with DeepFund for Real-Time Fund Investment Benchmarking

📅 2025-05-16
📈 Citations: 0 (Influential: 0)
🤖 AI Summary
Problem: Large language models (LLMs) have not been rigorously validated for fund investment under real-world conditions, and mainstream historical backtesting benchmarks suffer from future information leakage, which inflates performance estimates. Method: We propose DeepFund, the first real-time, leakage-free multi-agent evaluation benchmark designed specifically for fund investment. It enforces strict temporal isolation by using only market data published after each model's pretraining cutoff, which closes the "time-travel" loophole. The framework comprises LLM-driven modules for ticker analysis, decision-making, portfolio management, and risk control. Contribution/Results: Empirical evaluation of nine state-of-the-art models shows that even cutting-edge models such as DeepSeek-V3 and Claude-3.7-Sonnet incur net losses in live trading. The results indicate that backtest-based estimates are overly optimistic and that current LLMs remain limited for active fund management.

📝 Abstract
Large Language Models (LLMs) have demonstrated notable capabilities across financial tasks, including financial report summarization, earnings call transcript analysis, and asset classification. However, their real-world effectiveness in managing complex fund investment remains inadequately assessed. A fundamental limitation of existing benchmarks for evaluating LLM-driven trading strategies is their reliance on historical back-testing, which inadvertently enables LLMs to "time travel": they can leverage future information embedded in their training corpora, resulting in possible information leakage and overly optimistic performance estimates. To address this issue, we introduce DeepFund, a live fund benchmark tool designed to rigorously evaluate LLMs in real-time market conditions. Utilizing a multi-agent architecture, DeepFund connects directly with real-time stock market data (specifically, data published after each model's pretraining cutoff) to ensure fair and leakage-free evaluations. Empirical tests on nine flagship LLMs from leading global institutions across multiple investment dimensions (ticker-level analysis, investment decision-making, portfolio management, and risk control) reveal significant practical challenges. Notably, even cutting-edge models such as DeepSeek-V3 and Claude-3.7-Sonnet incur net trading losses within DeepFund's real-time evaluation environment, underscoring the present limitations of LLMs for active fund management. Our code is available at https://github.com/HKUSTDial/DeepFund.
Problem

Research questions and friction points this paper is trying to address.

Assessing LLMs' real-world effectiveness in complex fund investment
Eliminating historical back-testing biases in LLM-driven trading strategies
Evaluating LLMs' performance in real-time market conditions without data leakage
Innovation

Methods, ideas, or system contributions that make the work stand out.

DeepFund enables real-time fund benchmarking
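Multi-agent architecture covering ticker analysis, decision-making, portfolio management, and risk control
Direct connection to live market data published after each model's pretraining cutoff, which prevents data leakage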
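
The temporal-isolation idea can be illustrated with a minimal Python sketch. It is not DeepFund's actual code: the `Bar` record, the `leakage_free` helper, the ticker, the prices, and the cutoff date are all hypothetical and chosen only to show the filtering rule, which is that a model may only be evaluated on observations dated strictly after its pretraining cutoff.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Bar:
    """One daily observation for a ticker (hypothetical schema)."""
    ticker: str
    day: date
    close: float


def leakage_free(bars: list[Bar], pretraining_cutoff: date) -> list[Bar]:
    """Keep only bars dated strictly after the model's pretraining cutoff."""
    return [b for b in bars if b.day > pretraining_cutoff]


# Illustrative values only; not real market data or a real model cutoff.
bars = [
    Bar("XYZ", date(2024, 12, 30), 100.0),
    Bar("XYZ", date(2025, 1, 2), 101.5),
    Bar("XYZ", date(2025, 1, 3), 99.8),
]
cutoff = date(2024, 12, 31)
usable = leakage_free(bars, cutoff)
print([b.day.isoformat() for b in usable])  # ['2025-01-02', '2025-01-03']
```

A historical backtest that included the 2024-12-30 bar would let the model "time travel", since that date could already appear in its training corpus. Filtering per model, rather than with one global date, matters because each model has its own cutoff.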
Changlun Li
PhD Student, HKUST(GZ)
Data Cleaning, Good AI for Data, FinTech
Yao Shi
The Hong Kong University of Science and Technology (Guangzhou)
Chen Wang
The Hong Kong University of Science and Technology (Guangzhou)
Qiqi Duan
The Hong Kong University of Science and Technology (Guangzhou)
Runke Ruan
The Hong Kong University of Science and Technology (Guangzhou)
Weijie Huang
The Hong Kong University of Science and Technology (Guangzhou)
Haonan Long
The Hong Kong University of Science and Technology (Guangzhou)
Lijun Huang
The Hong Kong University of Science and Technology (Guangzhou)
Yuyu Luo
Assistant Professor, HKUST(GZ) / HKUST
Data Agents, LLM Agents, Database, Text-to-SQL, Data-centric AI
Nan Tang
National Institute of Biological Sciences, Beijing
stem cell biology, aging, lung diseases