🤖 AI Summary
Non-stationarity in financial markets and the challenge of fusing heterogeneous multimodal information severely limit the adaptability and decision accuracy of existing quantitative models. To address this, we propose MM-DREX, a Multimodal-driven, Dynamically-Routed EXpert framework built on large language models. MM-DREX employs a vision-language model (VLM) for market state perception and dynamic expert routing, decoupling state recognition from strategy execution. It integrates four heterogeneous trading experts (trend, reversal, breakout, and positioning), each generating fine-grained sub-strategies whose weights are allocated dynamically in real time. Routing classification and risk-adjusted policy optimization are jointly trained via a hybrid supervised fine-tuning and reinforcement learning (SFT-RL) paradigm. Evaluated across stocks, futures, and cryptocurrencies, MM-DREX achieves significant improvements in total return, Sharpe ratio, and maximum drawdown over 15 baselines, including state-of-the-art financial LLMs and deep reinforcement learning models, demonstrating superior robustness and cross-asset generalization. An interpretability module additionally traces routing logic and expert behavior for strategy transparency.
📝 Abstract
The inherent non-stationarity of financial markets and the complexity of multi-modal information pose significant challenges to existing quantitative trading models. Traditional methods relying on fixed structures and unimodal data struggle to adapt to market regime shifts, while large language model (LLM)-driven solutions, despite their multi-modal comprehension, suffer from static strategies and homogeneous expert designs, lacking dynamic adjustment and fine-grained decision mechanisms. To address these limitations, we propose MM-DREX: a Multimodal-driven, Dynamically-Routed EXpert framework based on large language models. MM-DREX explicitly decouples market state perception from strategy execution to enable adaptive sequential decision-making in non-stationary environments. Specifically, it (1) introduces a vision-language model (VLM)-powered dynamic router that jointly analyzes candlestick chart patterns and long-term temporal features to allocate real-time expert weights; (2) designs four heterogeneous trading experts (trend, reversal, breakout, positioning) that generate specialized fine-grained sub-strategies; and (3) proposes an SFT-RL hybrid training paradigm to synergistically optimize the router's market classification capability and the experts' risk-adjusted decision-making. Extensive experiments on multi-modal datasets spanning stocks, futures, and cryptocurrencies demonstrate that MM-DREX significantly outperforms 15 baselines (including state-of-the-art financial LLMs and deep reinforcement learning models) across key metrics: total return, Sharpe ratio, and maximum drawdown, validating its robustness and generalization. Additionally, an interpretability module traces routing logic and expert behavior in real time, providing an audit trail for strategy transparency.
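The routing mechanism described above can be sketched at a high level: a router maps perceived market state to softmax weights over the four experts, and the final position blends each expert's sub-strategy signal by its weight. This is a minimal illustrative sketch, not the paper's implementation; the VLM router is replaced by a stand-in function, and all names (`route`, `expert_signals`, `blended_position`) are hypothetical.

```python
import numpy as np

# Hypothetical sketch of dynamic expert routing: the real system uses a
# VLM to produce routing scores from candlestick charts and temporal
# features; here we take raw logits as a stand-in for that perception step.

EXPERTS = ["trend", "reversal", "breakout", "positioning"]

def softmax(logits):
    z = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    return z / z.sum()

def route(state_logits):
    """Stand-in for the VLM router: map market-state logits to expert weights."""
    return softmax(np.asarray(state_logits, dtype=float))

def expert_signals(prices):
    """Toy sub-strategies, each returning a position signal in [-1, 1]."""
    momentum = np.sign(prices[-1] - prices[0])
    return {
        "trend": momentum,                               # follow the move
        "reversal": -momentum,                           # bet on mean reversion
        "breakout": 1.0 if prices[-1] >= max(prices) else 0.0,
        "positioning": 0.0,                              # hold / stay flat
    }

def blended_position(state_logits, prices):
    """Weight-blend the experts' signals into one final position."""
    w = route(state_logits)
    sig = expert_signals(prices)
    return float(sum(wi * sig[name] for wi, name in zip(w, EXPERTS)))

prices = [100.0, 101.5, 103.2, 104.0]          # rising market
pos = blended_position([2.0, 0.1, 0.5, 0.2], prices)  # router favors trend expert
```

In an uptrend where the router assigns most weight to the trend expert, the blended position is net long; the same machinery yields a different blend when the perceived regime shifts, which is the adaptivity the framework targets.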