Cognitive Alpha Mining via LLM-Driven Code-Based Evolution

📅 2025-11-24

🤖 AI Summary

Problem: The high dimensionality and low signal-to-noise ratio of high-frequency financial data impede effective alpha signal discovery. Existing deep learning, genetic programming, and large language model (LLM) approaches suffer from narrow search space coverage, opaque representations, weak economic interpretability, and poor generalization. Method: We propose CogAlpha, a cognitive alpha mining framework that models LLMs as evolvable cognitive agents. It integrates code-based representation, multi-stage finance-oriented prompting, and evolutionary search (mutation/crossover) to enable structured, human-like exploration of alpha expressions at the code level. Iterative optimization is driven by a financial evaluation feedback loop that balances logical consistency and creativity. Contribution/Results: Empirical evaluation on A-share markets shows that CogAlpha improves the predictive power, robustness, and cross-period generalizability of the discovered alphas. The work is presented as the first successful synergy of LLM reasoning and evolutionary computation for quantitative factor discovery.

📝 Abstract

Discovering effective predictive signals, or "alphas," from financial data with high dimensionality and extremely low signal-to-noise ratio remains a difficult open problem. Despite progress in deep learning, genetic programming, and, more recently, large language model (LLM)-based factor generation, existing approaches still explore only a narrow region of the vast alpha search space. Neural models tend to produce opaque and fragile patterns, while symbolic or formula-based methods often yield redundant or economically ungrounded expressions that generalize poorly. Although different in form, these paradigms share a key limitation: none can conduct broad, structured, and human-like exploration that balances logical consistency with creative leaps. To address this gap, we introduce the Cognitive Alpha Mining Framework (CogAlpha), which combines code-level alpha representation with LLM-driven reasoning and evolutionary search. Treating LLMs as adaptive cognitive agents, our framework iteratively refines, mutates, and recombines alpha candidates through multi-stage prompts and financial feedback. This synergistic design enables deeper thinking, richer structural diversity, and economically interpretable alpha discovery, while greatly expanding the effective search space. Experiments on A-share equities demonstrate that CogAlpha consistently discovers alphas with superior predictive accuracy, robustness, and generalization over existing methods. Our results highlight the promise of aligning evolutionary optimization with LLM-based reasoning for automated and explainable alpha discovery. All source code will be released.
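
To make "code-level alpha representation" and "financial feedback" concrete, here is a minimal sketch of one alpha written as a plain Python function and scored by its rank information coefficient (the Spearman correlation between factor values and next-period returns). Everything in it is an assumption for illustration: the abstract does not specify the paper's evaluation metrics, the function `alpha_reversal` is a hypothetical factor rather than one from the paper, and the four bars and returns are made-up toy numbers.

```python
from statistics import mean


def rank(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def rank_ic(factor, fwd_ret):
    """Spearman rank correlation between factor values and forward returns."""
    return pearson(rank(factor), rank(fwd_ret))


def alpha_reversal(bar):
    """Hypothetical alpha: negative intraday move, scaled by the bar's range."""
    rng = bar["high"] - bar["low"]
    return -(bar["close"] - bar["open"]) / rng if rng > 0 else 0.0


# Toy cross-section of four stocks (fabricated values, illustration only).
bars = [
    {"open": 10.0, "high": 10.6, "low": 9.9, "close": 10.5},
    {"open": 20.0, "high": 20.4, "low": 19.2, "close": 19.4},
    {"open": 5.0, "high": 5.2, "low": 4.9, "close": 5.0},
    {"open": 8.0, "high": 8.3, "low": 7.9, "close": 8.2},
]
fwd_ret = [-0.010, 0.020, -0.004, 0.002]

factor = [alpha_reversal(b) for b in bars]
ic = rank_ic(factor, fwd_ret)
print(f"rank IC = {ic:.2f}")
```

Because the alpha is ordinary code, a reader can inspect its economic logic directly (here, a short-term reversal idea), and a score like this one could be fed back to whatever process proposes the next candidate.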

Problem

Research questions and friction points this paper is trying to address.

Discovering predictive signals from noisy financial data

Expanding narrow search space in alpha discovery methods

Generating interpretable alphas through code-based evolution

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-driven reasoning with code-based evolution

Multi-stage prompts and financial feedback refinement

Synergistic design for interpretable alpha discovery
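
The combination of code-based candidates, mutation/crossover, and evaluation feedback can be sketched as a small evolutionary loop. This is an illustration under stated assumptions, not CogAlpha's implementation: `stub_llm_mutate` and `stub_llm_crossover` are hypothetical stand-ins that apply fixed rewrite templates where a real system would prompt an LLM, fitness is simplified to a Pearson correlation with next-period returns, and the data are fabricated toy values.

```python
import math
import random

# Fabricated toy cross-section: (bar, next-period return). Illustration only.
DATA = [
    ({"open": 10.0, "high": 10.6, "low": 9.9, "close": 10.5}, -0.010),
    ({"open": 20.0, "high": 20.4, "low": 19.2, "close": 19.4}, 0.020),
    ({"open": 5.0, "high": 5.2, "low": 4.9, "close": 5.0}, -0.004),
    ({"open": 8.0, "high": 8.3, "low": 7.9, "close": 8.2}, 0.002),
]

# Hypothetical rewrite templates standing in for LLM-proposed edits.
MUTATIONS = [
    "-({e})",
    "({e}) / (bar['high'] - bar['low'])",
    "({e}) / bar['open']",
    "({e}) * (bar['close'] - bar['low'])",
]


def to_source(expr):
    """Wrap an expression as the source code of an alpha function."""
    return f"def alpha(bar):\n    return {expr}\n"


def fitness(expr):
    """Correlation of alpha values with next-period returns.

    Candidates that fail to compile or run, are constant, or produce
    non-finite values get the worst score, -1.0.
    """
    namespace = {}
    try:
        exec(to_source(expr), namespace)
        xs = [float(namespace["alpha"](bar)) for bar, _ in DATA]
    except Exception:
        return -1.0
    ys = [ret for _, ret in DATA]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return -1.0
    score = cov / (vx * vy) ** 0.5
    return score if math.isfinite(score) else -1.0


def stub_llm_mutate(expr, rng):
    """Stand-in for an LLM mutation prompt: apply one rewrite template."""
    return rng.choice(MUTATIONS).format(e=expr)


def stub_llm_crossover(a, b, rng):
    """Stand-in for an LLM crossover prompt: combine two parent expressions."""
    return f"({a}) {rng.choice(['+', '-', '*'])} ({b})"


def evolve(seeds, generations=5, pop_size=8, elite=3, seed=0):
    """Keep the best candidates each generation and refill with children."""
    rng = random.Random(seed)
    population = list(seeds)
    for _ in range(generations):
        unique = list(dict.fromkeys(population))  # dedupe, keep order
        parents = sorted(unique, key=fitness, reverse=True)[:elite]
        children = []
        while len(parents) + len(children) < pop_size:
            if len(parents) > 1 and rng.random() < 0.5:
                a, b = rng.sample(parents, 2)
                children.append(stub_llm_crossover(a, b, rng))
            else:
                children.append(stub_llm_mutate(rng.choice(parents), rng))
        population = parents + children
    return max(population, key=fitness)


SEEDS = ["bar['close'] - bar['open']", "bar['high'] - bar['low']"]
best = evolve(SEEDS)
print(to_source(best))
print("fitness:", round(fitness(best), 3))
```

Two design choices are worth noting. Keeping the top candidates unchanged (elitism) means the best score never decreases across generations, and treating any candidate that raises an exception as worst-scoring acts as a crude logical-consistency filter. On four data points any high score is overfitting, so the sketch only shows the control flow.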
