AI Summary
Problem: Financial data are high-dimensional and have an extremely low signal-to-noise ratio, which makes effective alpha signals hard to discover. Existing deep learning, genetic programming, and large language model (LLM) approaches cover only a narrow part of the search space and tend to produce opaque representations with weak economic interpretability and poor generalization.
Method: We propose CogAlpha, a cognitive alpha mining framework that models LLMs as evolvable cognitive agents. It integrates code-based representation, multi-stage finance-oriented prompting, and evolutionary search (mutation/crossover) to enable structured, human-like exploration of alpha expressions at the code level. Iterative optimization is driven by a financial evaluation feedback loop, balancing logical consistency and creativity.
Contribution/Results: Experiments on A-share equities show that CogAlpha improves the predictive power, robustness, and cross-period generalization of the discovered alphas relative to existing methods. The work combines LLM reasoning with evolutionary computation for quantitative factor discovery.
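To make "code-level alpha representation" and "financial evaluation feedback" concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the `momentum_alpha` factor, the toy prices and returns, and the use of a rank information coefficient (Spearman correlation between factor values and next-period returns) as the score. The summary does not state which metrics CogAlpha actually uses.

```python
from statistics import mean

def momentum_alpha(closes, window=3):
    """Hypothetical code-level alpha: trailing return over `window` periods."""
    return closes[-1] / closes[-1 - window] - 1.0

def ranks(values):
    """Rank values 0..n-1 (ties broken by position, which is fine for a sketch)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, index in enumerate(order):
        out[index] = rank
    return out

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def rank_ic(factor_values, forward_returns):
    """Spearman-style rank IC: Pearson correlation computed on ranks."""
    return pearson(ranks(factor_values), ranks(forward_returns))

# Made-up cross-section of four stocks: closing prices and next-period returns.
price_history = {
    "A": [10.0, 10.2, 10.5, 11.0],
    "B": [20.0, 19.8, 19.5, 19.0],
    "C": [5.0, 5.0, 5.1, 5.2],
    "D": [8.0, 8.1, 7.9, 7.8],
}
next_return = {"A": 0.02, "B": -0.01, "C": 0.005, "D": 0.01}

names = list(price_history)
factor = {name: momentum_alpha(price_history[name]) for name in names}
ic = rank_ic([factor[n] for n in names], [next_return[n] for n in names])
print(round(ic, 3))  # 0.8 on this toy data
```

A score of this kind is what an evaluation loop could feed back to the LLM when it proposes the next candidate. A real evaluation would use many dates and out-of-sample periods, not a single cross-section.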
Abstract
Discovering effective predictive signals, or "alphas," from financial data with high dimensionality and extremely low signal-to-noise ratio remains a difficult open problem. Despite progress in deep learning, genetic programming, and, more recently, large language model (LLM)-based factor generation, existing approaches still explore only a narrow region of the vast alpha search space. Neural models tend to produce opaque and fragile patterns, while symbolic or formula-based methods often yield redundant or economically ungrounded expressions that generalize poorly. Although different in form, these paradigms share a key limitation: none can conduct broad, structured, and human-like exploration that balances logical consistency with creative leaps. To address this gap, we introduce the Cognitive Alpha Mining Framework (CogAlpha), which combines code-level alpha representation with LLM-driven reasoning and evolutionary search. Treating LLMs as adaptive cognitive agents, our framework iteratively refines, mutates, and recombines alpha candidates through multi-stage prompts and financial feedback. This synergistic design enables deeper thinking, richer structural diversity, and economically interpretable alpha discovery, while greatly expanding the effective search space. Experiments on A-share equities demonstrate that CogAlpha consistently discovers alphas with superior predictive accuracy, robustness, and generalization over existing methods. Our results highlight the promise of aligning evolutionary optimization with LLM-based reasoning for automated and explainable alpha discovery. All source code will be released.
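The refine, mutate, and recombine cycle described above can be sketched as a small evolutionary loop over alpha source strings. This is not the authors' implementation. The `mutate` and `crossover` functions are deterministic placeholders for what would be LLM prompts, the `evaluate` score (sign hit rate on made-up data) is a stand-in for real financial feedback, and all names and numbers are hypothetical. The multi-stage prompting is not modeled here.

```python
import random
import re

# Made-up cross-section: closing prices per stock and the next-period return.
PRICES = {
    "A": [10.0, 10.2, 10.5, 11.0, 11.2],
    "B": [20.0, 19.8, 19.5, 19.0, 19.1],
    "C": [5.0, 5.0, 5.1, 5.2, 5.1],
    "D": [8.0, 8.1, 7.9, 7.8, 7.9],
}
NEXT_RETURN = {"A": 0.02, "B": -0.01, "C": 0.005, "D": 0.01}

def evaluate(source):
    """Stand-in feedback: share of stocks whose factor sign matches the return sign."""
    hits = 0
    for name, closes in PRICES.items():
        try:
            # eval keeps the sketch short; a real system would sandbox generated code.
            value = eval(source, {"__builtins__": {}}, {"closes": closes})
        except Exception:
            return 0.0  # invalid candidates get the worst score
        hits += (value > 0) == (NEXT_RETURN[name] > 0)
    return hits / len(PRICES)

def mutate(source, rng):
    """Placeholder for an LLM mutation prompt: shift one lookback index by +/-1."""
    matches = list(re.finditer(r"closes\[-(\d+)\]", source))
    if not matches:
        return source
    m = rng.choice(matches)
    new_lag = max(1, int(m.group(1)) + rng.choice([-1, 1]))
    return source[:m.start()] + f"closes[-{new_lag}]" + source[m.end():]

def crossover(a, b):
    """Placeholder for an LLM crossover prompt: average the two parent expressions."""
    return f"(({a}) + ({b})) / 2"

def evolve(seeds, generations=5, population_size=6, rng=None):
    """Elitist loop: parents stay in the pool, so the best score never decreases."""
    rng = rng or random.Random(0)
    population = list(seeds)
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(population_size)]
        if len(population) >= 2:
            children.append(crossover(*rng.sample(population, 2)))
        pool = list(dict.fromkeys(population + children))  # drop duplicates, keep order
        pool.sort(key=evaluate, reverse=True)
        population = pool[:population_size]
    return population

seeds = ["closes[-1] / closes[-3] - 1", "closes[-1] - closes[-2]"]
for candidate in evolve(seeds, generations=3, population_size=4):
    print(evaluate(candidate), candidate)
```

Keeping candidates as source strings is what lets an LLM read, explain, and rewrite them. In the framework as described, the string edits here would be replaced by prompted rewrites guided by the evaluation scores.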