FactorEngine: A Program-level Knowledge-Infused Factor Mining Framework for Quantitative Investment

📅 2026-03-17
🤖 AI Summary
This work targets three practical requirements of automated alpha factor mining for quantitative investing: executability, auditability, and computational efficiency. It also targets two weaknesses of existing approaches: the limited expressiveness of symbolic methods and the poor interpretability of neural forecasters. The authors propose FactorEngine, a framework that represents factors as Turing-complete programs. FactorEngine decouples logic revision from parameter optimization and separates large language model-guided directional search from Bayesian hyperparameter search. It further introduces a knowledge-infused bootstrapping module, in which a closed-loop multi-agent pipeline transforms unstructured financial reports into executable factor programs, and an experience knowledge base that supports trajectory-aware iterative refinement. In backtests on real-world OHLCV data, the method substantially outperforms baseline methods on IC/ICIR, Rank IC/ICIR, and annualized return/Sharpe ratio.
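The separation of logic revision from parameter optimization can be illustrated with a small sketch. This is not the authors' implementation: every function name here (`momentum`, `reversal`, `score`, `tune`, `search`) is hypothetical, the candidate logics are hand-written rather than proposed by an LLM, the objective is a toy one, and a plain grid search stands in for Bayesian hyperparameter search.

```python
from itertools import product

def momentum(close, window):
    """Factor logic A: trailing return over `window` bars (None until enough history)."""
    return [None if t < window else close[t] / close[t - window] - 1.0
            for t in range(len(close))]

def reversal(close, window):
    """Factor logic B: negated trailing return, i.e. a simple logic revision of A."""
    return [None if v is None else -v for v in momentum(close, window)]

def score(factor, close):
    """Toy objective: mean of factor[t] * next-bar return where the factor is defined.

    Assumes at least one bar has a defined factor value and a following bar.
    """
    terms = [factor[t] * (close[t + 1] / close[t] - 1.0)
             for t in range(len(close) - 1) if factor[t] is not None]
    return sum(terms) / len(terms)

def tune(logic, close, grid):
    """Inner loop: search parameters for one fixed logic.

    Grid search is used here as a stand-in for Bayesian optimization.
    """
    names = list(grid)
    best = None
    for combo in product(*(grid[n] for n in names)):
        params = dict(zip(names, combo))
        s = score(logic(close, **params), close)
        if best is None or s > best[0]:
            best = (s, params)
    return best

def search(candidates, close, grid):
    """Outer loop: compare candidate logics, each tuned independently."""
    results = {fn.__name__: tune(fn, close, grid) for fn in candidates}
    winner = max(results, key=lambda name: results[name][0])
    return winner, results[winner]
```

Calling `search([momentum, reversal], close, {"window": [2, 5, 10]})` on a list of closing prices returns the name of the best-scoring logic together with its best score and parameters. Because the outer loop only changes program structure and the inner loop only changes numeric parameters, the two can be swapped out independently, which is the point of the separation described above.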

📝 Abstract
We study alpha factor mining, the automated discovery of predictive signals from noisy, non-stationary market data, under a practical requirement that mined factors be directly executable and auditable, and that the discovery process remain computationally tractable at scale. Existing symbolic approaches are limited by bounded expressiveness, while neural forecasters often trade interpretability for performance and remain vulnerable to regime shifts and overfitting. We introduce FactorEngine (FE), a program-level factor discovery framework that casts factors as Turing-complete code and improves both effectiveness and efficiency via three separations: (i) logic revision vs. parameter optimization, (ii) LLM-guided directional search vs. Bayesian hyperparameter search, and (iii) LLM usage vs. local computation. FE further incorporates a knowledge-infused bootstrapping module that transforms unstructured financial reports into executable factor programs through a closed-loop multi-agent extraction-verification-code-generation pipeline, and an experience knowledge base that supports trajectory-aware refinement (including learning from failures). Across extensive backtests on real-world OHLCV data, FE produces factors with substantially stronger predictive stability and portfolio impact than baseline methods (for example, higher IC/ICIR and Rank IC/ICIR, and improved annualized return (AR) and Sharpe ratio), achieving state-of-the-art predictive and portfolio performance.
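For readers unfamiliar with the evaluation metrics, the sketch below computes them under their common definitions, which the abstract does not spell out and which are assumed here: IC is the per-period cross-sectional Pearson correlation between factor values and subsequent returns, Rank IC is the same correlation computed on ranks, and ICIR is the mean of the IC series divided by its standard deviation. The function names and the choice of population standard deviation are illustrative assumptions, not taken from the paper.

```python
from math import sqrt
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (undefined if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def ic_series(factor_by_period, return_by_period, rank=False):
    """One correlation per period, taken across assets; rank=True gives Rank IC."""
    out = []
    for f, r in zip(factor_by_period, return_by_period):
        if rank:
            f, r = ranks(f), ranks(r)
        out.append(pearson(f, r))
    return out

def icir(ics):
    """Mean IC divided by the (population) standard deviation of the IC series."""
    return mean(ics) / pstdev(ics)
```

Each inner list holds one period's values across assets, so `ic_series(factors, returns, rank=True)` yields the Rank IC series and `icir(...)` summarizes how stable that series is. A high ICIR means the factor's predictive correlation is not only large on average but also consistent across periods, which is what "predictive stability" refers to.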
Problem

Research questions and friction points this paper is trying to address.

alpha factor mining
quantitative investment
predictive signals
non-stationary market data
executable factors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Factor Mining
Program-level Representation
Knowledge-infused Framework
LLM-guided Search
Turing-complete Alpha Factors
Authors
Qinhong Lin
Beijing University of Posts and Telecommunications
Ruitao Feng
Beijing Value Simplex Technology Co. Ltd.
Yinglun Feng
Beijing University of Posts and Telecommunications
Zhenxin Huang
Yangtze Delta Research Institute, University of Electronic Science and Technology of China
Yukun Chen
Pieces Technologies Inc.
Natural Language Processing
Zhongliang Yang
Associate Professor, Beijing University of Posts and Telecommunications
AI Security, FinTech
Linna Zhou
Beijing University of Posts and Telecommunications
Binjie Fei
Beijing Value Simplex Technology Co. Ltd.
Jiaqi Liu
Beijing Value Simplex Technology Co. Ltd.
Yu Li
Beijing Value Simplex Technology Co. Ltd.