AlphaSAGE: Structure-Aware Alpha Mining via GFlowNets for Robust Exploration

📅 2025-09-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Automated alpha signal discovery in quantitative finance faces three key challenges: sparse rewards impeding efficient exploration; sequential modeling of mathematical expressions failing to capture structural semantics; and conventional reinforcement learning (RL) optimizing for a single optimal solution, thereby hindering the generation of uncorrelated alpha combinations. To address these, we propose a structure-aware GFlowNet framework. First, we design a relational graph convolutional network (RGCN)-based structural encoder that explicitly models syntactic and semantic structures of financial formulas. Second, we replace RL policies with generative flow networks (GFlowNets), enabling parallel, multi-path generation of diverse alphas. Third, we introduce a dense, multi-dimensional reward function jointly optimizing predictive power, diversity, and novelty. Experiments demonstrate significant improvements over state-of-the-art baselines in alpha quality, diversity, and pairwise decorrelation—establishing a scalable, interpretable paradigm for automated alpha discovery.

📝 Abstract
The automated mining of predictive signals, or alphas, is a central challenge in quantitative finance. While Reinforcement Learning (RL) has emerged as a promising paradigm for generating formulaic alphas, existing frameworks are fundamentally hampered by a triad of interconnected issues. First, they suffer from reward sparsity: meaningful feedback is available only once a full formula is complete, which leads to inefficient and unstable exploration. Second, they rely on semantically inadequate sequential representations of mathematical expressions, failing to capture the structure that determines an alpha's behavior. Third, the standard RL objective of maximizing expected returns inherently drives policies towards a single optimal mode, directly contradicting the practical need for a diverse portfolio of non-correlated alphas. To overcome these challenges, we introduce AlphaSAGE (Structure-Aware Alpha Mining via Generative Flow Networks for Robust Exploration), a novel framework built upon three cornerstone innovations: (1) a structure-aware encoder based on a Relational Graph Convolutional Network (RGCN); (2) a new framework with Generative Flow Networks (GFlowNets); and (3) a dense, multi-faceted reward structure. Empirical results demonstrate that AlphaSAGE outperforms existing baselines in mining a more diverse, novel, and highly predictive portfolio of alphas, thereby proposing a new paradigm for automated alpha mining. Our code is available at https://github.com/BerkinChen/AlphaSAGE.
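The contrast between a mode-seeking RL objective and a GFlowNet can be illustrated with a toy example. A GFlowNet is trained so that it samples a finished object x with probability proportional to its reward R(x), whereas an idealized return-maximizing policy concentrates on the single best object. The sketch below is not taken from the AlphaSAGE code; the formula strings and reward values are hypothetical and serve only to show the two target distributions side by side.

```python
def reward_proportional(rewards):
    """Target distribution of a GFlowNet: p(x) = R(x) / sum of all R."""
    total = sum(rewards.values())
    return {x: r / total for x, r in rewards.items()}


def greedy(rewards):
    """Idealized mode-seeking policy: all probability mass on the best x."""
    best = max(rewards, key=rewards.get)
    return {x: (1.0 if x == best else 0.0) for x in rewards}


# Hypothetical formulas and rewards, for illustration only.
toy_rewards = {"rank(close)": 5, "delta(volume, 5)": 3, "ts_mean(high, 10)": 2}

print(reward_proportional(toy_rewards))
print(greedy(toy_rewards))
```

Under the reward-proportional target, the two lower-reward formulas are still sampled regularly, which is the property that makes a diverse pool of candidates possible.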
Problem

Research questions and friction points this paper is trying to address.

Addresses reward sparsity in automated alpha mining for quantitative finance
Overcomes inadequate sequential representations of mathematical expressions
Counters the collapse of RL policies onto a single optimal formula, which conflicts with the need for a diverse portfolio of non-correlated alphas
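To make the representation issue concrete, the sketch below turns a formula written as a nested tuple into a small graph with typed edges, which is the kind of input a relational graph network consumes. It is a minimal illustration under stated assumptions: the operator names (`sub`, `ts_mean`), the nested-tuple encoding, and the `arg0`/`arg1` relation labels are hypothetical and are not claimed to match the paper's actual encoding.

```python
def to_graph(expr):
    """Flatten a nested-tuple formula into node labels and typed edges.

    Each edge is (parent_index, relation, child_index), where the relation
    records the argument position, so sub(a, b) and sub(b, a) differ.
    """
    nodes, edges = [], []

    def visit(e):
        idx = len(nodes)  # index this node will occupy
        if isinstance(e, tuple):
            op, *args = e
            nodes.append(op)
            for pos, arg in enumerate(args):
                child = visit(arg)
                edges.append((idx, f"arg{pos}", child))
        else:
            nodes.append(str(e))  # leaf: feature name or constant
        return idx

    visit(expr)
    return nodes, edges


# Hypothetical formula: ts_mean(close, 5) - open
nodes, edges = to_graph(("sub", ("ts_mean", "close", 5), "open"))
print(nodes)
print(edges)
```

A flat token sequence of the same formula would place `close` and `open` at positions that say little about which operator each one feeds; the edge list states that relationship directly.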
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structure-aware encoder using Relational Graph Convolutional Network
Generative Flow Networks framework for automated alpha mining
Dense multi-faceted reward structure for robust exploration
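As a rough illustration of what a multi-faceted reward can look like, the sketch below scores a candidate alpha by combining a predictive term (absolute Pearson correlation with future returns) with a novelty term (one minus its highest absolute correlation with alphas already in the pool). This is a toy under explicit assumptions, not the reward defined in the paper: the function names, the weights `w_pred` and `w_nov`, and the choice of Pearson correlation are all hypothetical, and the intermediate (dense) feedback that the paper describes is not modeled here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def toy_reward(alpha, returns, pool, w_pred=1.0, w_nov=0.5):
    """Hypothetical reward: predictive power plus novelty relative to the pool."""
    predictive = abs(pearson(alpha, returns))
    # Highest absolute correlation with any alpha already accepted.
    max_corr = max((abs(pearson(alpha, p)) for p in pool), default=0.0)
    novelty = 1.0 - max_corr
    return w_pred * predictive + w_nov * novelty


alpha = [1.0, 2.0, 3.0, 4.0]
future_returns = [2.0, 4.0, 6.0, 8.0]
print(toy_reward(alpha, future_returns, pool=[]))
print(toy_reward(alpha, future_returns, pool=[[1.0, 2.0, 3.0, 4.0]]))
```

In this toy, a candidate that duplicates an existing pool member loses its entire novelty term, so the same predictive score earns a lower reward the second time it is proposed.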
Authors

Binqi Chen
School of Computer Science, National Key Laboratory for Multimedia Information Processing, PKU-Anker LLM Lab, Peking University, China
Hongjun Ding
Baruch College, City University of New York
Ning Shen
Statistics, University of British Columbia
Jinsheng Huang
Peking University
Multimodal Learning, Fintech
Taian Guo
Peking University
LLM for finance, time series forecasting, quantitative trading
Luchen Liu
Zhengren Quant, Beijing, China
Ming Zhang
School of Computer Science, National Key Laboratory for Multimedia Information Processing, PKU-Anker LLM Lab, Peking University, China