MG2FlowNet: Accelerating High-Reward Sample Generation via Enhanced MCTS and Greediness Control

📅 2025-10-01
🤖 AI Summary
Existing GFlowNets suffer from inefficient exploration in large search spaces where high-reward regions are sparse, and struggle to stably generate high-quality samples while preserving diversity. To address this, we propose a novel generative framework that integrates an enhanced Monte Carlo Tree Search (MCTS) with a controllable greedy mechanism. Specifically, we incorporate the PUCT algorithm to dynamically balance exploration and exploitation, thereby improving policy-evaluation accuracy; concurrently, we introduce an adjustable greediness parameter that adaptively intensifies focus on high-reward regions during sampling. Crucially, this design preserves distributional diversity while significantly increasing the frequency of high-reward samples and accelerating convergence. Experiments demonstrate that our framework accelerates the discovery of high-reward regions and consistently produces high-quality, structured samples, outperforming baseline methods. It establishes a more robust and efficient extension paradigm for GFlowNets in complex combinatorial generation tasks.

📝 Abstract
Generative Flow Networks (GFlowNets) have emerged as a powerful tool for generating diverse and high-reward structured objects by learning to sample from a distribution proportional to a given reward function. Unlike conventional reinforcement learning (RL) approaches that prioritize optimization of a single trajectory, GFlowNets seek to balance diversity and reward by modeling the entire trajectory distribution. This capability makes them especially suitable for domains such as molecular design and combinatorial optimization. However, existing GFlowNet sampling strategies tend to overexplore and struggle to consistently generate high-reward samples, particularly in large search spaces with sparse high-reward regions. Improving the probability of generating high-reward samples without sacrificing diversity therefore remains a key challenge. In this work, we integrate an enhanced Monte Carlo Tree Search (MCTS) into the GFlowNets sampling process, using MCTS-based policy evaluation to guide generation toward high-reward trajectories and Polynomial Upper Confidence Trees (PUCT) to balance exploration and exploitation adaptively, and we introduce a controllable mechanism to regulate the degree of greediness. Our method enhances exploitation without sacrificing diversity by dynamically balancing exploration and reward-driven guidance. The experimental results show that our method not only accelerates the discovery of high-reward regions but also consistently generates high-reward samples, while preserving the diversity of the generative distribution. All implementations are available at https://github.com/ZRNB/MG2FlowNet.
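The PUCT selection rule mentioned in the abstract can be sketched as follows. This is a minimal, generic illustration of PUCT; the `Child` fields, `c_puct` constant, and function names are assumptions for exposition, not taken from the paper's released code.

```python
import math
from dataclasses import dataclass

@dataclass
class Child:
    N: int = 0      # visit count
    Q: float = 0.0  # mean value estimate (exploitation term)
    P: float = 1.0  # prior probability from the sampling policy

def puct_select(children: dict, c_puct: float = 1.5):
    """Return the action maximizing Q + c_puct * P * sqrt(sum_b N_b) / (1 + N)."""
    total_n = sum(c.N for c in children.values())

    def score(c: Child) -> float:
        # Exploitation (mean value) plus a prior-weighted exploration bonus
        # that shrinks as the child's visit count grows.
        return c.Q + c_puct * c.P * math.sqrt(total_n) / (1 + c.N)

    return max(children, key=lambda a: score(children[a]))

# A rarely visited child can win despite a lower value estimate,
# because its exploration bonus is large:
children = {"a": Child(N=10, Q=0.9, P=0.5), "b": Child(N=1, Q=0.2, P=0.5)}
```

With the toy counts above, `puct_select(children)` picks `"b"`: its exploration bonus outweighs `"a"`'s higher mean value, which is exactly the adaptive exploration-exploitation trade-off the abstract attributes to PUCT.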
Problem

Research questions and friction points this paper is trying to address.

Improving high-reward sample generation in GFlowNets
Balancing exploration and exploitation in large search spaces
Enhancing reward-driven guidance without sacrificing diversity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Enhanced MCTS guides sampling towards high-reward trajectories
Polynomial UCT balances exploration and exploitation adaptively
Controllable greediness mechanism regulates exploitation without sacrificing diversity
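One simple way to realize an adjustable greediness parameter is temperature-style sharpening of the sampling distribution. The `temper` function and the exponent `beta` below are illustrative stand-ins under that assumption, not necessarily the paper's exact mechanism.

```python
def temper(probs, beta=1.0):
    """Sharpen a categorical policy: p_i proportional to p_i ** beta.

    beta = 1 leaves the distribution unchanged (full diversity);
    beta > 1 concentrates mass on high-probability actions (greedier);
    beta -> infinity approaches greedy argmax sampling.
    """
    powered = [p ** beta for p in probs]
    z = sum(powered)  # renormalize so the result is a valid distribution
    return [p / z for p in powered]
```

For example, `temper([0.6, 0.3, 0.1], beta=2.0)` moves roughly 78% of the probability mass onto the top action (up from 60%), while still leaving nonzero mass on the rest, which mirrors the stated goal of intensifying focus on high-reward regions without collapsing diversity.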
Rui Zhu
University of Science and Technology of China, Hefei 230026, China
Xuan Yu
University of Science and Technology of China, Hefei 230026, China
Yudong Zhang
University of Leicester, HFWLA/FIET/FEAI/FBCS/SMIEEE/SMACM/DSACM, Clarivate Highly Cited Researcher
Research interests: artificial intelligence, deep learning, medical image processing
Chen Zhang
University of Science and Technology of China, Hefei 230026, China
Xu Wang
University of Science and Technology of China, Hefei 230026, China
Yang Wang
University of Science and Technology of China, Hefei 230026, China