AI Summary
To address the sensitivity of SMILES representations and training instability in discrete generative adversarial networks (GANs) for molecular generation, this paper proposes RL-MolGAN: a decoder-first, Transformer-based discrete GAN framework supporting both *de novo* and scaffold-guided molecular design. We introduce the first deep integration of proximal policy optimization (PPO) and Monte Carlo tree search (MCTS) into discrete GAN training, and further develop RL-MolWGAN, a variant incorporating the Wasserstein distance and mini-batch discrimination to enhance convergence stability. Evaluated on QM9 and ZINC, RL-MolGAN achieves 98.7% syntactic validity for generated molecules, significantly outperforming baselines in uniqueness and quantitative estimate of drug-likeness (QED). Molecular diversity improves by 32%, and optimization of key physicochemical properties converges 2.1× faster.
Abstract
Generating molecules with desired chemical properties presents a critical challenge in fields such as chemical synthesis and drug discovery. Recent advancements in artificial intelligence (AI) and deep learning have significantly contributed to data-driven molecular generation. However, challenges persist due to the inherent sensitivity of simplified molecular input line entry system (SMILES) representations and the difficulties in applying generative adversarial networks (GANs) to discrete data. This study introduces RL-MolGAN, a novel Transformer-based discrete GAN framework designed to address these challenges. Unlike traditional Transformer architectures, RL-MolGAN utilizes a first-decoder-then-encoder structure, facilitating the generation of drug-like molecules in both *de novo* and scaffold-based design settings. In addition, RL-MolGAN integrates reinforcement learning (RL) and Monte Carlo tree search (MCTS) techniques to enhance the stability of GAN training and optimize the chemical properties of the generated molecules. To further improve the model's performance, RL-MolWGAN, an extension of RL-MolGAN, incorporates the Wasserstein distance and mini-batch discrimination, which together enhance the stability of GAN training. Experimental results on two widely used molecular datasets, QM9 and ZINC, validate the effectiveness of our models in generating high-quality molecular structures with diverse and desirable chemical properties.
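
For reference, a Wasserstein distance is usually optimized in GAN training through the standard WGAN critic objective, sketched here in its generic form. The exact loss used by RL-MolWGAN is not stated in this abstract, so the symbols below ($G$ for the generator, $D$ for the critic, $p_{\text{data}}$ for the training distribution, $p_{G}$ for the generator's distribution) are assumptions for illustration only:

$$
\min_{G}\ \max_{\lVert D \rVert_{L} \le 1}\ \mathbb{E}_{x \sim p_{\text{data}}}\big[D(x)\big] - \mathbb{E}_{\tilde{x} \sim p_{G}}\big[D(\tilde{x})\big]
$$

Here $D$ is constrained to be 1-Lipschitz and outputs an unbounded score rather than a probability. Because sampled SMILES tokens are discrete, gradients cannot flow from $D$ back through $\tilde{x}$ to the generator, which is typically why a reinforcement learning signal is used to update $G$ in discrete GANs.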