AtomComposer: Discovering Chemical Space from First Principles with Reinforcement Learning

📅 2026-05-27
🤖 AI Summary
Existing molecular generative models are constrained by their pretraining data and struggle to explore unknown chemical space. This work proposes AtomComposer, a self-guided reinforcement learning agent that requires no pretraining and autonomously constructs valid, novel three-dimensional isomers under given stoichiometric constraints. By combining online reinforcement learning, multi-composition joint training, physics-based energy evaluation, and geometric validity constraints, a single agent generalizes across diverse chemical compositions instead of overfitting to an individual formula. On unseen test formulas, it discovers up to an order of magnitude more valid isomers than existing single-composition reinforcement learning baselines trained with per-step energy rewards.
📝 Abstract
Discovering novel stable molecules without training data remains a grand scientific challenge. Current molecular generative models are trained on large, pre-curated datasets, which introduce biases and limit exploration of novel chemistry. In contrast, we propose a new paradigm: autonomous, generalized agents capable of mapping vast, unknown chemical spaces without any pretraining. For the first time, we present AtomComposer, a self-guided agent that autonomously constructs valid 3D isomers under stoichiometric constraints and is trained exclusively online using reinforcement learning. Unlike existing approaches that generally overfit to a specific chemical formula, we establish a multi-composition training scheme that enables a broad generalization across diverse chemistry, guided by energy- and validity-based rewards. Our agent can discover up to an order of magnitude more valid isomers on unseen test formulas than existing single-composition reinforcement-learning baselines trained with per-step energy rewards. These results fulfill the promise of online reinforcement learning as a powerful paradigm for scalable, from-scratch exploration of chemical configuration space.
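To make the setup described in the abstract concrete, the following is a minimal sketch of one stoichiometry-constrained construction episode. It is not the authors' implementation: the names `parse_formula`, `BagEnv`, `random_rollout`, and the `MIN_DIST` threshold are hypothetical, the policy is a uniform random one rather than a learned agent, and the energy-based reward is replaced by a simple geometric validity penalty as a stand-in.

```python
import math
import random
import re
from collections import Counter


def parse_formula(formula):
    """Parse a simple formula such as 'C2H6O' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


class BagEnv:
    """Toy episode: every atom in the formula 'bag' is placed exactly once."""

    MIN_DIST = 0.7  # assumed validity threshold in angstrom, illustrative only

    def __init__(self, formula):
        self.formula = formula
        self.reset()

    def reset(self):
        self.bag = parse_formula(self.formula)  # atoms still to be placed
        self.atoms = []  # placed atoms as (element, (x, y, z))

    def step(self, element, position):
        if self.bag[element] <= 0:
            raise ValueError(f"no {element} atoms left in the bag")
        self.bag[element] -= 1
        # Placeholder validity check: penalize atoms placed too close together.
        # A real reward would also include a physics-based energy term.
        valid = all(
            math.dist(position, placed) >= self.MIN_DIST
            for _, placed in self.atoms
        )
        self.atoms.append((element, position))
        done = sum(self.bag.values()) == 0
        reward = 0.0 if valid else -1.0
        return reward, done


def random_rollout(formula, seed=0):
    """Run one episode with a uniform random policy (stand-in for the agent)."""
    rng = random.Random(seed)
    env = BagEnv(formula)
    total, done = 0.0, False
    while not done:
        remaining = [e for e, n in env.bag.items() if n > 0]
        element = rng.choice(remaining)
        position = tuple(rng.uniform(-2.0, 2.0) for _ in range(3))
        reward, done = env.step(element, position)
        total += reward
    return env.atoms, total
```

Under these assumptions, multi-composition training would amount to sampling a different formula for each episode, for example calling `random_rollout` on `"C2H6O"` and then on `"CH4O"`, so that one policy sees many compositions rather than a single formula.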
Problem

Research questions and friction points this paper is trying to address.

molecular discovery
chemical space exploration
data-free generation
novel stable molecules
reinforcement learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

reinforcement learning
molecular generation
first-principles discovery
multi-composition generalization
3D isomer construction
Bjarke Hastrup
Dept. of Energy Conversion and Storage, Technical University of Denmark, Denmark
Francois Cornet
Dept. of Energy Conversion and Storage, Technical University of Denmark, Denmark; Dept. of Applied Mathematics and Computer Science, Technical University of Denmark, Denmark
Tejs Vegge
Professor, Technical University of Denmark
Director of CAPeX - Pioneer Center for Accelerating P2X Materials Discovery
Arghya Bhowmik
Dept. of Energy Conversion and Storage, Technical University of Denmark, Denmark; Pioneer Center for Accelerating P2X Materials Discovery (CAPeX), Kgs. Lyngby, Denmark