CycleChemist: A Dual-Pronged Machine Learning Framework for Organic Photovoltaic Discovery

📅 2025-11-23
🤖 AI Summary
Problem: Organic photovoltaic (OPV) material discovery has long been hindered by the difficulty of co-designing donor/acceptor molecular pairs; existing approaches typically optimize individual components in isolation and lack a unified modeling framework. Method: We propose a dual-path machine learning paradigm. It introduces OPV2D, described as the largest experimentally validated OPV dataset to date, and integrates hierarchical graph neural networks, multi-task prediction of optoelectronic properties, molecular orbital energy estimation, and a reinforcement-learning-guided MatGPT generative model for joint donor-acceptor pair generation and power conversion efficiency (PCE)-driven closed-loop optimization. Contribution/Results: Our framework significantly improves both predictive accuracy and synthetic feasibility for high-PCE (>18%) materials, establishing the first end-to-end, scalable pipeline for accelerated OPV material discovery.

📝 Abstract
Organic photovoltaic (OPV) materials offer a promising path toward sustainable energy generation, but their development is limited by the difficulty of identifying high-performance donor-acceptor pairs with strong power conversion efficiencies (PCEs). Existing design strategies typically focus on either the donor or the acceptor alone, rather than using a unified approach capable of modeling both components. In this work, we introduce a dual machine learning framework for OPV discovery that combines predictive modeling with generative molecular design. We present the Organic Photovoltaic Donor Acceptor Dataset (OPV2D), the largest curated dataset of its kind, containing 2000 experimentally characterized donor-acceptor pairs. Using this dataset, we develop the Organic Photovoltaic Classifier (OPVC) to predict whether a material exhibits OPV behavior, and a hierarchical graph neural network that incorporates multi-task learning and donor-acceptor interaction modeling. This framework includes the Molecular Orbital Energy Estimator (MOE2) for predicting HOMO and LUMO energy levels, and the Photovoltaic Performance Predictor (P3) for estimating PCE. In addition, we introduce the Material Generative Pretrained Transformer (MatGPT) to produce synthetically accessible organic semiconductors, guided by a reinforcement learning strategy with three-objective policy optimization. By linking molecular representation learning with performance prediction, our framework advances data-driven discovery of high-performance OPV materials.
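For readers less familiar with the target metric, PCE is conventionally defined from measured device parameters as PCE = (Voc × Jsc × FF) / P_in, where P_in is the incident light power (100 mW/cm² under standard AM1.5G illumination). The short sketch below only illustrates that textbook definition; it is not code from the paper, and the function name and the example numbers are assumptions chosen for illustration.

```python
def power_conversion_efficiency(voc_v, jsc_ma_cm2, fill_factor, p_in_mw_cm2=100.0):
    """Return PCE in percent from the standard definition Voc * Jsc * FF / P_in.

    voc_v        open-circuit voltage in volts
    jsc_ma_cm2   short-circuit current density in mA/cm^2
    fill_factor  dimensionless, in (0, 1]
    p_in_mw_cm2  incident power density in mW/cm^2 (100 for AM1.5G)
    """
    if not 0.0 < fill_factor <= 1.0:
        raise ValueError("fill_factor must be in (0, 1]")
    if p_in_mw_cm2 <= 0.0:
        raise ValueError("p_in_mw_cm2 must be positive")
    # V * mA/cm^2 gives mW/cm^2, the same unit as the incident power density.
    p_out_mw_cm2 = voc_v * jsc_ma_cm2 * fill_factor
    return 100.0 * p_out_mw_cm2 / p_in_mw_cm2


# Illustrative device parameters only; these values are not taken from OPV2D.
pce = power_conversion_efficiency(voc_v=0.85, jsc_ma_cm2=26.0, fill_factor=0.78)
print(f"PCE = {pce:.2f}%")  # PCE = 17.24%
```

A predictor such as P3 estimates this quantity directly from molecular structure, whereas the formula above requires a fabricated and measured device.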
Problem

Research questions and friction points this paper is trying to address.

Efficiently identifying high-performance donor-acceptor pairs for organic photovoltaics
Overcoming the limitations of design strategies that focus on a single component (donor or acceptor) in OPV development
Linking molecular representation with performance prediction for material discovery
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual machine learning framework combining prediction and generation
Hierarchical graph neural network with multi-task learning and donor-acceptor interaction modeling
Reinforcement-learning-guided generative transformer (MatGPT) for molecular design
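To make the idea of optimizing a generator against three objectives more concrete, here is a minimal sketch of one common way to fold several objectives into a single scalar reward. The abstract does not name the three objectives or describe the policy optimization itself, so everything here is an assumption: the objectives (predicted PCE, a synthetic-accessibility score on a 1 to 10 scale, and a validity flag), the weights, the `Candidate` type, and the weighted-sum scalarization are all hypothetical and may differ from what the paper does.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical summary of one generated molecule (all fields are assumptions)."""
    predicted_pce: float  # percent, e.g. from a PCE predictor
    sa_score: float       # synthetic accessibility, 1 (easy) to 10 (hard)
    is_valid: bool        # whether the generated structure parses as a molecule


def scalar_reward(c, weights=(0.6, 0.3, 0.1), pce_cap=20.0):
    """Weighted-sum reward in [0, 1]; invalid molecules receive 0."""
    if not c.is_valid:
        return 0.0
    # Clip predicted PCE to [0, pce_cap] and rescale to [0, 1].
    pce_term = min(max(c.predicted_pce, 0.0), pce_cap) / pce_cap
    # Invert the accessibility scale so that easier synthesis scores higher.
    sa_term = (10.0 - min(max(c.sa_score, 1.0), 10.0)) / 9.0
    w_pce, w_sa, w_valid = weights
    return w_pce * pce_term + w_sa * sa_term + w_valid * 1.0


easy_good = Candidate(predicted_pce=18.0, sa_score=3.0, is_valid=True)
hard_good = Candidate(predicted_pce=18.0, sa_score=8.0, is_valid=True)
broken = Candidate(predicted_pce=18.0, sa_score=3.0, is_valid=False)

for name, cand in [("easy_good", easy_good), ("hard_good", hard_good), ("broken", broken)]:
    print(name, round(scalar_reward(cand), 3))
```

In a reinforcement learning loop, a reward of this kind would be computed for each sampled molecule and used to update the generator's policy; the weights control the trade-off between predicted efficiency and ease of synthesis.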
Authors
Hou Hei Lam, Tsinghua University (AI)
Jiangjie Qiu, Tsinghua University
Xiuyuan Hu, PhD candidate at Tsinghua University (AI for Science, Machine Learning)
Wentao Li, Tsinghua University
Fankun Zeng, Tsinghua University
Siwei Fu, Zhejiang University (Visual analytics, Human-AI collaborative decision-making)
Hao Zhang, Tsinghua University
Xiaonan Wang, Tsinghua University