Accelerating High-Efficiency Organic Photovoltaic Discovery via Pretrained Graph Neural Networks and Generative Reinforcement Learning

📅 2025-03-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Designing high-performance donor–acceptor (D–A) molecular pairs for organic photovoltaics (OPVs) remains challenging because the chemical space is vast and structure–property relationships are complex. Method: This work proposes a framework that combines large-scale graph neural network (GNN) pretraining with GPT-2-based reinforcement learning (RL) for de novo molecular generation, complemented by a preliminary fragment-level analysis of the structural motifs the RL model favors. Contribution/Results: The framework generates candidate molecules with predicted power conversion efficiencies (PCEs) approaching 21%, pending experimental validation. The authors are also building what they describe as the largest open-source OPV dataset to date, expected to include nearly 3,000 D–A pairs, and plan to work with experimental teams to synthesize and characterize AI-designed molecules, feeding the resulting data back into the predictive and generative models.

📝 Abstract

Organic photovoltaic (OPV) materials offer a promising avenue toward cost-effective solar energy utilization. However, optimizing donor-acceptor (D-A) combinations to achieve high power conversion efficiency (PCE) remains a significant challenge. In this work, we propose a framework that integrates large-scale pretraining of graph neural networks (GNNs) with a GPT-2 (Generative Pretrained Transformer 2)-based reinforcement learning (RL) strategy to design OPV molecules with potentially high PCE. This approach produces candidate molecules with predicted efficiencies approaching 21%, although further experimental validation is required. Moreover, we conducted a preliminary fragment-level analysis to identify structural motifs recognized by the RL model that may contribute to enhanced PCE, thus providing design guidelines for the broader research community. To facilitate continued discovery, we are building the largest open-source OPV dataset to date, expected to include nearly 3,000 donor-acceptor pairs. Finally, we discuss plans to collaborate with experimental teams on synthesizing and characterizing AI-designed molecules, which will provide new data to refine and improve our predictive and generative models.
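The generate-score-update loop described in the abstract can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: a per-position categorical policy stands in for GPT-2, the hand-written `surrogate_pce` function stands in for the pretrained GNN predictor, the fragment vocabulary is hypothetical, and plain REINFORCE with a batch-mean baseline is used because the abstract does not name a specific RL algorithm.

```python
import math
import random

# Hypothetical fragment vocabulary; a real generator would emit SMILES tokens.
FRAGMENTS = ["thiophene", "benzodithiophene", "fluorine", "cyano", "alkyl"]
SEQ_LEN = 4  # fragments per toy "molecule"


def surrogate_pce(fragments):
    """Toy stand-in for a pretrained GNN PCE predictor. Not a physical model."""
    return (5.0
            + 3.0 * fragments.count("benzodithiophene")
            + 2.0 * fragments.count("fluorine"))


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=300, batch_size=16, lr=0.2, seed=0):
    """REINFORCE with a batch-mean baseline on a per-position categorical policy."""
    rng = random.Random(seed)
    n = len(FRAGMENTS)
    logits = [[0.0] * n for _ in range(SEQ_LEN)]
    for _ in range(steps):
        # The policy is held fixed while the batch is sampled.
        probs = [softmax(row) for row in logits]
        batch = [[rng.choices(range(n), weights=p)[0] for p in probs]
                 for _ in range(batch_size)]
        rewards = [surrogate_pce([FRAGMENTS[i] for i in seq]) for seq in batch]
        baseline = sum(rewards) / batch_size
        for seq, reward in zip(batch, rewards):
            advantage = reward - baseline
            for pos, chosen in enumerate(seq):
                for k in range(n):
                    # Gradient of log softmax w.r.t. logit k: 1[k == chosen] - p_k
                    grad_log_prob = (1.0 if k == chosen else 0.0) - probs[pos][k]
                    logits[pos][k] += lr * advantage * grad_log_prob / batch_size
    return logits


def greedy_decode(logits):
    """Pick the highest-logit fragment at each position."""
    return [FRAGMENTS[max(range(len(row)), key=row.__getitem__)] for row in logits]
```

Calling `greedy_decode(train())` returns the fragment sequence the toy policy has learned to prefer under the surrogate reward. In the framework the abstract describes, the reward would instead come from the pretrained GNN's PCE prediction and the policy would be a GPT-2 language model over molecular strings.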

Problem

Research questions and friction points this paper is trying to address.

Optimizing donor-acceptor combinations for high OPV efficiency

Designing high-PCE OPV molecules using GNNs and RL

Identifying structural motifs for improved photovoltaic performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Pretrained GNNs for OPV molecule design

GPT-2 based reinforcement learning strategy

Ongoing construction of the largest open-source OPV dataset to date

🔎 Similar Papers

Genetic-guided GFlowNets for Sample Efficient Molecular Optimization