PCEvo: Path-Consistent Molecular Representation via Virtual Evolutionary

📅 2026-01-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses few-shot molecular property prediction, where scarce labeled data leads to brittle structure-property relationships and poor generalization to unseen molecules. The authors propose PCEvo, a representation learning approach built on virtual evolutionary paths: for each retrieved pair of similar molecules, it enumerates multiple chemically feasible edit paths from one molecule to the other under topological dependency constraints. The labels of the two endpoint molecules are then converted into stepwise supervision along each path. A path-consistency objective additionally enforces agreement between predictions made along different edit paths connecting the same pair. In experiments on the QM9 and MoleculeNet datasets, PCEvo substantially improves the few-shot generalization of baseline methods.
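To illustrate the idea of enumerating edit paths under dependency constraints, here is a minimal Python sketch. It is not the authors' implementation: the edit names, the `deps` mapping, and the function `enumerate_edit_paths` are hypothetical, and the sketch assumes that "topological dependency constraints" can be modeled as prerequisites between edits (for example, a bond can only be added once both of its atoms exist). Under that assumption, the valid paths are the topological orderings of the edit set.

```python
from typing import Dict, List, Set


def enumerate_edit_paths(edits: List[str], deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Return every ordering of `edits` in which each edit comes after
    all of its prerequisites listed in `deps` (hypothetical constraint model)."""
    paths: List[List[str]] = []

    def extend(prefix: List[str], done: Set[str]) -> None:
        # A full-length prefix is one complete, valid edit path.
        if len(prefix) == len(edits):
            paths.append(list(prefix))
            return
        for edit in edits:
            # An edit is applicable once all its prerequisites are done.
            if edit not in done and deps.get(edit, set()) <= done:
                prefix.append(edit)
                done.add(edit)
                extend(prefix, done)
                done.remove(edit)
                prefix.pop()

    extend([], set())
    return paths


# Toy example: the bond edit depends on both atom edits (names are illustrative).
edits = ["add_C", "add_O", "add_bond_C_O"]
deps = {"add_bond_C_O": {"add_C", "add_O"}}
paths = enumerate_edit_paths(edits, deps)
```

In this toy case the two atom edits can be applied in either order, so two paths are found, both ending with the bond edit. Exhaustive enumeration grows factorially with the number of independent edits, so a practical system would presumably cap or sample the number of paths.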

📝 Abstract
Molecular representation learning aims to learn vector embeddings that capture molecular structure and geometry, thereby enabling property prediction and downstream scientific applications. In many AI for science tasks, labeled data are expensive to obtain and therefore limited in availability. Under the few-shot setting, models trained with scarce supervision often learn brittle structure-property relationships, resulting in substantially higher prediction errors and reduced generalization to unseen molecules. To address this limitation, we propose PCEvo, a path-consistent representation method that learns from virtual paths through dynamic structural evolution. PCEvo enumerates multiple chemically feasible edit paths between retrieved similar molecular pairs under topological dependency constraints. It transforms the labels of the two molecules into stepwise supervision along each virtual evolutionary path. It introduces a path-consistency objective that enforces prediction invariance across alternative paths connecting the same two molecules. Comprehensive experiments on the QM9 and MoleculeNet datasets demonstrate that PCEvo substantially improves the few-shot generalization performance of baseline methods. The code is available at https://anonymous.4open.science/r/PCEvo-4BF2.
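The two supervision ideas in the abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's formulation: the abstract does not say how endpoint labels are distributed over steps or how invariance is measured, so the sketch assumes linear interpolation for the stepwise targets and a variance-style penalty for path consistency. The function names `stepwise_targets` and `path_consistency_penalty` are hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def stepwise_targets(y_start: float, y_end: float, n_steps: int) -> List[float]:
    """Assumed scheme: linearly interpolate the two endpoint labels into
    n_steps + 1 targets, one per intermediate structure on the path."""
    return [y_start + (y_end - y_start) * k / n_steps for k in range(n_steps + 1)]


def path_consistency_penalty(path_predictions: Sequence[float]) -> float:
    """Assumed penalty: mean squared deviation of the per-path predictions
    from their average. It is zero when all paths agree."""
    mu = mean(path_predictions)
    return mean([(p - mu) ** 2 for p in path_predictions])


# Targets along a 4-step path between molecules labeled 0.0 and 1.0.
targets = stepwise_targets(0.0, 1.0, 4)

# Predictions for the same molecular pair obtained along two different paths.
agree = path_consistency_penalty([1.0, 1.0, 1.0])
disagree = path_consistency_penalty([0.0, 2.0])
```

The penalty is zero when every path yields the same prediction and grows as the paths disagree, which is the behavior a prediction-invariance objective needs. In training, such a term would typically be added to the stepwise regression loss with a weighting coefficient.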
Problem

Research questions and friction points this paper is trying to address.

molecular representation learning
few-shot learning
data scarcity
generalization
structure-property relationship
Innovation

Methods, ideas, or system contributions that make the work stand out.

path-consistent representation
virtual evolutionary paths
few-shot molecular learning
structure-property generalization
topological dependency constraints