Toward Better Geometric Representations for Molecule Generative Models

📅 2026-05-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of representation-conditioned molecule generation: the geometric representation spaces produced by pretrained encoders such as UniMol are non-smooth and are not fully exploited during generative training, which limits both efficiency and quality. To address this, the authors propose the LENSEs framework, which combines three mechanisms: a representation head that extracts multi-level representations from the pretrained encoder, a molecule perceptual loss, and a node-level representation alignment (REPA) loss applied during generative training. According to the authors, the approach smooths the representation space and narrows the semantic gap between pretraining and generation, suggesting generative training with alignment objectives as a potential pretraining paradigm for molecular encoders. On GEOM-DRUG, the method achieves 97.28% validity and 98.51% molecule stability; further analyses report a 4.6× reduction in the Lipschitz constant and more informative representations on QM9 probing tasks.
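To make the smoothness claim concrete, here is a minimal sketch of one common way to estimate a Lipschitz constant empirically: take the largest ratio of output distance to input distance over all pairs of samples. This is an illustration only. The summary does not say which mapping the authors measure or how they estimate the constant, and the function names `euclidean` and `empirical_lipschitz` are hypothetical.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def empirical_lipschitz(inputs, outputs):
    """Largest ||f(x_i) - f(x_j)|| / ||x_i - x_j|| over all sample pairs.

    Assumption: outputs[k] is the image of inputs[k] under the mapping
    being studied. The result is a lower bound on the true Lipschitz
    constant, because only the sampled pairs are checked.
    """
    if len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must have the same length")
    best = 0.0
    for i, j in combinations(range(len(inputs)), 2):
        d_in = euclidean(inputs[i], inputs[j])
        if d_in == 0.0:
            continue  # identical inputs give no usable ratio
        best = max(best, euclidean(outputs[i], outputs[j]) / d_in)
    return best
```

A smaller estimate on the same set of samples suggests a smoother mapping; a reported "4.6× reduction" would correspond to the ratio of two such constants, measured before and after refinement.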

📝 Abstract

Geometric representation-conditioned molecule generation provides an effective paradigm that decouples molecule representation modeling from structure generation. By splitting molecule generation into two stages (first generating a meaningful molecule representation, then generating a 3D molecule conditioned on that representation), the efficiency and quality of the generation process can be significantly enhanced. However, its effectiveness is fundamentally limited by the quality of the representation space: pretrained molecular encoders, such as UniMol, produce representations that are non-smooth and not fully exploited during generative training. In this work, we propose LENSEs, a framework that better exploits the potential of molecule representations in representation-conditioned generation methods. In particular, LENSEs introduces three complementary mechanisms: (1) a representation head, trained jointly with the generative task, that extracts multi-level representations from the pretrained encoder; (2) a molecule perceptual loss that optimizes the generator in a semantically informative representation space; and (3) a node-level representation alignment (REPA) loss that explicitly aligns the generator's hidden states with encoder representations, reducing the semantic gap between pretraining and generation. We demonstrate the effectiveness of these improvements through extensive molecule generation tasks. Specifically, on the challenging molecule generation dataset GEOM-DRUG, LENSEs achieves 97.28% validity and 98.51% molecule stability, surpassing existing advanced methods. Further analyses, namely a 4.6× reduction in the Lipschitz constant and QM9 probing tasks, show that the refined representations are smoother and more informative, establishing generative training with alignment objectives as a potential pretraining paradigm for molecular encoders.
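As an illustration of mechanism (3), the sketch below shows one plausible form of a node-level alignment loss: for each atom, one minus the cosine similarity between the generator's hidden state and the encoder's representation, averaged over atoms. This is an assumption and not the authors' formulation. The abstract does not give the exact loss, any projection layer between the two spaces is omitted, and the names `cosine_similarity` and `node_level_alignment_loss` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def node_level_alignment_loss(hidden_states, encoder_reps):
    """Mean over atoms of (1 - cosine similarity).

    Assumptions: hidden_states[k] and encoder_reps[k] describe the same
    atom and already live in the same vector space (a learned projection,
    if any, has been applied beforehand).
    """
    if len(hidden_states) != len(encoder_reps):
        raise ValueError("one encoder representation is needed per atom")
    if not hidden_states:
        raise ValueError("at least one atom is required")
    total = sum(
        1.0 - cosine_similarity(h, r)
        for h, r in zip(hidden_states, encoder_reps)
    )
    return total / len(hidden_states)
```

Under this form the loss is 0 when every atom's hidden state points in the same direction as its encoder representation and grows toward 2 as they point in opposite directions. In training it would be added to the generative objective with a weighting coefficient, which the abstract does not specify.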

Problem

Research questions and friction points this paper is trying to address.

geometric representation

molecule generative models

representation space

pretrained molecular encoders

representation-conditioned generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

representation-conditioned generation

molecular representation

perceptual loss