GEOM-Drugs Revisited: Toward More Chemically Accurate Benchmarks for 3D Molecule Generation

📅 2025-04-30

🤖 AI Summary

Problem: Research on 3D molecular generation widely relies on the GEOM-Drugs dataset for evaluation, yet current evaluation protocols contain chemical inaccuracies: incorrect valency definitions, bugs in bond order calculations, and reliance on classical force fields that are inconsistent with the reference data. These flaws compromise the chemical validity of the reported metrics. Method: The authors diagnose and correct these flaws with a revised evaluation framework that (i) fixes issues in data preprocessing, (ii) constructs chemically accurate valency tables, and (iii) replaces classical force fields with a geometry and energy benchmark based on GFN2-xTB, a semiempirical quantum-chemical method. Contribution/Results: Retraining and re-evaluating several leading generative models under this framework yields updated performance metrics and indicates that previously reported performance was substantially overestimated. The recommended evaluation methods and GEOM-Drugs processing scripts are publicly released to encourage chemically sound assessment of 3D molecular generation.

📝 Abstract

Deep generative models have shown significant promise in generating valid 3D molecular structures, with the GEOM-Drugs dataset serving as a key benchmark. However, current evaluation protocols suffer from critical flaws, including incorrect valency definitions, bugs in bond order calculations, and reliance on force fields inconsistent with the reference data. In this work, we revisit GEOM-Drugs and propose a corrected evaluation framework: we identify and fix issues in data preprocessing, construct chemically accurate valency tables, and introduce a GFN2-xTB-based geometry and energy benchmark. We retrain and re-evaluate several leading models under this framework, providing updated performance metrics and practical recommendations for future benchmarking. Our results underscore the need for chemically rigorous evaluation practices in 3D molecular generation. Our recommended evaluation methods and GEOM-Drugs processing scripts are available at https://github.com/isayevlab/geom-drugs-3dgen-evaluation.
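
To make the valency point concrete, here is a minimal sketch of a charge-aware valency check. It is not the table or code released with the paper: the name `ALLOWED_VALENCES`, its handful of textbook entries, and the helper functions are all hypothetical, and bonds are assumed to be kekulized so that every bond order is an integer.

```python
# Hypothetical, deliberately tiny valency table for illustration only.
# Keys are (element, formal charge); values are the allowed total valences.
# These are textbook values, NOT the table released with the paper.
ALLOWED_VALENCES = {
    ("H", 0): {1},
    ("C", 0): {4},
    ("N", 0): {3},
    ("N", 1): {4},
    ("O", 0): {2},
    ("O", -1): {1},
}


def atom_valences(n_atoms, bonds):
    """Sum bond orders per atom.

    bonds is a list of (i, j, order) tuples with integer orders
    (assumption: the molecule has been kekulized beforehand).
    """
    totals = [0] * n_atoms
    for i, j, order in bonds:
        totals[i] += order
        totals[j] += order
    return totals


def invalid_atoms(elements, charges, bonds):
    """Return indices of atoms whose valence is not allowed for their
    (element, formal charge) pair. Unknown pairs count as invalid."""
    totals = atom_valences(len(elements), bonds)
    bad = []
    for idx, (element, charge, valence) in enumerate(zip(elements, charges, totals)):
        if valence not in ALLOWED_VALENCES.get((element, charge), set()):
            bad.append(idx)
    return bad


# Nitromethane, CH3-NO2, written with a charge-separated nitro group.
elements = ["C", "H", "H", "H", "N", "O", "O"]
charges = [0, 0, 0, 0, 1, 0, -1]
bonds = [(0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1), (4, 5, 2), (4, 6, 1)]

print(invalid_atoms(elements, charges, bonds))   # charge-aware lookup
print(invalid_atoms(elements, [0] * 7, bonds))   # charge-blind lookup
```

In this toy example the charge-aware lookup accepts every atom, while ignoring formal charges flags the nitrogen and one oxygen of the nitro group. This is one way an imprecise valency definition can mislabel a chemically reasonable molecule; the specific errors the paper corrects may differ.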

Problem

Research questions and friction points this paper is trying to address.

Evaluation protocols for 3D molecular generation benchmarks contain critical flaws

Valency definitions are incorrect and bond order calculations contain bugs in GEOM-Drugs evaluation

Geometry and energy are assessed with force fields that are inconsistent with the reference data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Corrected data preprocessing and valency tables

Introduced a GFN2-xTB-based geometry and energy benchmark

Retrained and re-evaluated several leading models under the updated evaluation framework
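
The summary does not say which quantities the GFN2-xTB benchmark reports. One plausible form, sketched below, compares a generated structure with its relaxed counterpart and records how far the bond lengths and the energy move. The relaxation itself would come from an external GFN2-xTB calculation and is not shown; the function names, the coordinate format, and the choice of metrics are assumptions made for illustration.

```python
import math


def bond_length(coords, i, j):
    """Euclidean distance between atoms i and j; coords is a list of (x, y, z)."""
    return math.dist(coords[i], coords[j])


def mean_bond_length_shift(generated, relaxed, bonds):
    """Mean absolute change in bond length between a generated geometry and
    its relaxed counterpart, in the same units as the coordinates.

    bonds is a list of (i, j) atom-index pairs shared by both geometries.
    """
    if not bonds:
        raise ValueError("at least one bond is required")
    shifts = [
        abs(bond_length(generated, i, j) - bond_length(relaxed, i, j))
        for i, j in bonds
    ]
    return sum(shifts) / len(shifts)


def relaxation_energy(e_generated, e_relaxed):
    """Energy released on relaxation. Both energies are assumed to come from
    the same external method (for example GFN2-xTB) in the same units."""
    return e_generated - e_relaxed


# Toy diatomic: the generated bond is 0.1 units longer than the relaxed one.
generated = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)]
relaxed = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)]
shift = mean_bond_length_shift(generated, relaxed, [(0, 1)])
```

A small shift and a small relaxation energy would suggest that the generated structure already sits near a minimum of the reference method. Evaluating both with the method that produced the reference data avoids the force-field mismatch described in the abstract.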
