MolecularIQ: Characterizing Chemical Reasoning Capabilities Through Symbolic Verification on Molecular Graphs

📅 2026-01-21
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the lack of systematic evaluation for symbolic, verifiable reasoning over molecular graph structures in current chemical large language models. Existing benchmarks often suffer from label bias or information leakage, hindering precise diagnosis of model shortcomings. To bridge this gap, we propose MolecularIQ, the first evaluation framework specifically designed for symbolic reasoning on molecular graphs. By integrating molecular graph representations, symbolic logic verification, and carefully structured reasoning tasks, MolecularIQ establishes a fine-grained benchmark that effectively uncovers systematic failure modes of contemporary models across specific molecular structures and reasoning challenges. This framework provides interpretable diagnostic insights and actionable directions for developing chemical large language models with faithful structural understanding capabilities.

📝 Abstract
A molecule's properties are fundamentally determined by its composition and structure encoded in its molecular graph. Thus, reasoning about molecular properties requires the ability to parse and understand the molecular graph. Large Language Models (LLMs) are increasingly applied to chemistry, tackling tasks such as molecular name conversion, captioning, text-guided generation, and property or reaction prediction. Most existing benchmarks emphasize general chemical knowledge, rely on literature or surrogate labels that risk leakage or bias, or reduce evaluation to multiple-choice questions. We introduce MolecularIQ, a molecular structure reasoning benchmark focused exclusively on symbolically verifiable tasks. MolecularIQ enables fine-grained evaluation of reasoning over molecular graphs and reveals capability patterns that localize model failures to specific tasks and molecular structures. This provides actionable insights into the strengths and limitations of current chemistry LLMs and guides the development of models that reason faithfully over molecular structure.
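
To make "symbolically verifiable" concrete, here is a minimal sketch of what such a task could look like. It is not taken from MolecularIQ: the task (counting independent rings), the graph encoding (atom count plus a list of bonds), and the function names `ring_count` and `verify` are all assumptions chosen for illustration. The point is that the ground truth is computed from the molecular graph itself, so a model's answer can be checked exactly, without literature labels or multiple-choice options.

```python
def ring_count(num_atoms, bonds):
    """Number of independent rings in a molecular graph.

    Uses the cyclomatic number E - V + C, where E is the number of bonds,
    V the number of atoms, and C the number of connected components.
    Hypothetical helper for illustration only.
    """
    parent = list(range(num_atoms))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = num_atoms
    for a, b in bonds:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            components -= 1
    return len(bonds) - num_atoms + components


def verify(model_answer, num_atoms, bonds):
    """Return True if the model's ring count matches the computed value."""
    return model_answer == ring_count(num_atoms, bonds)


# Heavy-atom graph of toluene: a six-membered ring (atoms 0-5)
# with a methyl carbon (atom 6) attached to atom 0.
toluene_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6)]

# Heavy-atom graph of naphthalene: two six-membered rings fused at atoms 4 and 5.
naphthalene_bonds = [
    (0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0),
    (4, 6), (6, 7), (7, 8), (8, 9), (9, 5),
]

print(verify(1, 7, toluene_bonds))       # answer "1 ring" for toluene
print(verify(1, 10, naphthalene_bonds))  # answer "1 ring" for naphthalene
```

Because the check is a deterministic function of the graph, a wrong answer can be attributed to a specific task and a specific structure, which is the kind of failure localization the abstract describes.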
Problem

Research questions and friction points this paper is trying to address.

molecular graph
chemical reasoning
symbolic verification
LLM evaluation
reasoning benchmark
Innovation

Methods, ideas, or system contributions that make the work stand out.

MolecularIQ
symbolic verification
molecular graph reasoning
chemistry LLMs
structure-based evaluation
Christoph Bartmann
ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University, Linz, Austria
Johannes Schimunek
Johannes Kepler University Linz - ELLIS Unit at the LIT AI Lab
Machine Learning
Mykyta Ielanskyi
PhD student, ELLIS Unit Linz
Uncertainty, Language Modeling, Chemoinformatics
Philipp Seidl
Institute for Machine Learning, Johannes Kepler University Linz
Machine Learning, Drug Discovery, ML for Life Sciences
G. Klambauer
ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University, Linz, Austria; Clinical Research Institute Medical Artificial Intelligence, Johannes Kepler University, Linz, Austria
Sohvi Luukkonen
ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University, Linz, Austria