🤖 AI Summary
Current large language models (LLMs) exhibit significant limitations in molecular structure reasoning, particularly in using critical structural features, such as functional groups, to predict molecular properties.
Method: We propose Molecular Structural Reasoning (MSR), the first framework to explicitly incorporate molecular structural sketches into LLM-based reasoning. MSR provides two reasoning paths, one for cases where the target molecule is known and one for cases where it is unknown, integrating SMILES and graph-based structural encodings, structure-aware prompt engineering, and a multi-stage reasoning chain to achieve an interpretable structure-to-language mapping.
Contribution/Results: Evaluated across multiple molecular property prediction and functional group identification tasks, MSR consistently improves accuracy over baseline LLMs. These results empirically validate that explicit structural modeling is both effective and essential for enhancing LLMs’ chemical understanding, bridging a key gap between symbolic chemical knowledge and neural language reasoning.
📝 Abstract
Recently, large language models (LLMs) have shown significant progress, approaching human-level perception. In this work, we demonstrate that despite these advances, LLMs still struggle to reason using molecular structural information. This gap is critical because many molecular properties, including functional groups, depend heavily on such structural details. To address this limitation, we propose an approach that sketches molecular structures for reasoning. Specifically, we introduce the Molecular Structural Reasoning (MSR) framework, which enhances LLMs' molecular understanding by explicitly incorporating key structural features. We present two frameworks, one for scenarios where the target molecule is known and one for scenarios where it is unknown. Through extensive experiments, we verify that MSR improves molecular understanding.