$\text{M}^{2}$LLM: Multi-view Molecular Representation Learning with Large Language Models

📅 2025-08-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing molecular representation methods—such as fingerprints and graph neural networks (GNNs)—lack semantic understanding and contextual reasoning capabilities, limiting their accuracy in predicting complex molecular properties. To address this, we propose a multi-view dynamic fusion framework that systematically integrates three complementary perspectives: molecular structure, task objective, and chemical rules—marking the first such unified approach. Leveraging the prior knowledge and logical reasoning capacity of large language models (LLMs), our method generates high-order, context-aware molecular representations. Specifically, cross-modal embedding encoding and dynamically weighted fusion enable synergistic modeling of structural features, task-directed signals, and chemically grounded constraints. Evaluated on multiple molecular property classification and regression benchmarks, our framework achieves state-of-the-art performance, significantly improving both predictive accuracy and generalization. These results validate the efficacy and scalability of LLM-driven, semantics-enriched molecular representation learning.
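The dynamically weighted fusion described above can be sketched as a gated convex combination of the three view embeddings (structure, task, rules). This is a minimal illustration, not the paper's actual architecture; the names `dynamic_fusion`, `gate_W`, and `gate_b` are hypothetical, and the gating design (a single linear layer scoring each view) is an assumption.

```python
import numpy as np

def dynamic_fusion(views, gate_W, gate_b):
    """Fuse per-view embeddings with softmax gating weights.

    views: list of three (d,) arrays - the structure, task, and rules
    view embeddings. gate_W (3 x 3d) and gate_b (3,) are parameters of
    a hypothetical gating layer that scores each view from the
    concatenated input; the paper's exact fusion module may differ.
    """
    x = np.concatenate(views)            # (3d,) concatenated views
    scores = gate_W @ x + gate_b         # (3,) one score per view
    w = np.exp(scores - scores.max())    # numerically stable softmax
    w /= w.sum()                         # fusion weights sum to 1
    # Weighted sum: a convex combination of the view embeddings
    return sum(wi * vi for wi, vi in zip(w, views))

# Toy example: three 4-dim view embeddings
rng = np.random.default_rng(0)
views = [rng.normal(size=4) for _ in range(3)]
gate_W = rng.normal(size=(3, 12))
gate_b = np.zeros(3)
fused = dynamic_fusion(views, gate_W, gate_b)
print(fused.shape)  # (4,)
```

Because the weights are a softmax, the fused vector stays in the convex hull of the three views, which lets the model lean on whichever perspective is most informative for a given task.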

📝 Abstract
Accurate molecular property prediction is a critical challenge with wide-ranging applications in chemistry, materials science, and drug discovery. Molecular representation methods, including fingerprints and graph neural networks (GNNs), achieve state-of-the-art results by effectively deriving features from molecular structures. However, these methods often overlook decades of accumulated semantic and contextual knowledge. Recent advancements in large language models (LLMs) demonstrate remarkable reasoning abilities and prior knowledge across scientific domains, leading us to hypothesize that LLMs can generate rich molecular representations when guided to reason from multiple perspectives. To address these gaps, we propose $\text{M}^{2}$LLM, a multi-view framework that integrates three perspectives: the molecular structure view, the molecular task view, and the molecular rules view. These views are fused dynamically to adapt to task requirements, and experiments demonstrate that $\text{M}^{2}$LLM achieves state-of-the-art performance on multiple benchmarks across classification and regression tasks. Moreover, we demonstrate that representations derived from LLMs achieve exceptional performance by leveraging two core functionalities: the generation of molecular embeddings through their encoding capabilities and the curation of molecular features through advanced reasoning processes.
Problem

Research questions and friction points this paper is trying to address.

Improving molecular property prediction accuracy
Integrating multi-view molecular representation learning
Leveraging LLMs for molecular feature reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-view molecular representation learning framework
Dynamic fusion of structure, task, and rules views
LLM-based molecular embeddings and feature reasoning