🤖 AI Summary
This study addresses the semantic gap between AI’s computational capabilities and the way researchers express intent in biomolecular design. It tackles three challenges: semantic alignment between natural language and proteins or small molecules, multimodal data integration, and high domain-expertise barriers. To this end, we introduce InstructBioMol, the first instruction-aligned large language model specifically designed for biomolecules. It proposes an any-to-any cross-modal instruction alignment paradigm that enables bidirectional mapping among natural language, linear sequences (FASTA/SMILES), and 3D structural representations. Our architecture integrates graph neural networks, multimodal adapters, and instruction-tuning frameworks to support end-to-end generation of functional enzymes and drug molecules. Experimentally, the designed enzymes achieve an ESP Score of 70.4, making this the only method to surpass the enzyme-substrate interaction threshold of 60.0 recommended by the ESP developer, while generated drug candidates exhibit a 10% improvement in binding affinity. This work establishes a foundational framework for intention-driven, multimodal biomolecular engineering.
📝 Abstract
Understanding and designing biomolecules, such as proteins and small molecules, is central to advancing drug discovery, synthetic biology, and enzyme engineering. Recent breakthroughs in Artificial Intelligence (AI) have revolutionized biomolecular research, achieving remarkable accuracy in biomolecular prediction and design. However, a critical gap remains between AI's computational power and researchers' intuition, a gap that natural language could bridge by aligning molecular complexity with human intentions. Large Language Models (LLMs) have shown potential to interpret human intentions, yet their application to biomolecular research remains nascent due to challenges including specialized knowledge requirements, multimodal data integration, and semantic alignment between natural language and biomolecules. To address these limitations, we present InstructBioMol, a novel LLM designed to bridge natural language and biomolecules through a comprehensive any-to-any alignment of natural language, molecules, and proteins. The model accepts multimodal biomolecules as input and lets researchers articulate design goals in natural language, returning biomolecular outputs that meet precise biological needs. Experimental results demonstrate that InstructBioMol can understand and design biomolecules following human instructions. Notably, it can generate drug molecules with a 10% improvement in binding affinity and design enzymes that achieve an ESP Score of 70.4, making it the only method to surpass the enzyme-substrate interaction threshold of 60.0 recommended by the ESP developer. These results highlight its potential to transform real-world biomolecular research.