Few-shot Molecular Property Prediction: A Survey

2025-10-09
AI Summary
Scarce molecular property annotations severely limit AI applications in drug and materials discovery, which makes few-shot molecular property prediction (FSMPP) a critical challenge. This paper presents the first systematic survey of FSMPP and identifies two fundamental bottlenecks: cross-property generalization, hindered by distribution shifts and weak biochemical correlations among properties, and cross-molecule generalization, undermined by high structural heterogeneity. The survey proposes a multi-level taxonomy covering data, models, and learning paradigms; reviews how graph neural networks, meta-learning, knowledge transfer, and chemical priors are used to keep predictions robust under extreme label scarcity; and summarizes mainstream benchmarks, evaluation protocols, and method performance. It closes with key trends and future directions for low-data molecular AI.

Abstract
AI-assisted molecular property prediction has become a promising technique in early-stage drug discovery and materials design in recent years. However, because wet-lab experiments are costly and complex, real-world molecules usually have scarce annotations, leaving limited labeled data for effective supervised learning. In light of this, few-shot molecular property prediction (FSMPP) has emerged as an expressive paradigm that enables learning from only a few labeled examples. Despite rapidly growing attention, existing FSMPP studies remain fragmented, without a coherent framework to capture methodological advances and domain-specific challenges. In this work, we present the first comprehensive and systematic survey of few-shot molecular property prediction. We begin by analyzing the few-shot phenomenon in molecular datasets and highlighting two core challenges: (1) cross-property generalization under distribution shifts, where each task, corresponding to one property, may follow a different data distribution or even be inherently weakly related to the others from a biochemical perspective, requiring the model to transfer knowledge across heterogeneous prediction tasks; and (2) cross-molecule generalization under structural heterogeneity, where molecules involved in the same or different properties may exhibit significant structural diversity, making it difficult for the model to generalize. Then, we introduce a unified taxonomy that organizes existing methods into data, model, and learning paradigm levels, reflecting their strategies for extracting knowledge from scarce supervision. Next, we compare representative methods and summarize benchmark datasets and evaluation protocols. Finally, we identify key trends and future directions for advancing research on FSMPP.
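The few-shot setting described in the abstract, where each property is a separate task with only a few labeled molecules, is commonly framed as episodic sampling. The sketch below is a minimal illustration under that assumption and is not taken from the survey: the function name `sample_episode`, the 2-way k-shot framing, and the toy task are all hypothetical, and the labels are placeholders rather than real measurements.

```python
import random
from typing import List, Tuple

# A labeled molecule: (SMILES string, binary property label).
Molecule = Tuple[str, int]

# Hypothetical toy task for one property; labels are made-up placeholders.
TOY_TASK: List[Molecule] = [
    ("CCO", 1), ("CCN", 1), ("CO", 1), ("CN", 1),
    ("c1ccccc1", 0), ("CC(=O)O", 0), ("CCCC", 0), ("C1CCCCC1", 0),
]


def sample_episode(
    task: List[Molecule], k_shot: int, n_query: int, rng: random.Random
) -> Tuple[List[Molecule], List[Molecule]]:
    """Draw a 2-way k-shot episode from a single property task.

    The support set holds k_shot positives and k_shot negatives; the query
    set holds up to n_query of the remaining molecules.
    """
    positives = [m for m in task if m[1] == 1]
    negatives = [m for m in task if m[1] == 0]
    if len(positives) < k_shot or len(negatives) < k_shot:
        raise ValueError("not enough labeled molecules for the requested k_shot")

    # Shuffle copies so the caller's task list is left untouched.
    rng.shuffle(positives)
    rng.shuffle(negatives)

    support = positives[:k_shot] + negatives[:k_shot]
    remaining = positives[k_shot:] + negatives[k_shot:]
    rng.shuffle(remaining)
    query = remaining[:n_query]
    return support, query


if __name__ == "__main__":
    support_set, query_set = sample_episode(
        TOY_TASK, k_shot=2, n_query=3, rng=random.Random(0)
    )
    print("support:", support_set)
    print("query:", query_set)
```

In this framing, a model adapts on the support set and is scored on the query set; repeating the draw across many properties is one way to expose the cross-property and cross-molecule variation the abstract describes.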
Problem

Research questions and friction points this paper is trying to address.

Addressing scarce molecular annotations in property prediction
Overcoming cross-property generalization under distribution shifts
Solving cross-molecule generalization under structural heterogeneity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Few-shot learning for molecular property prediction
Unified taxonomy organizing methods into data, model, and learning paradigm levels
Addressing cross-property and cross-molecule generalization challenges
Zeyu Wang
Institute of Cyberspace Security, College of Information Engineering, Zhejiang University of Technology, 310023, Hangzhou, China; Binjiang Institute of Artificial Intelligence, Zhejiang University of Technology, 310056, Hangzhou, China; School of Information and Communication Technology, Griffith University, Southport, QLD 4215, Australia
Tianyi Jiang
Institute of Cyberspace Security, College of Information Engineering, Zhejiang University of Technology, 310023, Hangzhou, China; Binjiang Institute of Artificial Intelligence, Zhejiang University of Technology, 310056, Hangzhou, China
Huanchang Ma
School of Information Science and Technology, Northeast Normal University, Changchun, Jilin, 130117, China
Yao Lu
Institute of Cyberspace Security, College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China; Binjiang Institute of Artificial Intelligence, Zhejiang University of Technology, Hangzhou 310056, China; Centre for Frontier AI Research, Agency for Science, Technology and Research, Singapore 138632
Xiaoze Bao
College of Pharmaceutical Science & Collaborative Innovation Center of Yangtze River Delta Region Green Pharmaceuticals, Zhejiang University of Technology, 310014, Hangzhou, China; Binjiang Institute of Artificial Intelligence, Zhejiang University of Technology, 310056, Hangzhou, China
Shanqing Yu
Institute of Cyberspace Security, College of Information Engineering, Zhejiang University of Technology, 310023, Hangzhou, China; Binjiang Institute of Artificial Intelligence, Zhejiang University of Technology, 310056, Hangzhou, China
Qi Xuan
Professor, Zhejiang University of Technology
AI Security, Social Network, Deep Learning, Data Mining
Shirui Pan
Professor, ARC Future Fellow, FQA, Director of TrustAGI Lab, Griffith University
Data Mining, Machine Learning, Graph Neural Networks, Trustworthy AI, Time Series
Xin Zheng
School of Information and Communication Technology, Griffith University, Southport, QLD 4215, Australia