Few-shot Molecular Property Prediction: A Survey

📅 2025-10-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Scarce molecular property annotations severely limit AI applications in drug and materials discovery, making few-shot molecular property prediction (FSMPP) a critical challenge. This paper presents the first systematic survey of FSMPP and identifies two fundamental bottlenecks: cross-property generalization, which is hindered by distribution shifts and weak biochemical correlations among properties, and cross-molecule generalization, which is undermined by high structural heterogeneity. To address these, the survey proposes a multi-level taxonomy covering data, models, and learning paradigms; integrates graph neural networks, meta-learning, knowledge transfer, and chemical priors to enhance prediction robustness under extreme label scarcity; and unifies mainstream benchmarks, evaluation protocols, and method performance. The result is a research framework for FSMPP and a technical roadmap for progress in low-data molecular AI.

📝 Abstract

AI-assisted molecular property prediction has become a promising technique for early-stage drug discovery and materials design in recent years. However, because wet-lab experiments are costly and complex, real-world molecules usually suffer from scarce annotations, leaving too little labeled data for effective supervised learning. In light of this, few-shot molecular property prediction (FSMPP) has emerged as an expressive paradigm that enables learning from only a few labeled examples. Despite rapidly growing attention, existing FSMPP studies remain fragmented, without a coherent framework to capture methodological advances and domain-specific challenges. In this work, we present the first comprehensive and systematic survey of few-shot molecular property prediction. We begin by analyzing the few-shot phenomenon in molecular datasets and highlighting two core challenges: (1) cross-property generalization under distribution shifts, where each task, corresponding to one property, may follow a different data distribution or even be inherently weakly related to the others from a biochemical perspective, requiring the model to transfer knowledge across heterogeneous prediction tasks, and (2) cross-molecule generalization under structural heterogeneity, where molecules involved in the same or different properties may exhibit significant structural diversity, making it difficult for the model to generalize. Then, we introduce a unified taxonomy that organizes existing methods into data, model, and learning paradigm levels, reflecting their strategies for extracting knowledge from scarce supervision in few-shot molecular property prediction. Next, we compare representative methods and summarize benchmark datasets and evaluation protocols. Finally, we identify key trends and future directions for advancing research on FSMPP.
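
The task formulation in the abstract, where each property is its own prediction task with only a few labeled molecules, can be illustrated with a minimal Python sketch. The support/query split follows common few-shot practice and is an assumption here, as are the `sample_episode` helper, the property names, and the toy labels; none of them are taken from the survey.

```python
import random
from collections import defaultdict

# Toy data, invented for illustration only: (SMILES, property) -> binary label.
# The property names and the labels are hypothetical, not real measurements.
TOY_LABELS = {
    ("CCO", "prop_A"): 1, ("CCN", "prop_A"): 1, ("CCC", "prop_A"): 1,
    ("c1ccccc1", "prop_A"): 0, ("C1CCCCC1", "prop_A"): 0, ("CC(=O)O", "prop_A"): 0,
    ("CCO", "prop_B"): 0, ("c1ccccc1", "prop_B"): 1,
}


def sample_episode(labels, prop, k_shot, n_query, rng):
    """Build one few-shot task (episode) for a single property.

    Returns (support, query), two lists of (smiles, label) pairs.
    The support set holds k_shot molecules per class; the query set holds
    up to n_query further molecules per class, disjoint from the support set.
    """
    # Group the molecules labeled for this property by class.
    by_class = defaultdict(list)
    for (smiles, p), y in labels.items():
        if p == prop:
            by_class[y].append(smiles)

    support, query = [], []
    for y, mols in sorted(by_class.items()):
        mols = sorted(mols)   # fixed order so a seeded rng is reproducible
        rng.shuffle(mols)
        support += [(m, y) for m in mols[:k_shot]]
        query += [(m, y) for m in mols[k_shot:k_shot + n_query]]
    return support, query


if __name__ == "__main__":
    rng = random.Random(0)
    support, query = sample_episode(TOY_LABELS, "prop_A", k_shot=1, n_query=2, rng=rng)
    print("support:", support)
    print("query:", query)
```

In this framing, cross-property generalization means a model adapted on episodes from some properties must still perform well on episodes drawn from a property it has not seen, while cross-molecule generalization concerns the structural gap between support and query molecules within and across episodes.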

Problem

Research questions and friction points this paper is trying to address.

Addressing scarce molecular annotations in property prediction

Overcoming cross-property generalization under distribution shifts

Solving cross-molecule generalization under structural heterogeneity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Few-shot learning for molecular property prediction

Unified taxonomy organizing methods at the data, model, and learning-paradigm levels

Addressing cross-property and cross-molecule generalization challenges
