🤖 AI Summary
This study addresses the challenge of inferring the successional stage of deep-sea cold seeps. Conventional assessment depends on costly manned-submersible operations, and the available microbial dataset is extremely small (n=13 samples, p=26 features), so purely data-driven models overfit severely. To overcome this, we propose the first few-shot classification framework that integrates an ecological knowledge graph, leveraging macrofauna–microbiota coupling relationships and microbial co-occurrence networks to construct structural priors that guide stage inference from microbial abundances alone. Our method employs graph-regularized multinomial logistic regression (GRMLR) with manifold penalties. Macrofauna–microbe associations are incorporated during training, but no macrofaunal observations are required at inference time, which keeps classification biologically consistent and robust. Experiments demonstrate that our approach significantly outperforms standard baselines, achieving both interpretability and scalability even under extreme data scarcity.
📝 Abstract
Deep-sea cold seep stage assessment has traditionally relied on costly, high-risk manned submersible operations and visual surveys of macrofauna. Although microbial communities provide a promising and more cost-effective alternative, reliable inference remains challenging because the available deep-sea dataset is extremely small ($n = 13$ samples) relative to the microbial feature dimension ($p = 26$), making purely data-driven models highly prone to overfitting. To address this, we propose a knowledge-enhanced classification framework that incorporates an ecological knowledge graph as a structural prior. By fusing macrofauna-microbe coupling and microbial co-occurrence patterns, the framework internalizes established ecological logic into a \underline{\textbf{G}}raph-\underline{\textbf{R}}egularized \underline{\textbf{M}}ultinomial \underline{\textbf{L}}ogistic \underline{\textbf{R}}egression (GRMLR) model, constraining the feature space through a manifold penalty to ensure biologically consistent classification. Importantly, the framework removes the need for macrofauna observations at inference time: macrofauna-microbe associations are used only to guide training, whereas prediction relies solely on microbial abundance profiles. Experimental results demonstrate that our approach significantly outperforms standard baselines, highlighting its potential as a robust and scalable framework for deep-sea ecological assessment.
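To make the idea of a graph-regularized multinomial logistic regression more concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state the exact form of the manifold penalty, so the sketch assumes a common choice: a Laplacian penalty $\lambda\,\mathrm{tr}(W^\top L W)$ over a feature graph, which pulls the coefficients of connected microbial taxa toward each other. The adjacency matrix `A`, the function names (`fit_grmlr`, `predict`, and so on), the value of `lam`, and the random data are all hypothetical and only mirror the stated problem size ($n = 13$, $p = 26$).

```python
import numpy as np

def softmax(z):
    # Row-wise softmax with max-subtraction for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def graph_laplacian(A):
    # Unnormalized Laplacian L = D - A of a symmetrized feature graph.
    A = np.maximum(A, A.T)
    return np.diag(A.sum(axis=1)) - A

def augment(X, A):
    # Append an intercept column; pad L with zeros so the intercept is unpenalized.
    n, p = X.shape
    Xa = np.hstack([X, np.ones((n, 1))])
    Lp = np.zeros((p + 1, p + 1))
    Lp[:p, :p] = graph_laplacian(A)
    return Xa, Lp

def objective(X, y, A, W, n_classes, lam):
    # Mean cross-entropy plus the assumed penalty lam * tr(W^T L W).
    Xa, Lp = augment(X, A)
    Y = np.eye(n_classes)[y]
    P = softmax(Xa @ W)
    ce = -np.mean(np.sum(Y * np.log(P + 1e-12), axis=1))
    return ce + lam * np.trace(W.T @ Lp @ W)

def fit_grmlr(X, y, A, n_classes, lam=0.1, n_iter=2000):
    # Plain gradient descent; the objective is convex and smooth.
    Xa, Lp = augment(X, A)
    n = Xa.shape[0]
    Y = np.eye(n_classes)[y]
    W = np.zeros((Xa.shape[1], n_classes))
    # Conservative upper bound on the curvature, used to pick a safe step size.
    curvature = (np.linalg.norm(Xa, 2) ** 2 / n
                 + 2.0 * lam * np.linalg.eigvalsh(Lp).max())
    step = 1.0 / curvature
    for _ in range(n_iter):
        P = softmax(Xa @ W)
        grad = Xa.T @ (P - Y) / n + 2.0 * lam * (Lp @ W)
        W -= step * grad
    return W

def predict(X, W):
    # Prediction uses microbial features only; the graph is not needed here.
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.argmax(Xa @ W, axis=1)

# Synthetic stand-in data (NOT the paper's dataset): 13 samples, 26 taxa, 3 stages.
rng = np.random.default_rng(0)
n, p, K = 13, 26, 3
X = rng.random((n, p))
y = rng.integers(0, K, size=n)
A = (rng.random((p, p)) < 0.1).astype(float)  # hypothetical co-occurrence graph
np.fill_diagonal(A, 0.0)

W = fit_grmlr(X, y, A, n_classes=K, lam=0.1)
pred = predict(X, W)
```

In this sketch the graph enters only through the training penalty, so `predict` takes microbial abundances and the fitted weights and nothing else. That mirrors the abstract's point that ecological associations guide training while inference relies on microbial profiles alone. How the paper actually builds the graph from macrofauna-microbe coupling and co-occurrence patterns is not specified in the abstract and is left out here.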