AI Summary
Traditional large language models are limited in protein function understanding tasks due to fixed knowledge and insufficient generalization capabilities. This work proposes PFUA, a tool-augmented reasoning agent that, for the first time, integrates problem decomposition, tool invocation, and evidence generation into protein function prediction. PFUA leverages domain-specific bioinformatics tools to produce verifiable intermediate evidence, effectively combining external biological priors with interleaved reasoning to overcome the constraints of purely text-based inference. Evaluated across four benchmark datasets, PFUA achieves an average performance improvement of 103% over existing language-model-only approaches, demonstrating significant superiority in predictive accuracy and biological interpretability.
Abstract
Recent advances in large language models (LLMs) have highlighted the effectiveness of chain-of-thought reasoning in symbolic domains such as mathematics and programming. However, our study shows that directly transferring such text-based reasoning paradigms to protein function understanding is ineffective: reinforcement learning mainly amplifies superficial keyword patterns while failing to introduce new biological knowledge, resulting in limited generalization. We argue that protein function prediction is a knowledge-intensive scientific task that fundamentally relies on external biological priors and computational tools rather than purely internal reasoning. To address this gap, we propose PFUA, a tool-augmented protein reasoning agent that unifies problem decomposition, tool invocation, and grounded answer generation. Instead of relying on long unconstrained reasoning traces, PFUA integrates domain-specific tools to produce verifiable intermediate evidence. Experiments on four benchmarks demonstrate that PFUA consistently outperforms text-only reasoning models with an average performance improvement of 103%.