Interleaved Tool-Call Reasoning for Protein Function Understanding

📅 2026-01-07

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large language models that reason only in text are limited in protein function understanding by their fixed internal knowledge and weak generalization. This work proposes PFUA, a tool-augmented reasoning agent that integrates problem decomposition, tool invocation, and grounded answer generation for protein function prediction. PFUA calls domain-specific bioinformatics tools to produce verifiable intermediate evidence, combining external biological priors with interleaved reasoning instead of relying on purely text-based inference. Across four benchmark datasets, PFUA outperforms text-only reasoning models by an average of 103%.

📝 Abstract

Recent advances in large language models (LLMs) have highlighted the effectiveness of chain-of-thought reasoning in symbolic domains such as mathematics and programming. However, our study shows that directly transferring such text-based reasoning paradigms to protein function understanding is ineffective: reinforcement learning mainly amplifies superficial keyword patterns while failing to introduce new biological knowledge, resulting in limited generalization. We argue that protein function prediction is a knowledge-intensive scientific task that fundamentally relies on external biological priors and computational tools rather than purely internal reasoning. To address this gap, we propose PFUA, a tool-augmented protein reasoning agent that unifies problem decomposition, tool invocation, and grounded answer generation. Instead of relying on long unconstrained reasoning traces, PFUA integrates domain-specific tools to produce verifiable intermediate evidence. Experiments on four benchmarks demonstrate that PFUA consistently outperforms text-only reasoning models with an average performance improvement of 103%.
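
The interleaving described in the abstract (decide on a step, call a tool, read the result, decide again) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not PFUA's actual implementation: the tool names `motif_scan` and `hydrophobic_fraction`, the `Evidence` record, and the rule-based `next_tool` policy, which stands in for the language model's reasoning step. The only biological fact used is the well-known P-loop (Walker A) pattern [AG]-x(4)-G-K-[ST].

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Evidence:
    """One verifiable intermediate result: which tool ran and what it reported."""
    tool: str
    finding: str


def motif_scan(seq: str) -> str:
    # Toy stand-in for a motif/domain search tool (hypothetical, not a real API).
    # Searches for the P-loop (Walker A) pattern [AG]-x(4)-G-K-[ST].
    match = re.search(r"[AG].{4}GK[ST]", seq)
    if match:
        return f"P-loop motif at position {match.start() + 1}"
    return "no P-loop motif found"


def hydrophobic_fraction(seq: str) -> str:
    # Toy stand-in for a physicochemical property tool (hypothetical).
    frac = sum(aa in "AVILMFWY" for aa in seq) / len(seq)
    return f"hydrophobic fraction {frac:.2f}"


# Registry mapping tool names to callables.
TOOLS: Dict[str, Callable[[str], str]] = {
    "motif_scan": motif_scan,
    "hydrophobic_fraction": hydrophobic_fraction,
}


def next_tool(evidence: List[Evidence]) -> Optional[str]:
    """Hypothetical policy standing in for the model's reasoning step.

    It reads the evidence gathered so far and picks the next tool,
    or returns None when it decides no further tool call is needed.
    """
    used = {e.tool for e in evidence}
    if "motif_scan" not in used:
        return "motif_scan"
    if "no P-loop" in evidence[-1].finding and "hydrophobic_fraction" not in used:
        return "hydrophobic_fraction"
    return None


def run_agent(seq: str, max_steps: int = 5) -> List[Evidence]:
    """Interleave reasoning (next_tool) with tool calls, collecting evidence."""
    evidence: List[Evidence] = []
    for _ in range(max_steps):
        tool_name = next_tool(evidence)
        if tool_name is None:
            break
        evidence.append(Evidence(tool_name, TOOLS[tool_name](seq)))
    return evidence


def grounded_answer(evidence: List[Evidence]) -> str:
    """Compose an answer that cites every tool result it relies on."""
    return "; ".join(f"[{e.tool}] {e.finding}" for e in evidence)


if __name__ == "__main__":
    print(grounded_answer(run_agent("MKVGPSGAGKSTLLR")))
    print(grounded_answer(run_agent("MLLLVVVAAA")))
```

The point of the sketch is the control flow: each tool result is stored as explicit evidence and feeds the next decision, so the final answer can be traced back to tool outputs instead of a long free-form reasoning trace.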

Problem

Research questions and friction points this paper is trying to address.

protein function understanding

chain-of-thought reasoning

knowledge-intensive task

biological priors

generalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

tool-augmented reasoning

protein function prediction

knowledge-intensive reasoning