CT-Flow: Orchestrating CT Interpretation Workflow with Model Context Protocol Servers

📅 2026-02-23
🤖 AI Summary
This work addresses a limitation of existing 3D CT analysis methods: they predominantly rely on static, single-pass inference and fail to emulate the dynamic, tool-assisted interpretation process used by radiologists. To bridge this gap, the authors propose CT-Flow, the first agent framework based on the Model Context Protocol (MCP). It introduces a tool-aware dynamic workflow paradigm that decomposes complex natural language queries into executable multi-step sequences of tool invocations, enabling integration with clinical tools. They further construct CT-FlowBench, the first instruction-tuning benchmark tailored for 3D CT tool usage, which integrates large vision-language models, 3D segmentation, and radiomics. Experiments show that the approach achieves state-of-the-art performance on both CT-FlowBench and standard 3D visual question answering benchmarks, surpassing baseline methods by 41% in diagnostic accuracy and attaining a 95% success rate in automated tool invocation.

📝 Abstract
Recent advances in Large Vision-Language Models (LVLMs) have shown strong potential for multi-modal radiological reasoning, particularly in tasks like diagnostic visual question answering (VQA) and radiology report generation. However, most existing approaches to 3D CT analysis rely on static, single-pass inference. In practice, clinical interpretation is a dynamic, tool-mediated workflow where radiologists iteratively review slices and use measurement, radiomics, and segmentation tools to refine findings. To bridge this gap, we propose CT-Flow, an agentic framework designed for interoperable volumetric interpretation. By leveraging the Model Context Protocol (MCP), CT-Flow shifts from closed-box inference to an open, tool-aware paradigm. We curate CT-FlowBench, the first large-scale instruction-tuning benchmark tailored for 3D CT tool-use and multi-step reasoning. Built upon this, CT-Flow functions as a clinical orchestrator capable of decomposing complex natural language queries into automated tool-use sequences. Experimental evaluations on CT-FlowBench and standard 3D VQA datasets demonstrate that CT-Flow achieves state-of-the-art performance, surpassing baseline models by 41% in diagnostic accuracy and achieving a 95% success rate in autonomous tool invocation. This work provides a scalable foundation for integrating autonomous, agentic intelligence into real-world clinical radiology.
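
To make the idea of decomposing a query into a tool-use sequence concrete, here is a minimal Python sketch. It is not the authors' implementation: the tool names (`segment_organ`, `measure_volume`), their return values, the `PLAN` structure, and the `$prev.<key>` reference convention are all hypothetical and chosen only for illustration. The one detail taken from MCP itself is the message shape, a JSON-RPC 2.0 request with method `tools/call` carrying a tool `name` and `arguments`. In a real system the plan would come from the model and the calls would go to MCP servers rather than local functions.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-ins for clinical tools. Real tools would run behind
# MCP servers; these return placeholder data and perform no image analysis.
def segment_organ(volume_id: str, organ: str) -> Dict[str, Any]:
    return {"mask_id": f"{volume_id}:{organ}:mask"}

def measure_volume(mask_id: str) -> Dict[str, Any]:
    return {"mask_id": mask_id, "volume_ml": None}  # no real measurement

TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "segment_organ": segment_organ,
    "measure_volume": measure_volume,
}

def make_tool_call(request_id: int, name: str,
                   arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request in the shape MCP uses to invoke a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

def run_plan(plan: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Execute steps in order; '$prev.<key>' pulls a value from the prior result."""
    results: List[Dict[str, Any]] = []
    for request_id, step in enumerate(plan, start=1):
        arguments = {}
        for key, value in step["arguments"].items():
            if isinstance(value, str) and value.startswith("$prev."):
                arguments[key] = results[-1][value[len("$prev."):]]
            else:
                arguments[key] = value
        request = make_tool_call(request_id, step["name"], arguments)
        # Local dispatch stands in for sending the request to an MCP server.
        tool = TOOLS[request["params"]["name"]]
        results.append(tool(**request["params"]["arguments"]))
    return results

# A hypothetical plan for a query such as "What is the liver volume?"
PLAN = [
    {"name": "segment_organ",
     "arguments": {"volume_id": "ct001", "organ": "liver"}},
    {"name": "measure_volume",
     "arguments": {"mask_id": "$prev.mask_id"}},
]

if __name__ == "__main__":
    for result in run_plan(PLAN):
        print(result)
```

The sketch keeps planning and execution separate: the plan is plain data, so the same executor could run any sequence the model proposes, and each step's output can feed the next.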
Problem

Research questions and friction points this paper is trying to address.

3D CT interpretation
clinical workflow
tool-mediated reasoning
multi-step reasoning
radiology automation
Innovation

Methods, ideas, or system contributions that make the work stand out.

CT-Flow
Model Context Protocol
tool-aware reasoning
3D CT interpretation
agentic workflow