🤖 AI Summary
Digital pathology analysis depends on specialized expertise and complex, time-intensive workflows, which limits its accessibility. To address these challenges, we propose NOVA, an agentic framework for computational pathology that translates scientific queries into executable analysis pipelines through multi-step reasoning, ad hoc tool creation, and iterative Python code generation and execution. NOVA integrates 49 domain-specific tools built on open-source software (e.g., nuclei segmentation, whole-slide image encoding). To evaluate such systems, we introduce SlideQuest, a 90-question benchmark, verified by pathologists and biomedical scientists, that spans data processing, quantitative analysis, and hypothesis testing. Unlike benchmarks focused on knowledge recall or diagnostic QA, it requires multi-step reasoning and iterative coding. On SlideQuest, NOVA outperforms coding-agent baselines. A pathologist-verified case study links morphology to prognostically relevant PAM50 subtypes, illustrating the framework's potential for scalable discovery.
📝 Abstract
Digitized histopathology analysis involves complex, time-intensive workflows and specialized expertise, limiting its accessibility. We introduce NOVA, an agentic framework that translates scientific queries into executable analysis pipelines by iteratively generating and running Python code. NOVA integrates 49 domain-specific tools (e.g., nuclei segmentation, whole-slide encoding) built on open-source software, and can also create new tools ad hoc. To evaluate such systems, we present SlideQuest, a 90-question benchmark, verified by pathologists and biomedical scientists, spanning data processing, quantitative analysis, and hypothesis testing. Unlike prior biomedical benchmarks focused on knowledge recall or diagnostic QA, SlideQuest demands multi-step reasoning, iterative coding, and computational problem solving. Quantitative evaluation shows that NOVA outperforms coding-agent baselines, and a pathologist-verified case study links morphology to prognostically relevant PAM50 subtypes, demonstrating its potential for scalable discovery.
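The abstract describes NOVA as iteratively generating and running Python code. The sketch below illustrates that general generate-execute-feedback pattern only; it is not NOVA's implementation. Every name in it (`run_snippet`, `agent_loop`, `scripted_generator`, the `FINAL:` marker) is hypothetical, and the language model is replaced by a scripted stand-in so the example runs on its own.

```python
import contextlib
import io
import traceback
from typing import Callable, List, Optional, Tuple

History = List[Tuple[str, str]]  # (code, observed output) pairs


def run_snippet(code: str, namespace: dict) -> Tuple[bool, str]:
    """Run code in a shared namespace; return (succeeded, stdout or traceback)."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)  # no sandboxing here; illustration only
        return True, buffer.getvalue()
    except Exception:
        return False, buffer.getvalue() + traceback.format_exc()


def agent_loop(
    query: str,
    generate_code: Callable[[str, History], str],
    max_steps: int = 5,
) -> Optional[str]:
    """Generate code, execute it, and feed the output back until an answer appears."""
    namespace: dict = {}  # variables persist across steps
    history: History = []
    for _ in range(max_steps):
        code = generate_code(query, history)
        ok, output = run_snippet(code, namespace)
        history.append((code, output))
        # Assumed convention: the generated code prints "FINAL:" before its answer.
        if ok and "FINAL:" in output:
            return output.split("FINAL:", 1)[1].strip()
    return None  # step budget exhausted without an answer


def scripted_generator(query: str, history: History) -> str:
    """Stand-in for a language model: a buggy first draft, then a corrected one."""
    if not history:
        return "print('FINAL:', mean(areas))"  # fails: mean and areas are undefined
    return "areas = [10.0, 20.0, 30.0]\nprint('FINAL:', sum(areas) / len(areas))"


answer = agent_loop("What is the mean nucleus area?", scripted_generator)
print(answer)  # 20.0
```

The first draft raises a `NameError`, the traceback is recorded in the history, and the second draft succeeds. A real system would pass that history back to a model and would execute generated code in an isolated environment rather than with a bare `exec`.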