NOVA: An Agentic Framework for Automated Histopathology Analysis and Discovery

📅 2025-11-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Digital pathology analysis relies heavily on expert knowledge and involves complex, time-intensive workflows, which limits its accessibility and efficiency. To address these challenges, we propose NOVA, an agent framework for computational pathology capable of multi-step reasoning, ad hoc tool creation, and Python code execution. NOVA integrates 49 domain-specific tools built on open-source software (e.g., nuclei segmentation, whole-slide image encoding) to construct end-to-end analysis pipelines adaptively. To evaluate code-based reasoning in pathology, we introduce SlideQuest, a 90-question benchmark that requires generating and executing code across data processing, quantitative analysis, and hypothesis testing. On SlideQuest, NOVA outperforms coding-agent baselines. A pathologist-verified case study links morphological features to prognostically relevant PAM50 molecular subtypes, illustrating the framework's potential for automated biomedical discovery.

📝 Abstract
Digitized histopathology analysis involves complex, time-intensive workflows and specialized expertise, limiting its accessibility. We introduce NOVA, an agentic framework that translates scientific queries into executable analysis pipelines by iteratively generating and running Python code. NOVA integrates 49 domain-specific tools (e.g., nuclei segmentation, whole-slide encoding) built on open-source software, and can also create new tools ad hoc. To evaluate such systems, we present SlideQuest, a 90-question benchmark -- verified by pathologists and biomedical scientists -- spanning data processing, quantitative analysis, and hypothesis testing. Unlike prior biomedical benchmarks focused on knowledge recall or diagnostic QA, SlideQuest demands multi-step reasoning, iterative coding, and computational problem solving. Quantitative evaluation shows NOVA outperforms coding-agent baselines, and a pathologist-verified case study links morphology to prognostically relevant PAM50 subtypes, demonstrating its scalable discovery potential.
Problem

Research questions and friction points this paper is trying to address.

Automating complex histopathology workflows requiring specialized expertise
Translating scientific queries into executable computational analysis pipelines
Enabling scalable discovery through multi-step reasoning and iterative coding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Agentic framework translates queries into executable pipelines
Integrates 49 domain-specific tools and creates new ones
Outperforms coding-agent baselines in quantitative evaluation
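
The generate-and-execute pattern described above can be illustrated with a minimal sketch. This is not NOVA's implementation: the tool registry, the `count_nuclei` toy tool, the `FINAL:` stopping convention, and the `scripted_model` stand-in for a language model are all hypothetical names chosen for illustration. The sketch only shows the general idea of feeding execution output (including errors) back into the next code-generation step.

```python
import contextlib
import io
import traceback
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tool registry; NOVA's real tools are not reproduced here.
TOOLS: Dict[str, Callable] = {}


def register_tool(fn: Callable) -> Callable:
    """Make a function callable by name from generated code."""
    TOOLS[fn.__name__] = fn
    return fn


@register_tool
def count_nuclei(mask: List[List[int]]) -> int:
    """Toy stand-in for a segmentation tool: counts non-zero mask pixels."""
    return sum(1 for row in mask for value in row if value)


def run_code(code: str, namespace: dict) -> Tuple[bool, str]:
    """Execute code and return (succeeded, captured stdout or traceback)."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
        return True, buffer.getvalue()
    except Exception:
        return False, traceback.format_exc()


def agent_loop(
    query: str,
    generate_code: Callable[[str, list], str],
    max_steps: int = 5,
) -> Tuple[Optional[str], list]:
    """Alternate between generating code and running it until an answer appears."""
    namespace = dict(TOOLS)  # shared state persists across steps
    history: List[Tuple[str, str]] = []
    for _ in range(max_steps):
        code = generate_code(query, history)
        ok, output = run_code(code, namespace)
        history.append((code, output))
        # Assumed convention: the generated code prints "FINAL:" when done.
        if ok and "FINAL:" in output:
            return output.split("FINAL:", 1)[1].strip(), history
    return None, history


def scripted_model(query: str, history: list) -> str:
    """Stand-in for a language model: the first attempt fails, the second fixes it."""
    if not history:
        return "print(count_nuclei(mask))"  # fails: `mask` is undefined
    return "mask = [[0, 1], [1, 1]]\nprint('FINAL:', count_nuclei(mask))"


answer, history = agent_loop("How many nuclei pixels are in the mask?", scripted_model)
```

Because the namespace is shared across steps, any function defined by generated code remains available to later steps, which is one simple way an agent could reuse tools it creates on the fly. Running untrusted generated code with `exec` is unsafe outside a sandbox.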
Authors

Anurag Vaidya (MIT): Multimodal frontier models, agents, clinical deployment
Felix Meissen (Microsoft Research): Machine Learning and Computer Vision for Medical Applications
Daniel C. Castro (Microsoft Health Futures, Cambridge, UK)
Shruthi Bannur (Microsoft Research): Machine Learning, Deep Learning, Computer Vision, Natural Language Processing
Tristan Lazard (Microsoft Health Futures, Cambridge, UK)
Drew F. K. Williamson (Emory University, Atlanta, USA)
Faisal Mahmood (Associate Professor, Harvard University)
Javier Alvarez-Valle (Microsoft Health Futures, Cambridge, UK)
Stephanie L. Hyland (Microsoft Health Futures, Cambridge, UK)
Kenza Bouzid (Microsoft Research): Machine Learning, Computer Vision