🤖 AI Summary
Existing approaches to pathology report generation struggle to align fine-grained diagnostic statements with localized visual evidence, and they often lack controllability and verifiability. To address these limitations, this work proposes an agent-based framework that emulates how pathologists selectively examine multiple fields of view. Guided by a user-defined diagnostic checklist, the framework combines a critique mechanism with text-image semantic retrieval to re-localize diagnostically relevant regions, so the generated report can be refined iteratively. The resulting reports are clinically meaningful and comprehensive, can be controlled according to user needs, and offer more complete diagnostic detail with traceable supporting visual evidence.
📝 Abstract
Recent methods for pathology report generation from whole-slide images (WSIs) can produce slide-level diagnostic descriptions but fail to ground fine-grained statements in localized visual evidence. Furthermore, they offer no control over which diagnostic details to include or how to verify them. Inspired by emerging agentic analysis paradigms and by the diagnostic workflow of pathologists, who selectively examine multiple fields of view, we propose QCAgent, an agentic framework for quality-controllable WSI report generation. Its core innovations are: (i) a customized critique mechanism guided by a user-defined checklist that specifies the required diagnostic details and constraints; and (ii) re-identification of informative regions in the WSI based on the critique feedback and text-patch semantic retrieval, a process that iteratively enriches and reconciles the report. Experiments demonstrate that, by making report requirements explicitly prompt-defined, constraint-aware, and verifiable through evidence-grounded refinement, QCAgent enables controllable generation of clinically meaningful, high-coverage pathology reports from WSIs.
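The critique-and-retrieval loop described above can be illustrated with a minimal sketch. This is not the QCAgent implementation: the function names (`critique`, `retrieve_patches`, `refine_report`), the toy two-dimensional embeddings, and the `describe` stand-in for a report-writing model are all assumptions made for illustration. Here the critique simply checks which checklist items are still missing, and retrieval ranks patches by cosine similarity to the embedding of each missing item.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve_patches(query_vec, patch_vecs, k=1):
    """Return the ids of the k patches most similar to the query embedding."""
    ranked = sorted(patch_vecs,
                    key=lambda pid: cosine(query_vec, patch_vecs[pid]),
                    reverse=True)
    return ranked[:k]

def critique(report, checklist):
    """Hypothetical critique: list the checklist items the report lacks."""
    return [item for item in checklist if item not in report]

def refine_report(checklist, item_vecs, patch_vecs, describe, max_rounds=3):
    """Critique, retrieve evidence, revise; stop when nothing is missing."""
    report = {}
    for _ in range(max_rounds):
        missing = critique(report, checklist)
        if not missing:
            break
        for item in missing:
            patches = retrieve_patches(item_vecs[item], patch_vecs)
            # Each statement keeps the patch ids that support it.
            report[item] = {"statement": describe(item, patches),
                            "evidence": patches}
    return report

# Toy inputs (assumed): two checklist items and two patch embeddings.
checklist = ["tumor grade", "mitotic activity"]
item_vecs = {"tumor grade": [1.0, 0.0], "mitotic activity": [0.0, 1.0]}
patch_vecs = {"patch_a": [0.9, 0.1], "patch_b": [0.2, 0.8]}

def describe(item, patches):
    """Stand-in for a model that writes a statement from retrieved patches."""
    return f"{item}: assessed from {', '.join(patches)}"

report = refine_report(checklist, item_vecs, patch_vecs, describe)
```

Storing the retrieved patch ids next to each statement is what makes the report traceable in this sketch; a real system would replace the membership test in `critique` with a model-based check against the checklist's constraints.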