PathReasoner-R1: Instilling Structured Reasoning into Pathology Vision-Language Model via Knowledge-Guided Policy Optimization

📅 2026-01-29
🤖 AI Summary
This work addresses the lack of verifiable, structured reasoning in existing pathology vision-language models, which limits their clinical credibility and makes their errors hard for experts to correct. To this end, we introduce PathReasoner, the first large-scale whole-slide image reasoning dataset, along with the PathReasoner-R1 model. By integrating a medical knowledge graph into training through knowledge-guided policy optimization, our approach aligns diagnostic predictions with explicit pathological reasoning. Key innovations include trajectory-masked supervised fine-tuning, reasoning-oriented reinforcement learning, and a reward function that combines entity-level and multi-granularity knowledge-aware signals to enhance logical consistency. Experiments demonstrate that PathReasoner-R1 achieves state-of-the-art performance on both the PathReasoner dataset and multiple public pathology benchmarks, yielding more transparent and clinically trustworthy reasoning across multi-scale histopathological images.

📝 Abstract
Vision-Language Models (VLMs) are advancing computational pathology with superior visual understanding capabilities. However, current systems often reduce diagnosis to directly outputting conclusions without verifiable, evidence-linked reasoning, which severely limits clinical trust and hinders error correction by experts. To address these barriers, we construct PathReasoner, the first large-scale dataset for whole-slide image (WSI) reasoning. Unlike previous work that relies on unverified distillation, we develop a rigorous knowledge-guided generation pipeline. By leveraging medical knowledge graphs, we explicitly align structured pathological findings and clinical reasoning with diagnoses, generating over 20K high-quality instructional samples. Building on this dataset, we propose PathReasoner-R1, which combines trajectory-masked supervised fine-tuning with reasoning-oriented reinforcement learning to instill structured chain-of-thought capabilities. To ensure medical rigor, we design a knowledge-aware, multi-granular reward function that incorporates an Entity Reward mechanism strictly aligned with knowledge graphs. This guides the model to optimize for logical consistency rather than mere outcome matching, thereby enhancing robustness. Extensive experiments demonstrate that PathReasoner-R1 achieves state-of-the-art performance on both PathReasoner and public benchmarks across various image scales, equipping pathology models with transparent, clinically grounded reasoning capabilities. Dataset and code are available at https://github.com/cyclexfy/PathReasoner-R1.
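The abstract does not specify how the reward is computed, so the following is only a minimal sketch of how a multi-granular reward with an entity-level term might be combined. The function names (`entity_f1`, `combined_reward`), the three components, the F1 overlap measure, and the weights are all hypothetical and are not taken from the paper or its repository.

```python
def entity_f1(predicted: set[str], reference: set[str]) -> float:
    """F1 overlap between entities named in the model's reasoning and
    reference entities (e.g. drawn from a knowledge graph). Hypothetical."""
    if not predicted or not reference:
        return 0.0
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


def combined_reward(
    pred_entities: set[str],
    kg_entities: set[str],
    pred_diagnosis: str,
    true_diagnosis: str,
    well_formed: bool,
    weights: tuple[float, float, float] = (0.2, 0.4, 0.4),  # assumed values
) -> float:
    """Weighted sum of format, entity-level, and outcome rewards.

    The three granularities and their weights are illustrative assumptions.
    """
    w_format, w_entity, w_outcome = weights
    format_reward = 1.0 if well_formed else 0.0
    entity_reward = entity_f1(pred_entities, kg_entities)
    same = pred_diagnosis.strip().lower() == true_diagnosis.strip().lower()
    outcome_reward = 1.0 if same else 0.0
    return (
        w_format * format_reward
        + w_entity * entity_reward
        + w_outcome * outcome_reward
    )
```

Under these assumptions, a response with the correct diagnosis but reasoning that shares few entities with the knowledge graph scores lower than one whose reasoning is also grounded, which is the intuition behind rewarding logical consistency rather than outcome matching alone.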
Problem

Research questions and friction points this paper is trying to address.

Vision-Language Models
Computational Pathology
Structured Reasoning
Clinical Trust
Evidence-linked Reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge-Guided Policy Optimization
Structured Reasoning
Vision-Language Model
Medical Knowledge Graph
Reinforcement Learning
Songhan Jiang
Harbin Institute of Technology (Shenzhen)
Fengchun Liu
Harbin Institute of Technology (Shenzhen)
Ziyue Wang
National University of Singapore
Computer Vision, Medical Image Analysis, LLM Agents
Linghan Cai
Harbin Institute of Technology (Shenzhen), Technical University of Dresden
Yongbing Zhang
Harbin Institute of Technology (Shenzhen)
video processing and communication, computational imaging