GLeVE: Graph-Guided Lesion Grounding with Proposal Verification in 3D CT

📅 2026-05-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses inaccurate lesion localization caused by the semantic-spatial gap between radiology report text and 3D CT images, proposing a graph-guided, lesion-level alignment framework. The method treats each lesion description as an atomic semantic unit and uses relation-aware graph reasoning to encode organ affiliation, attributes, and inter-lesion relationships, producing discriminative lesion-wise queries. Anatomy-aware proposal generation with region-level verification enforces one-to-one correspondence between textual descriptions and lesions, and a hierarchical, octree-based autoregressive strategy progressively refines lesion boundaries. Experiments on AbdomenAtlas 3.0 show consistent gains over classical multimodal foundation models and report-supervised baselines in both segmentation accuracy and lesion-level localization.
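To make the one-to-one correspondence constraint concrete, the following is a minimal sketch, not the paper's verification mechanism. It assumes a hypothetical similarity matrix `scores` between lesion queries and region proposals, and uses a brute-force search over assignments (a swapped-in illustration; the source does not say how the matching is solved). The function name `one_to_one_match` and the numbers are invented for the example.

```python
from itertools import permutations


def one_to_one_match(scores):
    """Assign each lesion query to a distinct proposal, maximising total score.

    scores[i][j] is a hypothetical similarity between query i and proposal j.
    Brute force over permutations: only practical for a handful of lesions.
    """
    n_queries = len(scores)
    n_proposals = len(scores[0]) if scores else 0
    if n_queries > n_proposals:
        raise ValueError("need at least as many proposals as queries")

    best_assignment, best_total = [], float("-inf")
    # Each permutation picks n_queries distinct proposals, one per query.
    for perm in permutations(range(n_proposals), n_queries):
        total = sum(scores[i][j] for i, j in enumerate(perm))
        if total > best_total:
            best_assignment, best_total = list(perm), total
    return best_assignment, best_total


# Illustrative scores (assumed): both queries prefer proposal 0 in isolation.
scores = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.30],
]
# Independent per-query argmax lets two descriptions claim the same region.
greedy = [max(range(len(row)), key=row.__getitem__) for row in scores]
assignment, total = one_to_one_match(scores)
print(greedy)      # [0, 0]
print(assignment)  # [1, 0]
```

Matching each description independently maps both queries to proposal 0, while the constrained assignment gives each description its own region. For realistic lesion counts a polynomial-time assignment solver would replace the brute-force loop.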

📝 Abstract

Grounding radiology report descriptions to 3D CT volumes is essential for verifiable clinical interpretation, yet remains challenging due to the semantic-spatial gap between free-text narratives and volumetric anatomy. Existing report-assisted and vision-language grounding methods typically rely on phrase-level alignment or dense pixel supervision, resulting in limited lesion-wise correspondence and suboptimal localization accuracy. We propose GLeVE, a graph-guided lesion grounding framework with anatomical prior verification and octree-based autoregressive refinement. GLeVE treats each lesion description as an atomic semantic unit and encodes organ attribution, attributes, and inter-lesion relations through relation-aware graph reasoning to produce discriminative lesion-wise queries. Anatomy-aware proposal generation with region-level verification enforces one-to-one text-lesion alignment, while hierarchical octree refinement progressively improves boundary delineation. Experiments on AbdomenAtlas 3.0 demonstrate consistent gains over classical multimodal foundation models and report-supervised baselines in both segmentation accuracy and lesion-level localization.
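The hierarchical octree idea can be sketched as a coarse-to-fine subdivision in which only cells that straddle the lesion boundary are split further. This is an illustration under stated assumptions, not GLeVE's autoregressive refinement: the per-voxel predicate `inside` stands in for whatever the model predicts, and `refine_octree`, the cubic power-of-two volume, and the toy lesion are all hypothetical.

```python
def refine_octree(origin, size, inside):
    """Return leaf cells as (origin, size, is_lesion) tuples.

    origin: (x, y, z) corner of a cubic cell; size: edge length (power of two).
    inside: hypothetical predicate inside(x, y, z) -> bool for a single voxel.
    Uniform cells become leaves; mixed cells are split into eight children.
    """
    x0, y0, z0 = origin
    votes = [
        inside(x, y, z)
        for x in range(x0, x0 + size)
        for y in range(y0, y0 + size)
        for z in range(z0, z0 + size)
    ]
    if all(votes):
        return [(origin, size, True)]
    if not any(votes):
        return [(origin, size, False)]

    # Mixed cell: it contains boundary voxels, so refine at the next level.
    # A cell of size 1 is always uniform, so the recursion terminates.
    half = size // 2
    leaves = []
    for dx in (0, half):
        for dy in (0, half):
            for dz in (0, half):
                child = (x0 + dx, y0 + dy, z0 + dz)
                leaves.extend(refine_octree(child, half, inside))
    return leaves


def box_lesion(x, y, z):
    """Toy lesion (assumed): fills exactly one octant of an 8x8x8 volume."""
    return x < 4 and y < 4 and z < 4


leaves = refine_octree((0, 0, 0), 8, box_lesion)
lesion_voxels = sum(size ** 3 for _, size, is_lesion in leaves if is_lesion)
print(len(leaves), lesion_voxels)  # 8 64
```

Because the toy lesion aligns with an octant, one split suffices and the 512 voxels are described by 8 cells. An irregular lesion would trigger deeper subdivision only near its surface, which is why the representation concentrates resolution on boundaries.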

Problem

Research questions and friction points this paper is trying to address.

lesion grounding

3D CT

radiology report

semantic-spatial gap

clinical interpretation

Innovation

Methods, ideas, or system contributions that make the work stand out.

graph-guided grounding

anatomical prior verification

octree-based refinement