🤖 AI Summary
Detecting 3D lymph nodes (LNs) in CT is difficult: LNs are scattered and low in contrast, and slice-based detectors model inter-slice consistency weakly and rely heavily on post-processing. This paper proposes LN-Tracker, a novel end-to-end framework that formulates 3D LN detection as an autoregressive tracking task along the z-axis. Its core innovations include: (1) decoupling detection and tracking queries; (2) introducing mask-guided cross-slice attention; (3) designing an inter-slice similarity contrastive loss; and (4) integrating multi-scale features. Built upon an enhanced DETR architecture, LN-Tracker jointly optimizes detection and instance association. Evaluated on four LN datasets, it achieves a gain of at least 2.7% in average sensitivity over other top 3D/2.5D detectors. It also generalizes to public lung nodule and prostate tumor detection, achieving top performance on both tasks. According to the abstract, the datasets will be released upon acceptance.
📝 Abstract
Lymph node (LN) assessment is an essential task in the routine radiology workflow, providing valuable insights for cancer staging, treatment planning and beyond. Identifying scattered, low-contrast LNs in 3D CT scans is highly challenging, even for experienced clinicians. Previous lesion and LN detection methods demonstrate the effectiveness of 2.5D approaches (i.e., using a 2D network with multi-slice inputs), which leverage pretrained 2D model weights and show improved accuracy compared with separate 2D or 3D detectors. However, slice-based 2.5D detectors do not explicitly model inter-slice consistency for an LN as a 3D object. They therefore require heuristic post-merging steps to generate the final 3D LN instances, which can involve tuning a set of parameters for each dataset. In this work, we formulate 3D LN detection as a tracking task and propose LN-Tracker, a novel LN tracking transformer, for joint end-to-end detection and 3D instance association. Built upon a DETR-based detector, LN-Tracker decouples the transformer decoder's queries into track and detection groups, where the track queries autoregressively follow previously tracked LN instances along the z-axis of a CT scan. We design a new transformer decoder with a masked attention module that aligns the track queries' content with the context of the current slice while preserving the detection queries' high accuracy in the current slice. An inter-slice similarity loss is introduced to encourage cohesive LN association between slices. Extensive evaluation on four lymph node datasets shows LN-Tracker's superior performance, with a gain of at least 2.7% in average sensitivity compared with other top 3D/2.5D detectors. Further validation on public lung nodule and prostate tumor detection tasks confirms the generalizability of LN-Tracker, as it achieves top performance on both tasks. Datasets will be released upon acceptance.
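
To make the "heuristic post-merging" step concrete, here is a minimal Python sketch of the kind of baseline that slice-based 2.5D detectors typically need. It is not LN-Tracker and is not taken from the paper: it is a generic greedy IoU linker written for illustration. The names `merge_slices`, `Instance3D` and `iou_thresh` are hypothetical, and the default threshold of 0.3 is an arbitrary assumption.

```python
from dataclasses import dataclass, field

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned 2D boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


@dataclass(eq=False)  # compare instances by identity, not by contents
class Instance3D:
    """A 3D instance assembled from per-slice 2D boxes (hypothetical helper)."""
    boxes: dict[int, Box] = field(default_factory=dict)  # slice index z -> box

    @property
    def last_z(self) -> int:
        return max(self.boxes)


def merge_slices(per_slice: list[list[Box]], iou_thresh: float = 0.3) -> list[Instance3D]:
    """Greedily link 2D boxes on adjacent slices into 3D instances.

    `iou_thresh` is the hand-tuned parameter: a box extends an existing
    instance only if its IoU with that instance's box on slice z-1 reaches it.
    """
    instances: list[Instance3D] = []
    for z, boxes in enumerate(per_slice):
        # Only instances seen on the previous slice can be extended.
        open_instances = [inst for inst in instances if inst.last_z == z - 1]
        for box in boxes:
            best, best_iou = None, iou_thresh
            for inst in open_instances:
                score = iou(inst.boxes[z - 1], box)
                if score >= best_iou:
                    best, best_iou = inst, score
            if best is None:
                best = Instance3D()          # no match: start a new 3D instance
                instances.append(best)
            else:
                open_instances.remove(best)  # one box per instance per slice
            best.boxes[z] = box
    return instances


# Toy input: three slices, two objects that each span two slices.
toy_slices = [
    [(10, 10, 20, 20)],
    [(11, 11, 21, 21), (50, 50, 60, 60)],
    [(52, 52, 62, 62)],
]
```

On `toy_slices`, `merge_slices(toy_slices)` yields two 3D instances, while `merge_slices(toy_slices, iou_thresh=0.8)` splits the same detections into four. That sensitivity to a hand-set threshold is the per-dataset tuning burden the abstract describes, and which an end-to-end tracking formulation aims to avoid.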