🤖 AI Summary
This work addresses the limitations of existing chest X-ray AI diagnostic benchmarks, which are predominantly single-center and closed-set and therefore fail to reflect the long-tailed distribution of known pathologies and the unseen rare diseases encountered in real-world clinical practice. To bridge this gap, the CXR-LT 2026 challenge introduces a large-scale, multi-center dataset of over 145,000 chest X-ray images drawn from PadChest and NIH Chest X-ray, and establishes two core tasks: multi-label classification of 30 known pathologies and open-world generalization to six previously unseen rare diseases. Model robustness is thus evaluated jointly under long-tailed and zero-shot settings in realistic multi-center conditions. The winning solutions achieve a mean average precision (mAP) of 0.5854 on Task 1 and 0.4315 on Task 2, indicating that large-scale vision-language pretraining substantially mitigates the performance drop typically associated with zero-shot diagnosis, a step toward generalizable, clinically viable AI diagnostic systems.
📝 Abstract
Chest X-ray (CXR) interpretation is hindered by the long-tailed distribution of pathologies and the open-world nature of clinical environments. Existing benchmarks often rely on closed-set classes from single institutions, failing to capture the prevalence of rare diseases or the appearance of novel findings. To address this, we present the CXR-LT 2026 challenge. This third iteration of the benchmark introduces a multi-center dataset comprising over 145,000 images from the PadChest and NIH Chest X-ray datasets. The challenge defines two core tasks: (1) Robust Multi-Label Classification on 30 known classes and (2) Open-World Generalization to 6 unseen (out-of-distribution) rare disease classes. We report the results of the top-performing teams, evaluated by mean Average Precision (mAP), AUROC, and F1-score. The winning solutions achieved an mAP of 0.5854 on Task 1 and 0.4315 on Task 2, demonstrating that large-scale vision-language pre-training significantly mitigates the performance drop typically associated with zero-shot diagnosis.
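For readers unfamiliar with the headline metric, the following is a minimal sketch of how a macro-averaged mAP can be computed for a multi-label classifier: average precision (AP) is computed per class from the ranked prediction scores, and the per-class values are then averaged with equal weight, so rare classes count as much as common ones. The function names, the toy labels, and the tie-breaking rule (ties keep input order) are assumptions made for illustration; the challenge's official evaluation code may differ in such details.

```python
from typing import List, Sequence


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AP for one class: mean of precision@k taken at the rank k of each positive."""
    # Rank samples from highest to lowest score (stable sort: ties keep input order).
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if y_true[idx] == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    if hits == 0:
        raise ValueError("AP is undefined for a class with no positive samples")
    return precision_sum / hits


def mean_average_precision(
    labels: List[Sequence[int]], scores: List[Sequence[float]]
) -> float:
    """Macro mAP: unweighted mean of per-class AP over all classes."""
    n_classes = len(labels[0])
    per_class = [
        average_precision([row[c] for row in labels], [row[c] for row in scores])
        for c in range(n_classes)
    ]
    return sum(per_class) / n_classes


# Toy example (hypothetical data): 4 images, 2 classes.
labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
scores = [[0.9, 0.2], [0.8, 0.7], [0.3, 0.6], [0.1, 0.4]]
print(round(mean_average_precision(labels, scores), 4))
```

In this toy case the first class has AP 5/6 (its positives sit at ranks 1 and 3) and the second has AP 1.0, giving a macro mAP of 11/12. Because every class contributes equally, a model that performs well only on frequent findings scores poorly, which is why the metric suits long-tailed benchmarks.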