GazeXPErT: An Expert Eye-tracking Dataset for Interpretable and Explainable AI in Oncologic FDG-PET/CT Scans

📅 2026-02-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the shortage of clinical experts and the limited interpretability and workflow integration of existing AI models in oncologic FDG-PET/CT image analysis. To bridge this gap, we present the first 4D dataset integrating high-frequency (60 Hz) eye-tracking data with 346 FDG-PET/CT scans, synchronously capturing radiologists' visual search trajectories and lesion annotations; the data are structured in COCO format and comprise 9,030 gaze-to-lesion correspondence sequences. This resource enables research in visual grounding, intention prediction, and interpretable modeling. Experimental results demonstrate that incorporating expert gaze information improves nnUNet's segmentation Dice score from 0.6008 to 0.6819, and that a Vision Transformer model places 74.95% of predicted fixations closer to tumors while achieving an intention-prediction accuracy of 67.53% (AUROC 0.747).
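The summary describes COCO-format annotations with gaze-to-lesion sequences but does not show the schema. As a minimal sketch, assuming standard COCO `images`/`annotations` arrays plus a hypothetical per-annotation `fixations` field (the file name is also illustrative), pairing lesions with their gaze tracks could look like this:

```python
# Minimal sketch of reading GazeXPErT's COCO-style annotations.
# The file name and the "fixations" field are illustrative
# assumptions; the released schema may differ.
import json
from collections import defaultdict

with open("gazexpert_annotations.json") as f:
    coco = json.load(f)

# Standard COCO layout: image records and annotations linked by image id.
images = {img["id"]: img for img in coco["images"]}
anns_by_image = defaultdict(list)
for ann in coco["annotations"]:
    anns_by_image[ann["image_id"]].append(ann)

# Pair each lesion annotation with the gaze track that (hypothetically)
# accompanies it, e.g. a list of (t, x, y) samples recorded at 60 Hz.
for image_id, anns in anns_by_image.items():
    slice_name = images[image_id]["file_name"]
    for ann in anns:
        bbox = ann["bbox"]                 # COCO convention: [x, y, w, h]
        track = ann.get("fixations", [])   # assumed per-lesion gaze samples
        print(slice_name, bbox, len(track))
```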

📝 Abstract
[18F]FDG-PET/CT is a cornerstone imaging modality for tumor staging and treatment response assessment across many cancer types, yet expert reader shortages necessitate more efficient diagnostic aids. While standalone AI models for automatic lesion segmentation exist, clinical translation remains hindered by concerns about interpretability, explainability, reliability, and workflow integration. We present GazeXPErT, a 4D eye-tracking dataset capturing expert search patterns during tumor detection and measurement on 346 FDG-PET/CT scans. Each study was read by a trainee and a board-certified nuclear medicine or radiology specialist using an eye-tracking-enabled annotation platform that simulates routine clinical reads. From 3,948 minutes of raw 60 Hz eye-tracking data, 9,030 unique gaze-to-lesion trajectories were extracted, synchronized with PET/CT image slices, and rendered in COCO-style format for multiple machine learning applications. Baseline validation experiments demonstrate that a 3D nnUNet tumor segmentation model achieves superior performance when incorporating expert gaze patterns (Dice score 0.6819 versus 0.6008 without gaze), and that vision transformers trained on sequential gaze and PET/CT images can improve dynamic lesion localization (74.95% of predicted gaze points closer to the tumor) and expert intention prediction (accuracy 67.53%, AUROC 0.747). GazeXPErT is a resource designed to support machine learning problems beyond these baseline experiments, including, but not limited to, visual grounding and causal reasoning, clinically explainable feature augmentation, human-computer interaction, human intention prediction and understanding, and expert gaze-rewarded modeling approaches to AI in oncologic FDG-PET/CT imaging.
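The abstract does not specify how gaze is supplied to the segmentation model. One common approach, sketched below as an assumption rather than the authors' pipeline, is to rasterize fixations into a Gaussian-smoothed heatmap volume and stack it with PET and CT as an extra input channel; the `gaze_heatmap` helper and all shapes are illustrative:

```python
# Illustrative sketch (not the authors' pipeline): turn expert fixations
# into a Gaussian heatmap volume usable as an extra input channel for a
# 3D segmentation model such as nnUNet.
import numpy as np
from scipy.ndimage import gaussian_filter

def gaze_heatmap(shape, fixations, sigma=3.0):
    """shape: (z, y, x) volume size; fixations: iterable of (z, y, x) voxels."""
    heat = np.zeros(shape, dtype=np.float32)
    for z, y, x in fixations:
        if 0 <= z < shape[0] and 0 <= y < shape[1] and 0 <= x < shape[2]:
            heat[z, y, x] += 1.0               # accumulate fixation counts
    heat = gaussian_filter(heat, sigma=sigma)  # spread counts spatially
    return heat / (heat.max() + 1e-8)          # normalize to [0, 1]

# Placeholder PET/CT volumes; real data would be loaded from the dataset.
pet = np.zeros((128, 256, 256), dtype=np.float32)
ct = np.zeros_like(pet)
gaze = gaze_heatmap(pet.shape, [(64, 120, 130), (70, 100, 90)])
model_input = np.stack([pet, ct, gaze])        # (channels, z, y, x)
```

Treating gaze as just another image channel keeps the segmentation architecture unchanged, which is one plausible reading of the reported Dice improvement, though the paper may use a different integration scheme.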
Problem

Research questions and friction points this paper is trying to address.

interpretability
explainability
FDG-PET/CT
expert eye-tracking
clinical AI
Innovation

Methods, ideas, or system contributions that make the work stand out.

eye-tracking
explainable AI
FDG-PET/CT
gaze-guided learning
clinical interpretability
Joy T Wu
Division of Nuclear Medicine, Department of Radiology, Stanford University/Stanford Health Care, Stanford, CA 94305, USA
Daniel Beckmann
University of Münster, Computer Vision and Machine Learning Systems, Heisenbergstraße 2, Münster 48149, Germany
Sarah Miller
University of Michigan, Ross School of Business
Health economics
Alexander Lee
Professor of Political Science, University of Rochester
Elizabeth Theng
Division of Nuclear Medicine, Department of Radiology, Stanford University/Stanford Health Care, Stanford, CA 94305, USA
Stephan Altmayer
Division of Nuclear Medicine, Department of Radiology, Stanford University/Stanford Health Care, Stanford, CA 94305, USA
Ken Chang
Stanford University
Machine Learning, Medical Imaging, Distributed Learning
David Kersting
Division of Nuclear Medicine, Department of Radiology, Stanford University/Stanford Health Care, Stanford, CA 94305, USA; Department of Nuclear Medicine, Essen University Hospital, West German Cancer Center, University of Duisburg-Essen, Germany
Tomoaki Otani
Division of Nuclear Medicine, Department of Radiology, Stanford University/Stanford Health Care, Stanford, CA 94305, USA; Department of Diagnostic Imaging and Nuclear Medicine, Kyoto University, Japan
Brittany Z Dashevsky
Division of Nuclear Medicine, Department of Radiology, Stanford University/Stanford Health Care, Stanford, CA 94305, USA
Hye Lim Park
Division of Nuclear Medicine, Department of Radiology, Stanford University/Stanford Health Care, Stanford, CA 94305, USA; Division of Nuclear Medicine, Department of Radiology, Eunpyeong St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea
Matteo Novello
Division of Nuclear Medicine, Department of Radiology, Stanford University/Stanford Health Care, Stanford, CA 94305, USA
Kip Guja
Division of Nuclear Medicine, Department of Radiology, Stanford University/Stanford Health Care, Stanford, CA 94305, USA
Curtis Langlotz
Division of Nuclear Medicine, Department of Radiology, Stanford University/Stanford Health Care, Stanford, CA 94305, USA
Ismini Lourentzou
Assistant Professor, University of Illinois Urbana - Champaign
Machine Learning, Natural Language Processing, Computer Vision
Daniel Gruhl
Google
Artificial Intelligence, Text Analytics, Steganography, Human-in-the-Loop
Benjamin Risse
Faculty of Mathematics & Computer Science, University of Münster, Germany
Computer Vision, Machine Learning, Ecology, Additive Manufacturing, Biomedical Image Processing
Guido A Davidzon
Division of Nuclear Medicine, Department of Radiology, Stanford University/Stanford Health Care, Stanford, CA 94305, USA