Interpreting Radiologist's Intention from Eye Movements in Chest X-ray Diagnosis

📅 2025-07-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing models fail to capture the disease-directed diagnostic intent implicitly encoded in radiologists' eye-tracking trajectories during chest X-ray interpretation. To address this, the authors propose RadGazeIntent, which explicitly models diagnostic intent in medical image analysis using a spatiotemporal joint Transformer architecture that maps fine-grained gaze sequences to coarse-grained, semantically meaningful representations of diagnostic intent. From existing medical eye-tracking data they derive three intention-labeled subsets reflecting distinct annotation paradigms: RadSeq (Systematic Sequential Search), RadExplore (Uncertainty-driven Exploration), and RadHybrid (Hybrid Pattern). Their analysis characterizes the distinct oculomotor patterns associated with each diagnostic intent. Experiments show that RadGazeIntent outperforms baseline methods at predicting which findings radiologists are examining at specific moments, localizing and classifying targeted pathologies during dynamic reading. The work points toward interpretable, intent-aware AI-assisted diagnosis in radiology.

📝 Abstract
Radiologists rely on eye movements to navigate and interpret medical images. A trained radiologist possesses knowledge about the potential diseases that may be present in the images and, when searching, follows a mental checklist to locate them using their gaze. This is a key observation, yet existing models fail to capture the underlying intent behind each fixation. In this paper, we introduce a deep learning-based approach, RadGazeIntent, designed to model this behavior: having an intention to find something and actively searching for it. Our transformer-based architecture processes both the temporal and spatial dimensions of gaze data, transforming fine-grained fixation features into coarse, meaningful representations of diagnostic intent to interpret radiologists' goals. To capture the nuances of radiologists' varied intention-driven behaviors, we process existing medical eye-tracking datasets to create three intention-labeled subsets: RadSeq (Systematic Sequential Search), RadExplore (Uncertainty-driven Exploration), and RadHybrid (Hybrid Pattern). Experimental results demonstrate RadGazeIntent's ability to predict which findings radiologists are examining at specific moments, outperforming baseline methods across all intention-labeled datasets.
Problem

Research questions and friction points this paper is trying to address.

Model radiologists' diagnostic intent from eye movements
Predict findings examined during chest X-ray diagnosis
Classify gaze patterns into intention-driven search behaviors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep learning models radiologists' gaze intent
Transformer processes spatial-temporal gaze data
Three intention-labeled datasets enhance prediction
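The core idea the card describes, a Transformer processing spatial-temporal gaze data and turning per-fixation features into intent predictions, can be illustrated with a minimal NumPy sketch. Everything below (the embedding size, the (x, y, duration) fixation encoding, the single attention layer, the intent head) is a hypothetical toy construction, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product attention over the fixation sequence."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores, axis=-1) @ V

# Toy gaze sequence: 12 fixations, each a hypothetical (x, y, duration) triple
T, d, n_intents = 12, 16, 4
fixations = rng.normal(size=(T, 3))
W_embed = rng.normal(size=(3, d))
X = fixations @ W_embed            # fine-grained per-fixation features

# One attention layer mixes temporal context across fixations
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
H = self_attention(X, Wq, Wk, Wv)

# Per-fixation intent distribution (e.g. which finding is being examined)
W_out = rng.normal(size=(d, n_intents))
intent_probs = softmax(H @ W_out, axis=-1)
print(intent_probs.shape)  # (12, 4): one intent distribution per fixation
```

The sketch mirrors the paper's stated mapping from fine-grained fixations to coarse intent labels: attention lets each fixation's prediction depend on the whole scanpath, and the output head assigns every moment of the reading to a candidate finding.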