IMACT-CXR - An Interactive Multi-Agent Conversational Tutoring System for Chest X-Ray Interpretation

📅 2025-11-19
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Medical interns face challenges in concurrently developing spatial localization, visual attention, knowledge integration, and diagnostic reasoning skills during chest X-ray (CXR) interpretation training. Method: We propose the first multi-agent instructional framework—built on AutoGen—that integrates gaze tracking, anatomical segmentation (TensorFlow U-Net), and Bayesian knowledge tracing. The system incorporates NV-Reason-CXR-3B multimodal reasoning, PubMed real-time literature retrieval, REFLACX case matching, and safety-aware prompting to deliver context-aware, dynamic tutoring and personalized feedback. Contribution/Results: Our innovation lies in jointly leveraging eye-tracking data, pixel-level lobar segmentation, and cognitive state modeling to drive adaptive pedagogical strategies, enabling precise skill assessment and sub-second response latency. Experiments demonstrate statistically significant improvements over baselines in lesion localization accuracy (+12.4%) and diagnostic reasoning quality (+18.7%), with validated clinical deployability and rigorous information leakage control.

Technology Category

Application Category

📝 Abstract
IMACT-CXR is an interactive multi-agent conversational tutor that helps trainees interpret chest X-rays by unifying spatial annotation, gaze analysis, knowledge retrieval, and image-grounded reasoning in a single AutoGen-based workflow. The tutor simultaneously ingests learner bounding boxes, gaze samples, and free-text observations. Specialized agents evaluate localization quality, generate Socratic coaching, retrieve PubMed evidence, suggest similar cases from REFLACX, and trigger NV-Reason-CXR-3B for vision-language reasoning when mastery remains low or the learner explicitly asks. Bayesian Knowledge Tracing (BKT) maintains skill-specific mastery estimates that drive both knowledge reinforcement and case similarity retrieval. A lung-lobe segmentation module derived from a TensorFlow U-Net enables anatomically aware gaze feedback, and safety prompts prevent premature disclosure of ground-truth labels. We describe the system architecture, implementation highlights, and integration with the REFLACX dataset for real DICOM cases. IMACT-CXR demonstrates responsive tutoring flows with bounded latency, precise control over answer leakage, and extensibility toward live residency deployment. Preliminary evaluation shows improved localization and diagnostic reasoning compared to baselines.
Problem

Research questions and friction points this paper is trying to address.

Develops interactive tutoring system for chest X-ray interpretation training
Integrates spatial annotation, gaze analysis and knowledge retrieval in workflow
Improves medical trainees' localization and diagnostic reasoning skills
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent conversational tutoring with AutoGen workflow
Bayesian Knowledge Tracing for skill mastery estimation
Lung-lobe segmentation enables anatomically aware gaze feedback
🔎 Similar Papers
No similar papers found.