CliniDial: A Naturally Occurring Multimodal Dialogue Dataset for Team Reflection in Action During Clinical Operation

📅 2025-06-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Empirical modeling of teamwork in clinical operations remains underexplored, and existing methods struggle with the multimodal, highly interdependent, label-imbalanced interactions that occur in operating-room settings. Method: We introduce CliniDial, a naturally occurring multimodal dialogue dataset collected from simulations of medical operations. It comprises audio with transcriptions, video from two camera angles, and simulated physiological signals from the patient manikins, all annotated with behavior codes from an existing teamwork framework. Contribution/Results: Experiments target three characteristics of the data: label imbalance, rich and natural interactions, and multiple modalities. Results show that existing LLMs are significantly challenged by data with these characteristics. The codebase is open-sourced to support future work on methods for real-world clinical data.

📝 Abstract

In clinical operations, teamwork can be the crucial factor that determines the final outcome, and prior studies have shown that sufficient collaboration is key to a successful operation. To understand how teams practice teamwork during an operation, we collected CliniDial from simulations of medical operations. CliniDial includes audio data and its transcriptions, the simulated physiological signals of the patient manikins, and video of the team operating, recorded from two camera angles. We annotate CliniDial with behavior codes following an existing framework to understand the teamwork process. We pinpoint three main characteristics of our dataset (label imbalance, rich and natural interactions, and multiple modalities) and conduct experiments to test existing LLMs' capabilities in handling data with these characteristics. Experimental results show that CliniDial poses significant challenges to existing models, inviting future effort on developing methods that can deal with real-world clinical data. We open-source the codebase at https://github.com/MichiganNLP/CliniDial

Problem

Research questions and friction points this paper is trying to address.

Analyzes teamwork dynamics in clinical operations using multimodal data

Evaluates LLMs on handling imbalanced, natural, and multimodal clinical datasets

Addresses challenges in processing real-world clinical dialogue and behavior data
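
To illustrate why label imbalance makes evaluation harder, here is a minimal sketch contrasting accuracy with macro-averaged F1 on a toy label set. The behavior codes `inform` and `reflect`, the label counts, and the helper names are all hypothetical; they are not taken from the paper's annotation scheme or codebase, and the paper's actual metrics may differ.

```python
def per_class_f1(gold, pred):
    """Return a dict mapping each label to its F1 score."""
    labels = sorted(set(gold) | set(pred))
    scores = {}
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # A label that is never predicted correctly gets an F1 of 0.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def accuracy(gold, pred):
    """Fraction of utterances whose predicted code matches the gold code."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1, so rare codes count as much as common ones."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores)


# Hypothetical imbalanced annotations: 8 frequent codes, 2 rare ones.
gold = ["inform"] * 8 + ["reflect"] * 2
# A majority-class baseline that never predicts the rare code.
pred = ["inform"] * 10

print(f"accuracy = {accuracy(gold, pred):.2f}")
print(f"macro-F1 = {macro_f1(gold, pred):.2f}")
```

In this toy case the majority-class baseline looks strong on accuracy while its macro-F1 is much lower, because the rare code is never recovered. That gap is one reason imbalanced behavior codes are a meaningful stress test for a model.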

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal dataset combining audio, transcriptions, simulated patient physiological signals, and video from two camera angles

Annotated teamwork behavior codes

Tests existing LLMs on naturally occurring dialogue from simulated clinical operations
