VitalBench: A Rigorous Multi-Center Benchmark for Long-Term Vital Sign Prediction in Intraoperative Care

📅 2025-11-14

🤖 AI Summary

Long-term intraoperative vital sign forecasting faces three obstacles: no standardized benchmark, incomplete clinical data, and little cross-center validation. To address these, we introduce VitalBench, a standardized multi-center benchmark for intraoperative vital sign prediction built from over 4,000 surgical cases at two independent medical centers. It defines three evaluation tracks: forecasting from complete data, robustness to simulated missing data, and cross-center generalization. A masked loss reduces reliance on preprocessing and improves robustness to clinically realistic missingness patterns. Deep learning time-series models are trained and evaluated end to end on the multi-center data. The benchmark provides a unified platform for comparing models under consistent data handling, with the goal of supporting generalizable intraoperative prediction models.

📝 Abstract

Intraoperative monitoring and prediction of vital signs are critical for ensuring patient safety and improving surgical outcomes. Despite recent advances in deep learning models for medical time-series forecasting, several challenges persist, including the lack of standardized benchmarks, incomplete data, and limited cross-center validation. To address these challenges, we introduce VitalBench, a novel benchmark specifically designed for intraoperative vital sign prediction. VitalBench includes data from over 4,000 surgeries across two independent medical centers, offering three evaluation tracks: complete data, incomplete data, and cross-center generalization. This framework reflects the real-world complexities of clinical practice, minimizing reliance on extensive preprocessing and incorporating masked loss techniques for robust and unbiased model evaluation. By providing a standardized and unified platform for model development and comparison, VitalBench enables researchers to focus on architectural innovation while ensuring consistency in data handling. This work lays the foundation for advancing predictive models for intraoperative vital sign forecasting, ensuring that these models are not only accurate but also robust and adaptable across diverse clinical environments. Our code and data are available at https://github.com/XiudingCai/VitalBench.

Problem

Research questions and friction points this paper is trying to address.

Standardizing benchmarks for intraoperative vital sign prediction

Addressing incomplete data and cross-center validation challenges

Enabling robust model evaluation across diverse clinical environments
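As a rough illustration of what an incomplete-data track involves, the sketch below hides a fraction of the values in a complete vital sign window and returns the observation mask alongside it. This is not the VitalBench procedure: the function name `simulate_missing`, the uniform random missingness, and the `missing_rate` parameter are all assumptions made for illustration, and real clinical missingness is usually more structured (for example, whole channels dropping out for a period).

```python
import numpy as np

def simulate_missing(series, missing_rate, seed=0):
    """Hide values uniformly at random (hypothetical helper, not from VitalBench).

    series: array of shape (time_steps, n_signals) with complete observations.
    missing_rate: probability in [0, 1] that any single value is hidden.
    Returns (corrupted, mask): corrupted has NaN where values were hidden,
    and mask is True where a value is still observed.
    """
    rng = np.random.default_rng(seed)
    series = np.asarray(series, dtype=float)
    # rng.random draws from [0, 1), so a value is kept with
    # probability 1 - missing_rate.
    mask = rng.random(series.shape) >= missing_rate
    corrupted = np.where(mask, series, np.nan)
    return corrupted, mask

# Example: a 6-step window of two made-up signals with about 30% hidden.
window = np.array([[72, 98], [74, 97], [73, 98], [75, 96], [76, 97], [74, 98]])
corrupted, mask = simulate_missing(window, missing_rate=0.3, seed=42)
```

Returning the mask together with the corrupted window keeps the ground truth about which positions are observed, which a model or loss can then use directly instead of relying on imputation.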

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-center benchmark for intraoperative vital sign prediction

Three evaluation tracks: complete data, incomplete data, and cross-center generalization

Masked loss techniques for robust model evaluation
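The masked loss idea can be sketched as follows: errors are computed only at positions where a ground-truth value was actually observed, so missing targets neither need to be imputed nor contribute to the score. This is a minimal sketch under assumptions, not the paper's implementation: the function name `masked_mse`, the choice of mean squared error, and the convention that missing targets are stored as NaN are all illustrative.

```python
import numpy as np

def masked_mse(pred, target, mask):
    """Mean squared error over observed positions only (illustrative sketch).

    pred, target, mask: arrays of the same shape.
    mask is 1 (or True) where the target was observed, 0 where it is missing.
    Target values at masked-out positions are ignored and may be NaN.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    mask = np.asarray(mask, dtype=float)

    # Replace unobserved targets with 0 so NaNs cannot leak into the sum.
    safe_target = np.where(mask > 0, target, 0.0)
    squared_error = (pred - safe_target) ** 2 * mask

    n_observed = mask.sum()
    if n_observed == 0:
        # Assumption: with nothing observed, report zero loss.
        return 0.0
    return float(squared_error.sum() / n_observed)

# Example: one target is missing (NaN) and is excluded from the average.
pred = [[1.0, 2.0], [3.0, 4.0]]
target = [[1.0, np.nan], [5.0, 4.0]]
mask = [[1, 0], [1, 1]]
loss = masked_mse(pred, target, mask)  # (0 + 4 + 0) / 3
```

Dividing by the number of observed positions rather than the total number of positions keeps the loss comparable between windows with different amounts of missing data.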