OpenUS: A Fully Open-Source Foundation Model for Ultrasound Image Analysis via Self-Adaptive Masked Contrastive Learning

📅 2025-11-14

📈 Citations: 0

✨ Influential: 0

career value

217K/year

🤖 AI Summary

Ultrasound imaging faces significant challenges—including operator dependency, substantial variability across devices and anatomical domains, prominent speckle noise, and scarce expert annotations—severely limiting the generalizability and label efficiency of AI models. To address these, we propose the first fully open-source, reproducible ultrasound foundation model. Our method introduces an adaptive masking framework that jointly leverages teacher-guided attention and student reconstruction loss to dynamically emphasize clinically salient regions; integrates contrastive learning with masked image modeling; adopts a Vision Mamba backbone to capture long-range spatial dependencies; and employs a progressive dynamic learning schedule to enhance pretraining robustness. Evaluated on our newly curated, publicly available ultrasound dataset of 308,000 images—the largest to date—the model demonstrates exceptional cross-domain transfer performance across multiple anatomical sites and disease tasks, enabling highly label-efficient downstream fine-tuning. The code, pretrained weights, and dataset are fully open-sourced.

Technology Category

Application Category

📝 Abstract

Ultrasound (US) is one of the most widely used medical imaging modalities, thanks to its low cost, portability, real-time feedback, and absence of ionizing radiation. However, US image interpretation remains highly operator-dependent and varies significantly across anatomical regions, acquisition protocols, and device types. These variations, along with unique challenges such as speckle, low contrast, and limited standardized annotations, hinder the development of generalizable, label-efficient ultrasound AI models. In this paper, we propose OpenUS, the first reproducible, open-source ultrasound foundation model built on a large collection of public data. OpenUS employs a vision Mamba backbone, capturing both local and global long-range dependencies across the image. To extract rich features during pre-training, we introduce a novel self-adaptive masking framework that combines contrastive learning with masked image modeling. This strategy integrates the teacher's attention map with student reconstruction loss, adaptively refining clinically-relevant masking to enhance pre-training effectiveness. OpenUS also applies a dynamic learning schedule to progressively adjust the difficulty of the pre-training process. To develop the foundation model, we compile the largest to-date public ultrasound dataset comprising over 308K images from 42 publicly available datasets, covering diverse anatomical regions, institutions, imaging devices, and disease types. Our pre-trained OpenUS model can be easily adapted to specific downstream tasks by serving as a backbone for label-efficient fine-tuning. Code is available at https://github.com/XZheng0427/OpenUS.

Problem

Research questions and friction points this paper is trying to address.

Developing generalizable AI models for ultrasound image analysis across diverse conditions

Addressing operator dependency and variability in ultrasound interpretation

Overcoming limited annotations and unique ultrasound imaging challenges

Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision Mamba backbone captures long-range dependencies

Self-adaptive masking combines contrastive and reconstruction learning

Dynamic learning schedule adjusts pre-training difficulty progressively

🔎 Similar Papers

No similar papers found.