🤖 AI Summary
To address large cumulative drift and degraded accuracy under complex motion trajectories in freehand 3D ultrasound reconstruction, this paper proposes an external-tracking-free multimodal self-supervised framework, MoNetV2. Methodologically, we design a sensor-based temporal and multibranch structure (TMS) that fuses ultrasound images with inertial measurement unit (IMU) data from a velocity perspective; introduce an online multilevel consistency constraint (MCC) that jointly models motion consistency at the scan, path, and patch levels; and devise an online multimodal self-supervised strategy (MSS) to reduce cumulative error and enhance generalization. Evaluated on three large datasets, our method achieves 3.2–5.8% improvements in PSNR/SSIM and reduces cumulative pose error by 41%, demonstrating robustness to variable scanning velocities and diverse acquisition tactics. To the best of our knowledge, it is the first end-to-end framework to explicitly model cross-scale motion consistency without external tracking.
📝 Abstract
Three-dimensional ultrasound (US) aims to provide sonographers with the spatial relationships of anatomical structures and plays a crucial role in clinical diagnosis. Recently, deep-learning-based freehand 3-D US has made significant advances: it reconstructs volumes by estimating transformations between images without external tracking. However, image-only reconstruction struggles to reduce cumulative drift and to further improve reconstruction accuracy, particularly in scenarios involving complex motion trajectories. In this context, we propose an enhanced motion network (MoNetV2) to improve the accuracy and generalizability of reconstruction under diverse scanning velocities and tactics. First, we propose a sensor-based temporal and multibranch structure (TMS) that fuses image and motion information from a velocity perspective to improve image-only reconstruction accuracy. Second, we devise an online multilevel consistency constraint (MCC) that exploits the inherent consistency of scans to handle various scanning velocities and tactics. This constraint combines scan-level velocity consistency (SVC), path-level appearance consistency (PAC), and patch-level motion consistency (PMC) to supervise interframe transformation estimation. Third, we distill an online multimodal self-supervised strategy (MSS) that leverages the correlation between network estimates and motion information to further reduce cumulative errors. Extensive experiments demonstrate that MoNetV2 surpasses existing methods in both reconstruction quality and generalizability across three large datasets.
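To make the scan-level velocity consistency idea concrete, the sketch below shows one plausible form such a constraint could take: penalizing disagreement between the per-frame speed implied by the network's estimated inter-frame translations and the speed reported by the IMU. This is a minimal illustration under our own assumptions, not the paper's implementation; the function name `svc_loss`, the array shapes, and the squared-error form are all hypothetical.

```python
import numpy as np

def svc_loss(est_translations, imu_velocities, dt):
    """Hypothetical scan-level velocity consistency (SVC) sketch:
    compare the speed implied by the network's estimated per-frame
    translations (N x 3, in mm) against the speed measured by the
    IMU (N x 3, in mm/s), given the inter-frame interval dt (s)."""
    est_speed = np.linalg.norm(est_translations, axis=1) / dt  # network-implied speed
    imu_speed = np.linalg.norm(imu_velocities, axis=1)         # sensor-measured speed
    return float(np.mean((est_speed - imu_speed) ** 2))        # mean squared mismatch

# Toy example: four inter-frame translations at 20 fps.
dt = 0.05
est = np.array([[0.5, 0.0, 0.1]] * 4)
consistent_imu = est / dt          # IMU exactly agrees with the network
faster_imu = 1.5 * consistent_imu  # IMU reports a faster sweep

print(svc_loss(est, consistent_imu, dt))  # zero loss when speeds agree
print(svc_loss(est, faster_imu, dt))      # positive loss when they disagree
```

In this form the constraint needs no ground-truth poses, which is what lets it act as an online self-supervision signal; the path-level (PAC) and patch-level (PMC) terms in the paper would add analogous penalties at coarser and finer scales.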