DINO-BOLDNet: A DINOv3-Guided Multi-Slice Attention Network for T1-to-BOLD Generation

📅 2025-12-09

📈 Citations: 0

✨ Influential: 0

career value

175K/year

🤖 AI Summary

To address the challenge of downstream functional brain analysis when BOLD fMRI data are missing or corrupted, this paper proposes, for the first time, a direct reconstruction method that synthesizes mean BOLD images from T1-weighted structural MRI. Methodologically, we employ a frozen DINOv3 self-supervised encoder to extract robust anatomical representations, incorporate a multi-slice attention mechanism to model cross-layer functional dependencies, and integrate a multi-scale decoder with a DINO-perceptual loss to ensure high-fidelity generation. Evaluated on a clinical dataset comprising 248 subjects, our approach significantly outperforms conditional GAN baselines, achieving state-of-the-art performance in both PSNR and MS-SSIM metrics. This work establishes a novel, interpretable, and high-accuracy paradigm for structure-to-function mapping, enabling reliable functional inference from structural scans alone.

Technology Category

Application Category

📝 Abstract

Generating BOLD images from T1w images offers a promising solution for recovering missing BOLD information and enabling downstream tasks when BOLD images are corrupted or unavailable. Motivated by this, we propose DINO-BOLDNet, a DINOv3-guided multi-slice attention framework that integrates a frozen self-supervised DINOv3 encoder with a lightweight trainable decoder. The model uses DINOv3 to extract within-slice structural representations, and a separate slice-attention module to fuse contextual information across neighboring slices. A multi-scale generation decoder then restores fine-grained functional contrast, while a DINO-based perceptual loss encourages structural and textural consistency between predictions and ground-truth BOLD in the transformer feature space. Experiments on a clinical dataset of 248 subjects show that DINO-BOLDNet surpasses a conditional GAN baseline in both PSNR and MS-SSIM. To our knowledge, this is the first framework capable of generating mean BOLD images directly from T1w images, highlighting the potential of self-supervised transformer guidance for structural-to-functional mapping.

Problem

Research questions and friction points this paper is trying to address.

Generates BOLD images from T1w images

Recovers missing BOLD information for downstream tasks

Integrates DINOv3 for structural-to-functional mapping

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses frozen DINOv3 encoder for structural representation

Employs slice-attention module for cross-slice context fusion

Applies DINO-based perceptual loss for feature consistency

🔎 Similar Papers

NeuroBOLT: Resting-state EEG-to-fMRI Synthesis with Multi-dimensional Feature Mapping