Training-inference input alignment outweighs framework choice in longitudinal retinal image prediction

📅 2026-04-18

🤖 AI Summary

This study addresses the challenge of forecasting future retinal images in slowly progressive diseases. A controlled comparison of conditional modeling strategies shows that aligning the input distributions seen at training and at inference matters more than the choice of generative framework. Building on this insight, the authors propose TRU (Temporal Retinal U-Net), a deterministic direct-regression model that combines continuous time-delta conditioning with multi-scale history aggregation and drops stochastic sampling. The evaluation covers 28,902 eyes across three imaging platforms. In every FAF cohort, TRU matches or exceeds three state-of-the-art benchmarks on delta-SSIM, SSIM, and PSNR; its advantage grows monotonically with available history length, and the model transfers zero-shot to Stargardt cohorts imaged on Optos and Heidelberg Spectralis.
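To make the idea of continuous time conditioning concrete, here is a minimal sketch of one common way to embed a real-valued gap between visits: a sinusoidal embedding. The page does not say how TRU encodes time, so the function name `time_delta_embedding`, the `dim` and `max_period` parameters, and the choice of years as the unit are all assumptions made for illustration.

```python
import numpy as np

def time_delta_embedding(delta_years, dim=8, max_period=10.0):
    """Sinusoidal embedding of a continuous time gap.

    Hypothetical helper: the source does not specify TRU's embedding.
    Returns an array of shape (n, dim): sin features, then cos features.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    delta = np.atleast_1d(np.asarray(delta_years, dtype=np.float64))
    half = dim // 2
    # Geometrically spaced frequencies, starting at 1 and decaying
    # toward 1 / max_period (assumed spacing, not from the paper).
    freqs = np.exp(-np.log(max_period) * np.arange(half) / half)
    angles = delta[:, None] * freqs[None, :]
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)
```

Because the input is a real number and not a visit index, irregular follow-up intervals such as 0.5 or 2.0 years map to distinct vectors without any binning.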

📝 Abstract

Quantitative prediction of future retinal appearance from longitudinal imaging would support clinical decisions in progressive macular disease that currently rely on qualitative comparison or scalar progression scores. Recent methods have moved toward increasing generative complexity, but whether this complexity is necessary for slowly progressing retinal disease is unclear. We tested this through a controlled comparison of five conditioning configurations sharing one architecture and training dataset, spanning standard conditional diffusion, inference-aligned stochastic training, and deterministic regression. In our evaluation, aligning the training and inference input distributions produced large gains (delta-SSIM +0.082, SSIM +0.086, both p < 0.001), while the choice among aligned frameworks did not significantly affect any primary metric. Task-entropy and posterior-concentration analyses, replicated on two fundus autofluorescence (FAF) platforms, provided a mechanistic account: the predictable component of inter-visit change is small relative to time-invariant acquisition variability, leaving stochastic sampling with little width to exploit. Guided by these findings, we developed TRU (Temporal Retinal U-Net), a deterministic direct-regression model with continuous time-delta conditioning and multi-scale history aggregation. We evaluated TRU on 28,902 eyes across three imaging platforms: a mixed-disease Optos FAF cohort (9,942 eyes), zero-shot transfer to Stargardt macular dystrophy on Optos (288 eyes) and Heidelberg Spectralis (125 eyes), and a boundary evaluation on Cirrus en-face fundus images from a glaucoma cohort (18,547 eyes). In every FAF cohort, TRU matched or exceeded three state-of-the-art benchmarks on delta-SSIM, SSIM, and PSNR, and its advantage grew monotonically with available history length.
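The metrics named in the abstract can be illustrated with a short sketch. PSNR follows its standard definition, 10 * log10(data_range^2 / MSE). The abstract does not define delta-SSIM; the sketch below assumes that a "delta" metric means the gain of the prediction over a carry-forward baseline that simply reuses the last observed visit. That reading, and the helper name `gain_over_carry_forward`, are assumptions, and PSNR stands in for SSIM to keep the example short.

```python
import numpy as np

def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio in dB (standard definition)."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))

def gain_over_carry_forward(metric, pred, last_visit, target):
    """Metric gain of a prediction over reusing the last visit.

    Assumed interpretation of a "delta" metric; not confirmed by the source.
    A positive value means the prediction beats the carry-forward baseline.
    """
    return metric(pred, target) - metric(last_visit, target)
```

Under this reading, a model that only copies the last visit scores exactly zero, which is why such a metric is informative when inter-visit change is small and plain SSIM or PSNR is already high for a trivial predictor.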

Problem

Research questions and friction points this paper is trying to address.

longitudinal retinal image prediction

macular disease

quantitative prediction

temporal modeling

clinical decision support

Innovation

Methods, ideas, or system contributions that make the work stand out.

input alignment

deterministic regression

longitudinal retinal prediction