EyeWorld: A Generative World Model of Ocular State and Dynamics

📅 2026-03-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Ophthalmic diagnosis and treatment depend on subtle lesions and their evolution over time across multimodal retinal imaging, yet existing medical foundation models are predominantly static and struggle with modality discrepancies and variations in image quality. To address this, the study introduces dynamical-system modeling into ophthalmic AI, reportedly for the first time: it treats the eye as a partially observable dynamical system and builds a unified generative world model around it. A shared latent state provides cross-modal alignment, temporally consistent state representations, and structure-preserving cross-modal translation, and longitudinal temporal supervision strengthens these further. The approach is reported to improve robustness in multimodal retinal image understanding, enabling high-quality cross-modal synthesis and anatomically stable prediction of lesion progression.

📝 Abstract
Ophthalmic decision-making depends on subtle lesion-scale cues interpreted across multimodal imaging and over time, yet most medical foundation models remain static and degrade under modality and acquisition shifts. Here we introduce EyeWorld, a generative world model that conceptualizes the eye as a partially observed dynamical system grounded in clinical imaging. EyeWorld learns an observation-stable latent ocular state shared across modalities, unifying fine-grained parsing, structure-preserving cross-modality translation and quality-robust enhancement within a single framework. Longitudinal supervision further enables time-conditioned state transitions, supporting forecasting of clinically meaningful progression while preserving stable anatomy. By moving from static representation learning to explicit dynamical modeling, EyeWorld provides a unified approach to robust multimodal interpretation and prognosis-oriented simulation in medicine.
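
The framing in the abstract (one latent ocular state shared across modalities, advanced by time-conditioned transitions while anatomy stays stable) can be illustrated with a toy sketch. This is not EyeWorld's architecture, which the abstract does not specify. The names `OcularState`, `transition`, and `observe`, the two-component state, the exponential growth rule, and the modality gains are all hypothetical choices made only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class OcularState:
    """Toy latent state (hypothetical): one stable and one progressing component."""
    anatomy: float  # stands in for stable structure (e.g. vessel layout)
    lesion: float   # stands in for disease burden that evolves over time


def transition(state: OcularState, dt_years: float, rate: float = 0.3) -> OcularState:
    """Time-conditioned transition: the lesion term grows with elapsed time,
    the anatomy term is carried over unchanged. The exponential rule is an
    assumption, not the paper's learned dynamics."""
    return OcularState(
        anatomy=state.anatomy,
        lesion=state.lesion * math.exp(rate * dt_years),
    )


# Hypothetical per-modality gains: each modality sees the same latent state
# through a different observation function.
MODALITY_GAINS = {
    "fundus": (1.0, 0.5),
    "oct": (0.2, 2.0),
}


def observe(state: OcularState, modality: str) -> float:
    """Render a scalar 'image' of the shared latent state for one modality."""
    anatomy_gain, lesion_gain = MODALITY_GAINS[modality]
    return anatomy_gain * state.anatomy + lesion_gain * state.lesion


if __name__ == "__main__":
    z0 = OcularState(anatomy=1.0, lesion=0.1)
    z2 = transition(z0, dt_years=2.0)  # forecast two years ahead
    for name in MODALITY_GAINS:
        print(name, observe(z0, name), observe(z2, name))
```

In this sketch, forecasting is a single call to `transition` on the latent state, and any modality can then be rendered from the forecast state with `observe`. That mirrors, in a very reduced form, the idea of doing progression modeling in a shared state space rather than per modality.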
Problem

Research questions and friction points this paper is trying to address.

ophthalmic decision-making
multimodal imaging
dynamical system
lesion-scale cues
clinical progression forecasting
Innovation

Methods, ideas, or system contributions that make the work stand out.

generative world model
ocular dynamics
multimodal medical imaging
longitudinal modeling
cross-modality translation
Ziyu Gao
School of Optometry, The Hong Kong Polytechnic University, Hong Kong.
Xinyuan Wu
School of Optometry, The Hong Kong Polytechnic University, Hong Kong.
Xiaolan Chen
School of Optometry, The Hong Kong Polytechnic University, Hong Kong.
Zhuoran Liu
School of Optometry, The Hong Kong Polytechnic University, Hong Kong.
Ruoyu Chen
Institute of Information Engineering, Chinese Academy of Sciences.
Explainable AI, Trustworthy AI, Foundation Model
Bowen Liu
Andreessen Horowitz, insitro, Stanford
Computational Chemistry, Drug Discovery, Graph Machine Learning
Bingjie Yan
School of Optometry, The Hong Kong Polytechnic University, Hong Kong.
Zhenhan Wang
School of Optometry, The Hong Kong Polytechnic University, Hong Kong.
Kai Jin
Zhejiang University
Ophthalmology, Artificial Intelligence, Precision Medicine
Jiancheng Yang
ELLIS Institute Finland & Aalto University
AI for Health, 3D Vision, Medical Image Analysis, Machine Learning, Spatial Intelligence
Yih Chung Tham
Yong Loo Lin School of Medicine, National University of Singapore; Singapore Eye Research Institute
Ophthalmology, Epidemiology, Visual Impairment, Deep Learning
Mingguang He
The Hong Kong Polytechnic University
Ophthalmology
Danli Shi
School of Optometry, The Hong Kong Polytechnic University, Hong Kong; Research Centre for SHARP Vision (RCSV), The Hong Kong Polytechnic University, Hong Kong.