VGAT: A Cancer Survival Analysis Framework Transitioning from Generative Visual Question Answering to Genomic Reconstruction

📅 2025-03-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Limited access to genomic sequencing in resource-constrained settings hinders the clinical deployment of multimodal cancer survival analysis. To address this, the paper proposes a genome-aware survival prediction method that relies solely on whole-slide images (WSIs). It introduces three key components: (1) a VQA-inspired genomic feature decoder, described as the first adaptation of visual question answering to genomic representation learning; (2) a clustering-driven visual prompting module that enhances discriminative region modeling; and (3) a vision–genomics joint Transformer with multi-scale WSI patch encoding. Evaluated on five cancer types in TCGA, the method achieves an average C-index improvement of 0.042 over state-of-the-art image-only approaches, indicating that accurate survival prediction is feasible without raw genomic data. The work narrows the gap between computational pathology and precision oncology in low-resource environments.
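For readers unfamiliar with the metric quoted above, the following is a minimal sketch of Harrell's concordance index (C-index), the standard ranking metric for survival models. It is not taken from the VGAT code base; the function name `concordance_index` and the toy inputs are illustrative assumptions, and pairs with tied survival times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable patient pairs ranked correctly.

    times  -- observed follow-up times
    events -- 1 if death was observed, 0 if the patient was censored
    risks  -- predicted risk scores (higher = expected shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification (assumption): skip tied times; a pair is also not
        # comparable when the earlier patient was censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half-correct
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third patient is censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.4, 0.1]
print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported average gain of 0.042 is measured on this 0-to-1 scale.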

📝 Abstract
Multimodal learning combining pathology images and genomic sequences enhances cancer survival analysis but faces clinical implementation barriers due to limited access to genomic sequencing in under-resourced regions. To enable survival prediction using only whole-slide images (WSIs), we propose the Visual-Genomic Answering-Guided Transformer (VGAT), a framework integrating Visual Question Answering (VQA) techniques for genomic modality reconstruction. By adapting VQA's text feature extraction approach, we derive stable genomic representations that circumvent dimensionality challenges in raw genomic data. Simultaneously, a cluster-based visual prompt module selectively enhances discriminative WSI patches, addressing noise from unfiltered image regions. Evaluated across five TCGA datasets, VGAT outperforms existing WSI-only methods, demonstrating the viability of genomic-informed inference without sequencing. This approach bridges multimodal research and clinical feasibility in resource-constrained settings. Code is available at https://github.com/CZZZZZZZZZZZZZZZZZ/VGAT.
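The abstract does not spell out how the cluster-based visual prompt module picks patches, so the sketch below only illustrates the general idea under stated assumptions: patch embeddings are grouped with plain k-means, and the patches nearest each centroid are kept as representatives. The names `kmeans` and `select_prompt_patches`, the parameters `k` and `per_cluster`, and the random toy features are all hypothetical and are not drawn from the VGAT repository.

```python
import math
import random


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's k-means over a list of equal-length feature vectors."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(features, k)]
    for _ in range(iters):
        labels = [nearest(f, centroids) for f in features]
        for c in range(k):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    labels = [nearest(f, centroids) for f in features]
    return labels, centroids


def select_prompt_patches(features, k=4, per_cluster=2):
    """Return indices of the patches closest to each cluster centroid."""
    labels, centroids = kmeans(features, k)
    selected = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        idx.sort(key=lambda i: math.dist(features[i], centroids[c]))
        selected.extend(idx[:per_cluster])
    return sorted(selected)


# Toy stand-in for WSI patch embeddings: 50 patches, 8 dimensions each.
rng = random.Random(1)
patches = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(50)]
print(select_prompt_patches(patches))
```

In a real pipeline the features would come from a pretrained patch encoder, and the selected patches would be emphasized rather than the rest being discarded; both choices are left open here because the source does not specify them.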
Problem

Research questions and friction points this paper is trying to address.

Enables cancer survival prediction without genomic sequencing
Reconstructs genomic data from pathology images using VQA techniques
Improves WSI analysis by filtering noise and enhancing discriminative patches
Innovation

Methods, ideas, or system contributions that make the work stand out.

VGAT integrates VQA for genomic reconstruction
Cluster-based visual prompt enhances WSI patches
Derives stable genomic representations from images

Zizhi Chen
Fudan University
Pathology Images
Minghao Han
Academy for Engineering and Technology, Fudan University, Shanghai, China; Institute of Metaverse & Intelligent Medicine, Fudan University, Shanghai, China
Xukun Zhang
Fudan University
Shuwei Ma
Academy for Engineering and Technology, Fudan University, Shanghai, China; Institute of Metaverse & Intelligent Medicine, Fudan University, Shanghai, China
Tao Liu
Academy for Engineering and Technology, Fudan University, Shanghai, China; Institute of Metaverse & Intelligent Medicine, Fudan University, Shanghai, China
Xing Wei
Academy for Engineering and Technology, Fudan University, Shanghai, China; Institute of Metaverse & Intelligent Medicine, Fudan University, Shanghai, China
Lihua Zhang
Wuhan University
computational biology, bioinformatics, data mining