VVTRec: Radio Interferometric Reconstruction through Visual and Textual Modality Enrichment

📅 2026-01-10
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes VVTRec, a novel approach to radio interferometric imaging that addresses the limitations of existing methods, which rely solely on sparse visibility data and often produce reconstructions with artifacts and weak semantic coherence. VVTRec is the first to integrate multimodal (vision + text) augmentation and leverage a pretrained vision-language model (VLM) for this task. By generating image and text embeddings guided by visibility data, the method effectively fuses spatial structural cues with high-level semantic information. Crucially, it enables training-free knowledge transfer from the pretrained VLM, avoiding additional computational overhead. Experimental results demonstrate that VVTRec significantly enhances reconstruction quality—improving image clarity, structural integrity, and semantic accuracy—without substantially increasing computational cost.

📝 Abstract
Radio astronomy is an indispensable discipline for observing distant celestial objects. Wave-signal measurements from radio telescopes, called visibilities, must be transformed into images for astronomical observation. The resulting dirty images blend information from real sources with artifacts, so astronomers usually perform reconstruction to obtain cleaner images. Existing methods consider only the single modality of sparse visibility data, leaving residual artifacts and modeling correlations in the data insufficiently. To better extract visibility information and emphasize output quality in the image domain, we propose VVTRec, a multimodal radio interferometric data reconstruction method with visibility-guided visual and textual modality enrichment. In VVTRec, sparse visibility is transformed into image-form and text-form features that enrich spatial and semantic information, improving the structural integrity and accuracy of the reconstructed images. We also leverage Vision-Language Models (VLMs) to obtain additional training-free performance gains: VVTRec enables sparse visibility, a modality unseen by VLMs, to accurately retrieve pre-trained knowledge as a supplement. Our experiments demonstrate that VVTRec effectively enhances imaging results by exploiting multimodal information without introducing excessive computational overhead.
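As background for the abstract's "dirty image" terminology (this is standard radio interferometry, not the paper's method): an interferometer samples the sky's 2-D Fourier transform at a sparse set of (u, v) points, and gridding those samples followed by an inverse FFT yields the dirty image, which mixes true source structure with sampling artifacts. A minimal NumPy sketch, with all sizes, source positions, and the sampling fraction purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64  # image / uv-grid size in pixels

# Illustrative "true sky": two point sources.
sky = np.zeros((n, n))
sky[20, 30] = 1.0
sky[45, 15] = 0.5

# The fully sampled visibility plane is the 2-D FFT of the sky.
vis_full = np.fft.fftshift(np.fft.fft2(sky))

# Sparse sampling mask: keep ~10% of (u, v) cells, since a telescope
# array only measures a sparse set of baselines.
mask = rng.random((n, n)) < 0.10
vis_sparse = vis_full * mask

# Inverse FFT of the gridded sparse visibilities -> dirty image,
# i.e. the true sky convolved with the array's point spread function.
dirty = np.fft.ifft2(np.fft.ifftshift(vis_sparse)).real
print(dirty.shape)  # (64, 64)
```

Reconstruction methods such as the one proposed here start from data like `vis_sparse` (or `dirty`) and try to recover something close to `sky`.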
Problem

Research questions and friction points this paper is trying to address.

radio interferometric reconstruction
visibility data
image artifacts
multimodal enrichment
astronomical imaging
Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal reconstruction
visibility enrichment
Vision-Language Models
radio interferometry
textual modality