LATINO-PRO: LAtent consisTency INverse sOlver with PRompt Optimization

📅 2025-03-16
📈 Citations: 0
✹ Influential: 0
đŸ€– AI Summary
Text-to-image latent diffusion models (LDMs) face two bottlenecks when applied to imaging inverse problems: heavy reliance on hand-crafted prompts and high computational overhead. Method: We propose LATINO, the first zero-shot, plug-and-play (PnP) framework built upon latent consistency models (LCMs). It is the first to embed LCMs into stochastic inverse solvers, introduces a gradient-free conditional guidance mechanism, and proposes an empirical Bayesian maximum-likelihood prompt self-calibration paradigm that optimizes the text prior end-to-end directly from the observed data. Contribution/Results: LATINO achieves state-of-the-art (SOTA) reconstruction quality within only eight neural function evaluations (NFEs), with a significantly smaller memory footprint and computational cost than previous approaches. Extensive experiments demonstrate new SOTA performance across diverse inverse imaging tasks, including compressive sensing, denoising, and super-resolution, without task-specific retraining or fine-tuning.

📝 Abstract
Text-to-image latent diffusion models (LDMs) have recently emerged as powerful generative models with great potential for solving inverse problems in imaging. However, leveraging such models in a Plug&Play (PnP), zero-shot manner remains challenging because it requires identifying a suitable text prompt for the unknown image of interest. Also, existing text-to-image PnP approaches are highly computationally expensive. We herein address these challenges by proposing a novel PnP inference paradigm specifically designed for embedding generative models within stochastic inverse solvers, with special attention to Latent Consistency Models (LCMs), which distill LDMs into fast generators. We leverage our framework to propose LAtent consisTency INverse sOlver (LATINO), the first zero-shot PnP framework to solve inverse problems with priors encoded by LCMs. Our conditioning mechanism avoids automatic differentiation and reaches SOTA quality in as little as 8 neural function evaluations. As a result, LATINO delivers remarkably accurate solutions and is significantly more memory and computationally efficient than previous approaches. We then embed LATINO within an empirical Bayesian framework that automatically calibrates the text prompt from the observed measurements by marginal maximum likelihood estimation. Extensive experiments show that prompt self-calibration greatly improves estimation, allowing LATINO with PRompt Optimization to define new SOTAs in image reconstruction quality and computational efficiency.
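The prompt self-calibration described in the abstract is an instance of empirical Bayesian estimation: a prior hyperparameter is fitted by maximizing the marginal likelihood of the observed measurements, with the unknown image integrated out. The sketch below illustrates that principle in a minimal conjugate Gaussian toy model; it is hypothetical and not the paper's algorithm (where the calibrated parameter is a text-prompt embedding and the marginal is intractable).

```python
import numpy as np

rng = np.random.default_rng(1)
sigma_n, sigma_p = 0.5, 1.0  # known noise and prior scales
mu_true = 2.0                # unknown prior mean (analogue of the "prompt")

# measurements y = x + noise, with x ~ N(mu_true, sigma_p^2):
# marginally, y ~ N(mu_true, sigma_p^2 + sigma_n^2)
y = mu_true + np.sqrt(sigma_p**2 + sigma_n**2) * rng.normal(size=200)

def marginal_loglik_grad(mu, y):
    # d/dmu of sum_i log N(y_i; mu, sigma_p^2 + sigma_n^2)
    return np.sum(y - mu) / (sigma_p**2 + sigma_n**2)

# gradient ascent on the marginal log-likelihood (the paper's
# marginal MLE is instead driven by the stochastic sampler)
mu = 0.0
for _ in range(500):
    mu += 1e-3 * marginal_loglik_grad(mu, y)

print(mu)  # converges to the sample mean of y, close to mu_true
```

In this conjugate toy the marginal MLE has a closed form (the sample mean), which makes the ascent easy to check; the point is only that the prior's parameter is tuned to the data without ever observing the clean signal.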
Problem

Research questions and friction points this paper is trying to address.

Challenges in zero-shot text-to-image inverse problem solving.
High computational cost of existing text-to-image PnP methods.
Need for automatic text prompt calibration in image reconstruction.
Innovation

Methods, ideas, or system contributions that make the work stand out.

LATINO: zero-shot PnP framework for inverse problems
Leverages Latent Consistency Models for fast generation
Self-calibrates text prompts via empirical Bayesian framework
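The PnP idea behind these contributions can be illustrated with a generic split-iteration loop that alternates a data-consistency step against the measurements with a denoising step playing the role of the learned prior. The sketch below is a hypothetical toy, not LATINO itself: a soft-thresholding denoiser stands in for the latent consistency model, the forward operator is a random sensing matrix, and the eight iterations echo the paper's 8-NFE budget.

```python
import numpy as np

def prox_data(z, y, A, rho):
    # data-consistency step: argmin_x 0.5*||Ax - y||^2 + (rho/2)*||x - z||^2
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + rho * np.eye(n), A.T @ y + rho * z)

def toy_denoiser(x, tau):
    # stand-in prior: soft-thresholding (LATINO uses an LCM here instead)
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def pnp_solve(y, A, rho=1.0, tau=0.05, iters=8):
    x = A.T @ y  # crude initialization from the measurements
    for _ in range(iters):  # 8 iterations, echoing the paper's 8 NFEs
        z = prox_data(x, y, A, rho)
        x = toy_denoiser(z, tau)
    return x

rng = np.random.default_rng(0)
x_true = np.zeros(50)
x_true[[5, 20, 40]] = [1.0, -2.0, 1.5]         # sparse ground-truth signal
A = rng.normal(size=(80, 50)) / np.sqrt(80)    # random sensing matrix
y = A @ x_true + 0.01 * rng.normal(size=80)    # noisy measurements
x_hat = pnp_solve(y, A)
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```

The zero-shot character shows up in the structure: neither step is trained for the task, and swapping `A` changes the inverse problem without touching the prior.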
Alessio Spagnoletti
Université Paris Cité, MAP5 UMR 8145, F-75006 Paris, France
Jean Prost
UniversitĂ© de Lille, CRIStAL UMR 9189, F-59655 Villeneuve d’Ascq, France
Andrés Almansa
Université Paris Cité, MAP5 UMR 8145, F-75006 Paris, France
Nicolas Papadakis
CNRS, Institut de Mathématiques de Bordeaux, Inria MONC
Image Processing · Optimal transport · Machine learning · Data assimilation
Marcelo Pereyra
Heriot Watt University, School of Mathematical and Computer Sciences
Bayesian analysis and computation · Imaging inverse problems · Statistical image processing · Markov chain Monte Carlo algorithms