🤖 AI Summary
Visual encoders suffer from insufficient adversarial robustness and degraded accuracy on clean samples, while existing supervised and unsupervised fine-tuning methods exhibit training instability in the early stages and struggle to jointly optimize robustness and accuracy. To address this, we propose LORE, an unsupervised adversarial fine-tuning framework that, to our knowledge, is the first to introduce Lagrangian-constrained optimization into robust training of visual encoders. LORE explicitly preserves clean-sample performance via embedding-space proximity constraints, thereby mitigating the robustness–accuracy trade-off. Crucially, it requires no labels and leverages only the intrinsic architecture of the CLIP image encoder to enforce embedding-space regularization and adversarial perturbation constraints. Experiments demonstrate that LORE significantly improves adversarial robustness in zero-shot settings while maintaining near-lossless clean-sample accuracy. Moreover, it enhances out-of-distribution generalization and improves embedding interpretability.
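The constrained formulation described above can be written abstractly as follows (a plausible formalization of the summary; the symbols $f_\theta$, $f_{\theta_0}$, $\mathcal{L}_{\mathrm{adv}}$, and the budgets $\epsilon$, $\rho$ are our notation, not necessarily the paper's):

```latex
\min_{\theta} \; \mathbb{E}_{x}\!\left[\,\max_{\|\delta\|\le\epsilon}\,
  \mathcal{L}_{\mathrm{adv}}\big(f_\theta(x+\delta),\, f_{\theta_0}(x)\big)\right]
\quad \text{s.t.} \quad
\mathbb{E}_{x}\big[\,\|f_\theta(x) - f_{\theta_0}(x)\|\,\big] \le \rho,
```

where $f_{\theta_0}$ is the frozen pretrained encoder, $\epsilon$ bounds the adversarial perturbation, and $\rho$ bounds embedding drift on clean inputs. The Lagrangian relaxation introduces a multiplier $\lambda \ge 0$ on the proximity constraint, which can be updated by dual ascent alongside the primal parameters.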
📝 Abstract
Visual encoders have become fundamental components in modern computer vision pipelines. However, ensuring robustness against adversarial perturbations remains a critical challenge. Recent efforts have explored both supervised and unsupervised adversarial fine-tuning strategies. We identify two key limitations in these approaches: (i) they often suffer from instability, especially during the early stages of fine-tuning, resulting in suboptimal convergence and degraded performance on clean data, and (ii) they exhibit a suboptimal trade-off between robustness and clean-data accuracy, hindering the simultaneous optimization of both objectives. To overcome these challenges, we propose Lagrangian-Optimized Robust Embeddings (LORE), a novel unsupervised adversarial fine-tuning framework. LORE utilizes constrained optimization, which offers a principled approach to balancing competing goals, such as improving robustness while preserving nominal performance. By enforcing embedding-space proximity constraints, LORE effectively maintains clean-data performance throughout adversarial fine-tuning. Extensive experiments show that LORE significantly improves zero-shot adversarial robustness with minimal degradation in clean-data accuracy. Furthermore, we demonstrate that the adversarially fine-tuned CLIP image encoder improves out-of-distribution generalization and enhances the interpretability of image embeddings.
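To make the Lagrangian mechanics concrete, here is a toy numpy sketch of primal-dual training under strong simplifying assumptions: a linear "encoder", a spectral-norm surrogate standing in for the adversarial loss, and finite-difference gradients. None of this is the paper's actual model or training recipe; it only illustrates the generic pattern of minimizing a robustness objective subject to an embedding-proximity constraint, with the multiplier updated by dual ascent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative only): a linear "encoder" f(x) = W @ x,
# fine-tuned from a frozen pretrained copy W0.
d_in, d_emb, n = 8, 4, 64
X = rng.normal(size=(n, d_in))
W0 = rng.normal(size=(d_emb, d_in))  # frozen pretrained encoder
W = W0.copy()                        # encoder being fine-tuned

eps_adv = 0.1    # adversarial perturbation budget (L2)
eps_prox = 0.05  # allowed embedding drift from the frozen encoder
lam = 0.0        # Lagrange multiplier (dual variable)
lr_primal, lr_dual, h = 1e-2, 1.0, 1e-5

def robust_loss(W):
    # Toy surrogate: worst-case embedding change under a linearized
    # L2 input perturbation equals eps_adv * spectral norm of W.
    return eps_adv * np.linalg.norm(W, 2)

def proximity(W):
    # Mean embedding-space distance to the frozen encoder on clean data.
    return np.mean(np.linalg.norm(X @ (W - W0).T, axis=1))

for step in range(200):
    # Primal step: finite-difference gradient of the Lagrangian w.r.t. W.
    base = robust_loss(W) + lam * proximity(W)
    grad = np.zeros_like(W)
    for idx in np.ndindex(W.shape):
        Wp = W.copy()
        Wp[idx] += h
        grad[idx] = (robust_loss(Wp) + lam * proximity(Wp) - base) / h
    W -= lr_primal * grad
    # Dual ascent: raise lambda while the proximity constraint is violated,
    # decay it (clipped at zero) when the constraint holds with slack.
    lam = max(0.0, lam + lr_dual * (proximity(W) - eps_prox))

print(f"proximity={proximity(W):.3f}  lambda={lam:.3f}")
```

The dual update is the key design choice: rather than hand-tuning a fixed penalty weight, the multiplier grows automatically whenever clean-embedding drift exceeds its budget and relaxes when the constraint is slack, so the robustness objective is pursued only within the allowed drift region.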