Evidential Transformers for Improved Image Retrieval

📅 2024-09-02

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Content-based image retrieval (CBIR) models often lack robustness and perform suboptimally in cross-domain and fine-grained retrieval. To address this, the paper proposes the Evidential Transformer, an uncertainty-driven transformer model. It brings evidential deep learning, previously unexplored in deep metric learning, into the training pipeline: evidential classification provides interpretable uncertainty quantification and replaces conventional multi-class classification as the training baseline. The method builds on the Global Context Vision Transformer (GC ViT) to capture holistic contextual dependencies and produce discriminative feature representations. Evaluated on Stanford Online Products (SOP) and CUB-200-2011, the approach reports new state-of-the-art results in all test settings, on metrics including Recall@K, NMI, and F1-score, indicating more reliable retrieval, better cross-domain generalization, and stronger fine-grained discrimination.
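Recall@K, the first metric named in the summary, can be illustrated with a minimal sketch. It assumes the definition commonly used in deep metric learning: a query counts as a hit if at least one of its K nearest neighbours (excluding the query itself) shares its label. The function names, the use of cosine similarity, and the toy data in the usage note are illustrative assumptions, not details taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recall_at_k(embeddings, labels, k):
    """Fraction of queries with at least one same-label item in their top-k.

    Every item is used once as a query against all other items
    (an assumed protocol; the paper's exact setup is not given here).
    """
    hits = 0
    for i, query in enumerate(embeddings):
        # Rank all other items by similarity to the query, most similar first.
        candidates = [j for j in range(len(embeddings)) if j != i]
        ranked = sorted(
            candidates,
            key=lambda j: cosine(query, embeddings[j]),
            reverse=True,
        )
        if any(labels[j] == labels[i] for j in ranked[:k]):
            hits += 1
    return hits / len(embeddings)
```

For example, calling `recall_at_k(embeddings, labels, 1)` on a list of embedding vectors and a parallel list of class labels gives Recall@1. Larger values of K can only keep the score the same or raise it, since the top-K set grows.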

📝 Abstract

We introduce the Evidential Transformer, an uncertainty-driven transformer model for improved and robust image retrieval. In this paper, we make several contributions to content-based image retrieval (CBIR). We incorporate probabilistic methods into image retrieval, achieving robust and reliable results, with evidential classification surpassing traditional training based on multiclass classification as a baseline for deep metric learning. Furthermore, we improve the state-of-the-art retrieval results on several datasets by leveraging the Global Context Vision Transformer (GC ViT) architecture. Our experimental results consistently demonstrate the reliability of our approach, setting a new benchmark in CBIR in all test settings on the Stanford Online Products (SOP) and CUB-200-2011 datasets.
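To make the contrast between evidential and softmax-based multiclass classification concrete, here is a minimal sketch of an evidential output head. It assumes the widely used Dirichlet-based formulation of evidential deep learning, in which non-negative evidence per class parameterises a Dirichlet distribution. The abstract does not specify the paper's exact formulation or loss, so the softplus activation and all function names here are assumptions.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable softplus, used to map logits to non-negative evidence."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def evidential_outputs(logits):
    """Turn raw class logits into expected probabilities, beliefs, and uncertainty.

    Assumed formulation (not confirmed by the abstract):
      evidence_k  = softplus(logit_k)
      alpha_k     = evidence_k + 1          (Dirichlet parameters)
      S           = sum_k alpha_k           (Dirichlet strength)
      prob_k      = alpha_k / S             (expected class probability)
      belief_k    = evidence_k / S
      uncertainty = K / S                   (K = number of classes)
    """
    evidence = [softplus(z) for z in logits]
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)
    num_classes = len(logits)
    probs = [a / strength for a in alpha]
    belief = [e / strength for e in evidence]
    uncertainty = num_classes / strength
    return probs, belief, uncertainty
```

Under this formulation the beliefs and the uncertainty mass sum to one, so an input that produces little evidence for any class yields an uncertainty close to 1 rather than a confident softmax-style prediction. This is the kind of explicit uncertainty signal the abstract refers to when it describes the model as uncertainty-driven.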

Problem

Research questions and friction points this paper is trying to address.

Improving image retrieval with an uncertainty-driven transformer

Incorporating probabilistic methods for robust results

Enhancing state-of-the-art retrieval on benchmark datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Evidential Transformer for robust image retrieval

Probabilistic methods enhance reliability in CBIR

GC ViT architecture improves on state-of-the-art retrieval results
