🤖 AI Summary
To address semantic representation inconsistency and weak inter-class discriminability in image classification and retrieval, this paper proposes a graph attention network (GAT)-based autoencoder framework for learning representative embeddings. The method builds a similarity-based graph whose nodes represent images (or their representatives) and whose edges capture fine-grained structural relationships, and employs a GAT-based autoencoder to learn context-aware, class-aware latent embeddings in an end-to-end manner. Its key innovation lies in its representative-centric design: category representatives are derived from the learned embeddings, a query image is categorized by comparing its representative against the category representatives, and retrieval is then performed within the identified category. Experiments demonstrate that the proposed method is effective compared with standard feature-based techniques in both categorization accuracy and retrieval quality.
📝 Abstract
We propose a method for image categorization and retrieval that leverages graphs and a graph attention network (GAT)-based autoencoder. Our approach is representative-centric: we perform categorization and retrieval via representative models constructed for the images and image categories. We use a graph whose nodes represent images (or their representatives) and whose edges capture similarity relationships. The GAT highlights important features and relationships between images, enabling the autoencoder to construct context-aware latent representations that capture the key features of each image relative to its neighbors. We derive category representatives from these embeddings and categorize a query image by comparing its representative to the category representatives. We then retrieve the most similar image to the query within its identified category. We demonstrate the effectiveness of our representative-centric approach through experiments with both the GAT autoencoder and standard feature-based techniques.
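The pipeline described in the abstract can be sketched in a few lines of numpy: a single-head graph attention layer aggregates each image's features over its similarity neighbors, category representatives are formed from the resulting embeddings, and a query is categorized by nearest representative before within-category retrieval. Note this is a minimal illustrative sketch, not the paper's exact architecture: the single attention head, the cosine-similarity edge threshold, and mean-pooled category representatives are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def gat_layer(x, adj, W, a_src, a_dst):
    """Single-head graph attention layer (GAT-style).

    x: (N, F) node features; adj: (N, N) 0/1 adjacency with self-loops;
    W: (F, F') projection; a_src, a_dst: (F',) attention parameter vectors.
    """
    h = x @ W                                    # projected node features
    # e[i, j] = LeakyReLU(a_src . h_i + a_dst . h_j) for every node pair
    e = (h @ a_src)[:, None] + (h @ a_dst)[None, :]
    e = np.where(e > 0, e, 0.2 * e)              # LeakyReLU, slope 0.2
    e = np.where(adj > 0, e, -1e9)               # mask non-edges
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)    # softmax over each node's neighbors
    return np.tanh(alpha @ h)                    # attention-weighted aggregation

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)

def classify_and_retrieve(query_emb, emb, labels):
    """Representative-centric categorization, then within-category retrieval."""
    # category representative = mean embedding of that category (an assumption)
    reps = {c: emb[labels == c].mean(axis=0) for c in np.unique(labels)}
    cat = max(reps, key=lambda c: cos(query_emb, reps[c]))
    idx = np.flatnonzero(labels == cat)
    best = idx[np.argmax([cos(query_emb, emb[i]) for i in idx])]
    return cat, best

# Toy data: 6 images in 2 categories, 4-dim features; query resembles category 0.
x = np.vstack([rng.normal(loc=[1, 1, 0, 0], scale=0.05, size=(3, 4)),
               rng.normal(loc=[0, 0, 1, 1], scale=0.05, size=(3, 4))])
labels = np.array([0, 0, 0, 1, 1, 1])
q = np.array([0.95, 1.05, 0.0, 0.1])

# Similarity graph over images + query: edge where cosine similarity > 0.8.
feats = np.vstack([x, q])
normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
adj = (normed @ normed.T > 0.8).astype(float)

W = 0.5 * rng.normal(size=(4, 8))
a_src, a_dst = rng.normal(size=8), rng.normal(size=8)
emb = gat_layer(feats, adj, W, a_src, a_dst)

cat, best = classify_and_retrieve(emb[-1], emb[:-1], labels)
```

Here the query is embedded jointly with the images, then matched against the category representatives; the most similar image is sought only inside the predicted category, which is what makes the approach representative-centric rather than a flat nearest-neighbor search over all images.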