Image Categorization and Search via a GAT Autoencoder and Representative Models

📅 2025-10-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses image categorization and retrieval with a graph attention network (GAT)-based autoencoder that learns representative embeddings. Images (or their representatives) form the nodes of a graph whose edges encode similarity, and the GAT-based autoencoder learns context-aware latent embeddings that describe each image relative to its neighbors. Category representatives are derived from these embeddings; a query image is assigned to a category by comparing its representative with the category representatives, and the most similar image within that category is then retrieved. Experiments evaluate the representative-centric approach with both GAT autoencoder embeddings and standard feature-based techniques.

📝 Abstract
We propose a method for image categorization and retrieval that leverages graphs and a graph attention network (GAT)-based autoencoder. Our approach is representative-centric: we carry out categorization and retrieval through representative models that we construct for the images and the image categories. We use a graph in which nodes represent images (or their representatives) and edges capture similarity relationships. The GAT highlights important features and relationships between images, enabling the autoencoder to construct context-aware latent representations that capture the key features of each image relative to its neighbors. We obtain category representatives from these embeddings and categorize a query image by comparing its representative to the category representatives. We then retrieve the image most similar to the query image within its identified category. We demonstrate the effectiveness of our representative-centric approach through experiments with both the GAT autoencoders and standard feature-based techniques.
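
As an illustration of the similarity graph described in the abstract, the following is a minimal sketch, not the authors' construction. It assumes cosine similarity as the similarity measure and a k-nearest-neighbor rule for choosing edges; the abstract specifies neither. The names `cosine`, `knn_similarity_graph`, and the toy `features` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def knn_similarity_graph(features, k=2):
    """Map each node to its k most similar nodes as (neighbor, similarity) pairs.

    Assumption: edges are chosen by a k-nearest-neighbor rule on cosine
    similarity; the source only says that edges capture similarity.
    """
    graph = {}
    for i, fi in enumerate(features):
        scored = [(j, cosine(fi, fj)) for j, fj in enumerate(features) if j != i]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        graph[i] = scored[:k]
    return graph

# Toy image features (hypothetical): two images near the x-axis, two near the y-axis.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
graph = knn_similarity_graph(features, k=1)
```

With `k=1`, each toy image is linked to the other image that points in nearly the same direction, so the graph splits into two pairs. A larger `k` or a similarity threshold would give denser neighborhoods for the attention mechanism to weight.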
Problem

Research questions and friction points this paper is trying to address.

Image categorization and retrieval using graph attention networks
Constructing representative models for images and categories
Comparing query image representatives to category representatives
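
The categorize-then-retrieve procedure named in the points above can be sketched as follows. This is an assumption-laden illustration: it takes a category representative to be the mean of the category's embeddings and uses cosine similarity for all comparisons, neither of which is stated in the source. All function names and the toy data are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_category_representatives(embeddings, labels):
    """Assumption: a category representative is the mean embedding of its images."""
    grouped = {}
    for emb, lab in zip(embeddings, labels):
        grouped.setdefault(lab, []).append(emb)
    return {
        lab: [sum(col) / len(vecs) for col in zip(*vecs)]
        for lab, vecs in grouped.items()
    }

def categorize_and_retrieve(query, embeddings, labels, representatives):
    """Pick the best-matching category, then the most similar image inside it."""
    category = max(representatives, key=lambda c: cosine(query, representatives[c]))
    candidates = [i for i, lab in enumerate(labels) if lab == category]
    best = max(candidates, key=lambda i: cosine(query, embeddings[i]))
    return category, best

# Toy embeddings (hypothetical stand-ins for learned image representatives).
embeddings = [[1.0, 0.1], [0.8, 0.3], [0.1, 1.0], [0.2, 0.7]]
labels = ["cat", "cat", "dog", "dog"]
reps = build_category_representatives(embeddings, labels)
category, index = categorize_and_retrieve([0.9, 0.2], embeddings, labels, reps)
```

Restricting the retrieval step to the identified category means the query is compared against one representative per category and then only against the images of a single category, rather than against the whole collection.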
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses a GAT autoencoder for image representation learning
Constructs representative models for images and image categories
Leverages graph attention to weight similarity relationships between images
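
To make the graph attention idea concrete, here is a minimal forward pass of one single-head attention layer in the standard GAT formulation: a shared linear map `W`, a score `LeakyReLU(a · [W h_i || W h_j])` for each neighbor, and a softmax over each node's neighborhood. It is a sketch only. It does not reproduce the paper's autoencoder, its decoder, or any training procedure, and the weights `W`, `a` and the toy graph are hypothetical.

```python
import math

def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_layer(H, neighbors, W, a):
    """Single-head graph attention layer, forward pass only.

    H: list of node feature vectors.
    neighbors: {node: [neighbor, ...]}, each list including the node itself.
    W: weight matrix (out_dim rows, in_dim columns); a: vector of length 2 * out_dim.
    Returns the updated node features and the attention weights per node.
    """
    Z = [matvec(W, h) for h in H]  # shared linear transform
    out, attention = [], {}
    for i in range(len(H)):
        nbrs = neighbors[i]
        # Unnormalized score for each neighbor j: LeakyReLU(a . [z_i || z_j]).
        scores = [leaky_relu(sum(x * y for x, y in zip(a, Z[i] + Z[j]))) for j in nbrs]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        attention[i] = dict(zip(nbrs, alphas))
        # New feature of node i: attention-weighted sum of transformed neighbors.
        out.append([sum(al * Z[j][d] for al, j in zip(alphas, nbrs))
                    for d in range(len(Z[i]))])
    return out, attention

# Toy graph (hypothetical): three nodes, self-loops included in the neighbor lists.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.5, -0.5, 1.0, 0.0]
out, attention = gat_layer(H, neighbors, W, a)
```

Each node's attention weights sum to one over its neighborhood. If `a` is set to all zeros, every neighbor receives the same weight and the layer reduces to averaging the transformed neighbor features.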
Authors

Duygu Sap
CAMaCS, Mathematics Institute, University of Warwick, Coventry, United Kingdom
Martin Lotz
Mathematical Institute, University of Warwick
Connor Mattinson
TRUSS, London, United Kingdom