Fine-Grained Cat Breed Recognition with Global Context Vision Transformer

📅 2026-02-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses fine-grained image recognition for cat breed classification, where subtle differences in fur pattern, color, and facial structure make breeds hard to tell apart. The work applies the Tiny variant of the Global Context Vision Transformer (GCViT-Tiny) to a subset of the Oxford-IIIT Pet dataset. The architecture is used to capture global contextual information, while data augmentation (rotation, horizontal flipping, and brightness adjustment) is used to improve generalization. The model reaches 94.54% accuracy on the validation set and 92.00% on the test set, which the authors present as evidence that GCViT is effective for fine-grained visual categorization.

📝 Abstract

Accurate identification of cat breeds from images is a challenging task due to subtle differences in fur patterns, facial structure, and color. In this paper, we present a deep learning-based approach for classifying cat breeds using a subset of the Oxford-IIIT Pet Dataset, which contains high-resolution images of various domestic breeds. We employed the Tiny variant of the Global Context Vision Transformer (GCViT-Tiny) for cat breed recognition. To improve model generalization, we used extensive data augmentation, including rotation, horizontal flipping, and brightness adjustment. Experimental results show that the GCViT-Tiny model achieved a test accuracy of 92.00% and a validation accuracy of 94.54%. These findings highlight the effectiveness of transformer-based architectures for fine-grained image classification tasks. Potential applications include veterinary diagnostics, animal shelter management, and mobile-based breed recognition systems. We also provide a Hugging Face demo at https://huggingface.co/spaces/bfarhad/cat-breed-classifier.
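The three augmentations named in the abstract (rotation, horizontal flipping, and brightness adjustment) can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the function names, the rotation limit of 15 degrees, the brightness range of 0.8 to 1.2, and the flip probability of 0.5 are all hypothetical values chosen for illustration, since the abstract does not state them. Images are assumed to be `uint8` arrays of shape height x width x channels.

```python
import numpy as np


def rotate(image, degrees):
    """Rotate an H x W x C image about its center (nearest neighbor, zero fill)."""
    h, w = image.shape[:2]
    theta = np.deg2rad(degrees)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for every output pixel, look up its source pixel.
    src_x = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    src_y = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    src_x = np.rint(src_x).astype(int)
    src_y = np.rint(src_y).astype(int)
    # Pixels whose source falls outside the image stay black.
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(image)
    out[valid] = image[src_y[valid], src_x[valid]]
    return out


def horizontal_flip(image):
    """Mirror the image left to right."""
    return image[:, ::-1]


def adjust_brightness(image, factor):
    """Scale pixel intensities by `factor`, clipping to the uint8 range."""
    scaled = image.astype(np.float32) * factor
    return np.clip(scaled, 0, 255).astype(image.dtype)


def augment(image, rng, max_degrees=15.0, brightness_range=(0.8, 1.2), flip_prob=0.5):
    """Apply the three augmentations with randomly drawn parameters.

    All default values are assumptions for illustration, not values from the paper.
    """
    out = rotate(image, rng.uniform(-max_degrees, max_degrees))
    if rng.random() < flip_prob:
        out = horizontal_flip(out)
    return adjust_brightness(out, rng.uniform(*brightness_range))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random noise stands in for a cat photo here.
    dummy = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
    augmented = augment(dummy, rng)
    print(augmented.shape, augmented.dtype)
```

In practice such transforms are usually applied on the fly during training only, so each epoch sees a slightly different version of every training image while validation and test images are left unchanged.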

Problem

Research questions and friction points this paper is trying to address.

fine-grained classification

cat breed recognition

image classification

visual recognition

Innovation

Methods, ideas, or system contributions that make the work stand out.

Global Context Vision Transformer

fine-grained classification

cat breed recognition

data augmentation