How to Choose Your Teacher for Fine Grained Image Recognition

📅 2026-05-15

🤖 AI Summary

This work addresses teacher selection for knowledge distillation in fine-grained image recognition. It proposes a teacher selection metric, Ratio 1-2, based on the probability ratio between the top two classes predicted by the teacher model. In over one thousand experiments spanning 3 students, 8 teachers, 8 datasets, and 4 training strategies, Ratio 1-2 improves teacher selection by 18% over previous methods. Student networks trained with teachers selected by this metric gain up to 17% in classification accuracy.
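To make the idea of a top-two probability ratio concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names (`softmax`, `ratio_1_2`, `mean_ratio_1_2`), the averaging over samples, and the epsilon guard are all hypothetical choices, and the page does not say how the paper aggregates the ratio or how it ranks teachers by it.

```python
import math


def softmax(logits):
    """Convert raw teacher logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ratio_1_2(probs, eps=1e-12):
    """Ratio of the highest to the second-highest class probability.

    Assumption: this direction (top-1 over top-2) is inferred from the
    metric's name; eps only guards against division by zero.
    """
    top1, top2 = sorted(probs, reverse=True)[:2]
    return top1 / max(top2, eps)


def mean_ratio_1_2(batch_logits):
    """Average the per-sample ratio over a batch of teacher logits.

    Assumption: simple averaging is a placeholder aggregation, not
    necessarily what the paper uses.
    """
    ratios = [ratio_1_2(softmax(logits)) for logits in batch_logits]
    return sum(ratios) / len(ratios)
```

For a teacher that predicts probabilities `[0.6, 0.3, 0.1]` on a sample, `ratio_1_2` is about 2: the teacher considers its first choice twice as likely as its second. A value near 1 means the top two classes are almost tied.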

📝 Abstract

Fine-grained image recognition classifies subcategories such as bird species or car models. While state-of-the-art (SOTA) models are accurate, they are often too resource-intensive for deployment on constrained devices. Knowledge distillation addresses this by transferring knowledge from a large teacher model to a smaller student model. A key challenge is selecting the right teacher, as it heavily impacts student performance. This paper introduces a teacher selection metric, Ratio 1-2, based on teacher prediction ratios. Extensive analysis of over one thousand experiments across 3 students, 8 teachers, and 8 datasets under 4 training strategies demonstrates that our metric improves teacher selection by 18% over previous methods, enabling small student models to achieve up to 17% accuracy gains. Experiment codebase is available at: https://github.com/arkel23/FGIR-KD-Teacher.
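For readers unfamiliar with knowledge distillation, the sketch below shows the classic temperature-softened distillation loss, in which the student is trained to match the teacher's softened class probabilities. This is the generic textbook formulation, offered only as background: the abstract does not specify which losses the 4 training strategies use, and the names `softmax` and `kd_loss` as well as the default temperature of 4.0 are assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives softer probabilities."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, temperature=4.0, eps=1e-12):
    """KL(teacher || student) on softened probabilities, scaled by T^2.

    Assumption: this is the generic distillation loss, not necessarily
    the loss used in the paper's experiments.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        t * (math.log(max(t, eps)) - math.log(max(s, eps)))
        for t, s in zip(p_teacher, p_student)
    )
    return (temperature ** 2) * kl
```

The loss is zero when the student's logits equal the teacher's and grows as their softened distributions diverge. In practice it is usually combined with a standard cross-entropy term on the ground-truth labels.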

Problem

Research questions and friction points this paper is trying to address.

fine-grained image recognition

knowledge distillation

teacher selection

model compression

student-teacher learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

knowledge distillation

teacher selection

fine-grained image recognition