Benchmarking Post-Hoc Unknown-Category Detection in Food Recognition

📅 2025-03-24
🤖 AI Summary
This work addresses a key weakness of food recognition models: they cannot reliably separate in-distribution (ID) classes from out-of-distribution (OOD) samples, so inputs from unseen categories are often assigned an ID label. In real-world applications such as automatic dietary assessment, these incorrect labels can cascade through the system. Noting that no prior research has explored this setting, we conduct an empirical analysis of post-hoc OOD detection methods for fine-grained food recognition. Among the evaluated methods, virtual logit matching (ViM) performs best overall on standard OOD detection metrics (e.g., AUROC and FPR95), likely because it combines logits with feature-space representations. Models with higher ID accuracy also perform better across the evaluated OOD detection methods, and transformer-based architectures consistently outperform convolution-based models across various methods. These findings are intended to make food recognition systems more robust in open-world settings, where a model should flag an unknown category rather than mislabel it.
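
The two metrics named above can be illustrated with a short sketch. It assumes the convention that a higher detector score means "more likely ID" and that ID is the positive class; the paper's exact evaluation code is not shown here, so the function names `auroc` and `fpr_at_95_tpr` are hypothetical, and the 95% threshold is approximated with the 5th percentile of the ID scores.

```python
import numpy as np


def auroc(id_scores, ood_scores):
    """Probability that a random ID sample scores above a random OOD sample.

    Ties count as half a win. Uses an all-pairs comparison, so it is only
    suitable for small score arrays (memory grows as n_id * n_ood).
    """
    id_col = np.asarray(id_scores, dtype=float)[:, None]
    ood_row = np.asarray(ood_scores, dtype=float)[None, :]
    wins = (id_col > ood_row).mean()
    ties = (id_col == ood_row).mean()
    return float(wins + 0.5 * ties)


def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples accepted as ID when ~95% of ID samples are accepted."""
    id_arr = np.asarray(id_scores, dtype=float)
    ood_arr = np.asarray(ood_scores, dtype=float)
    # Threshold chosen so that roughly 95% of ID scores lie at or above it.
    threshold = np.percentile(id_arr, 5)
    return float((ood_arr >= threshold).mean())
```

A higher AUROC and a lower FPR95 both indicate better separation of ID and OOD samples. For large evaluation sets, a rank-based AUROC from an established library would likely be preferable to the all-pairs version sketched here.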

📝 Abstract
Food recognition models often struggle to distinguish between seen and unseen samples, frequently misclassifying samples from unseen categories by assigning them an in-distribution (ID) label. This misclassification presents significant challenges when deploying these models in real-world applications, particularly within automatic dietary assessment systems, where incorrect labels can lead to cascading errors throughout the system. Ideally, such models should prompt the user when an unknown sample is encountered, allowing for corrective action. Given no prior research exploring food recognition in real-world settings, in this work we conduct an empirical analysis of various post-hoc out-of-distribution (OOD) detection methods for fine-grained food recognition. Our findings indicate that virtual logit matching (ViM) performed the best overall, likely due to its combination of logits and feature-space representations. Additionally, our work reinforces prior notions in the OOD domain, noting that models with higher ID accuracy performed better across the evaluated OOD detection methods. Furthermore, transformer-based architectures consistently outperformed convolution-based models in detecting OOD samples across various methods.

Problem

Research questions and friction points this paper is trying to address.

Detecting unseen food categories in recognition models
Improving OOD detection for real-world dietary assessment
Evaluating post-hoc methods for fine-grained food recognition

Innovation

Methods, ideas, or system contributions that make the work stand out.

Empirical analysis of post-hoc OOD detection methods
Virtual logit matching (ViM), which combines logits and feature-space representations, performs best overall
Transformer-based models outperform convolution-based models in OOD detection
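
To make the "logits plus features" idea concrete, here is a minimal NumPy sketch of a ViM-style score. It follows the general recipe of the published ViM method as commonly described (residual of the feature outside a principal subspace, rescaled into a "virtual logit" and subtracted from the energy of the real logits); it is not the paper's own code. The function names `fit_vim` and `vim_score`, the argument `principal_dim`, and the assumption that penultimate-layer features and the final linear layer (`W`, `b`) are available as arrays are all illustrative.

```python
import numpy as np


def logsumexp(z):
    """Numerically stable log-sum-exp over the last axis."""
    m = z.max(axis=-1, keepdims=True)
    return (m + np.log(np.exp(z - m).sum(axis=-1, keepdims=True))).squeeze(-1)


def fit_vim(train_feats, W, b, principal_dim):
    """Estimate ViM parameters from ID training features.

    train_feats: (N, D) penultimate-layer features of ID training images.
    W, b: weights (C, D) and bias (C,) of the final linear classifier.
    principal_dim: number of principal directions treated as "ID subspace".
    """
    # Shift the feature-space origin so the bias is absorbed into the features.
    origin = -np.linalg.pinv(W) @ b
    centered = train_feats - origin
    cov = centered.T @ centered / len(centered)
    # eigh returns eigenvalues in ascending order, so the first columns
    # span the low-variance (residual) directions.
    _, eigvecs = np.linalg.eigh(cov)
    residual_basis = eigvecs[:, : cov.shape[0] - principal_dim]
    # Scale the residual norm so it is comparable to the largest real logit.
    train_logits = train_feats @ W.T + b
    residual_norm = np.linalg.norm(centered @ residual_basis, axis=-1)
    alpha = train_logits.max(axis=-1).sum() / residual_norm.sum()
    return origin, residual_basis, alpha


def vim_score(feats, W, b, origin, residual_basis, alpha):
    """Higher score means 'more likely ID'."""
    logits = feats @ W.T + b
    virtual_logit = alpha * np.linalg.norm((feats - origin) @ residual_basis, axis=-1)
    return logsumexp(logits) - virtual_logit
```

Because the method is post-hoc, nothing is retrained: `fit_vim` runs once on ID training features, and `vim_score` is then thresholded at inference time to decide whether to prompt the user about an unknown food. The value of `principal_dim` is a tunable choice that this sketch leaves to the caller.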