🤖 AI Summary
Automated nutritional analysis is currently hindered by inconsistent evaluation criteria and the absence of real-world benchmark datasets. To address this, we introduce the January Food Benchmark (JFB), the first application-oriented, high-quality multimodal food benchmark, comprising 1,000 real-world food images and accompanied by a standardized evaluation framework and a holistic scoring mechanism, filling a critical gap in the field's assessment infrastructure. We propose a robustness-aware evaluation metric and a hierarchical testing protocol for vision-language models (VLMs), and develop a dedicated model, january/food-vision-v1. On JFB, our model achieves an overall score of 86.2, outperforming the best general-purpose model by 12.1 points. These results demonstrate the effectiveness of our benchmark, evaluation framework, and model design.
📝 Abstract
Progress in AI for automated nutritional analysis is critically hampered by the lack of standardized evaluation methodologies and high-quality, real-world benchmark datasets. To address this, we introduce three primary contributions. First, we present the January Food Benchmark (JFB), a publicly available collection of 1,000 food images with human-validated annotations. Second, we detail a comprehensive benchmarking framework, including robust metrics and a novel, application-oriented overall score designed to assess model performance holistically. Third, we provide baseline results from both general-purpose Vision-Language Models (VLMs) and our own specialized model, january/food-vision-v1. Our evaluation demonstrates that the specialized model achieves an Overall Score of 86.2, a 12.1-point improvement over the best-performing general-purpose configuration. This work offers the research community a valuable new evaluation dataset and a rigorous framework to guide and benchmark future developments in automated nutritional analysis.
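To make the idea of an application-oriented Overall Score concrete, the sketch below shows one plausible way to aggregate per-nutrient accuracy into a single 0-100 score across a benchmark. The field names, the error-to-score mapping, and the `NUTRIENT_WEIGHTS` are illustrative assumptions for this sketch only; they are not the metric actually defined by the JFB framework.

```python
# Hypothetical sketch of an application-oriented "overall score" for nutrition
# estimation. Field names, error-to-score mapping, and weights are assumptions,
# not the scoring mechanism defined by the JFB paper.
from statistics import mean

NUTRIENT_WEIGHTS = {"calories": 0.4, "protein_g": 0.2, "carbs_g": 0.2, "fat_g": 0.2}

def field_score(pred: float, truth: float) -> float:
    """Map absolute percentage error to a 0-100 score (0% error -> 100)."""
    if truth == 0:
        return 100.0 if pred == 0 else 0.0
    ape = abs(pred - truth) / abs(truth)
    return max(0.0, 100.0 * (1.0 - ape))

def overall_score(predictions: list[dict], annotations: list[dict]) -> float:
    """Weighted per-nutrient score per image, averaged over the benchmark."""
    per_image = [
        sum(w * field_score(pred[k], truth[k]) for k, w in NUTRIENT_WEIGHTS.items())
        for pred, truth in zip(predictions, annotations)
    ]
    return mean(per_image)

# Toy usage: two images, model predictions vs. human-validated annotations.
preds = [{"calories": 520, "protein_g": 30, "carbs_g": 45, "fat_g": 22},
         {"calories": 310, "protein_g": 12, "carbs_g": 50, "fat_g": 8}]
truth = [{"calories": 500, "protein_g": 28, "carbs_g": 50, "fat_g": 20},
         {"calories": 350, "protein_g": 10, "carbs_g": 55, "fat_g": 9}]
print(f"Overall Score: {overall_score(preds, truth):.1f}")
```

In practice the published framework also covers non-numeric aspects of model output (e.g., food identification and robustness), so any real aggregation would weight more components than the four nutrient fields used in this toy example.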