January Food Benchmark (JFB): A Public Benchmark Dataset and Evaluation Suite for Multimodal Food Analysis

📅 2025-08-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current automated nutritional analysis is hindered by inconsistent evaluation criteria and the absence of real-world benchmark datasets. To address this, we introduce JFB—the first application-oriented, high-quality multimodal food benchmark comprising 1,000 real-world food images—accompanied by a standardized evaluation framework and a novel holistic scoring mechanism, thereby filling a critical gap in the field’s assessment infrastructure. We propose a robustness-aware evaluation metric grounded in vision-language models (VLMs) and a hierarchical testing protocol, and develop a dedicated model, january/food-vision-v1. On JFB, our model achieves an overall score of 86.2, outperforming the best general-purpose model by 12.1 points. This demonstrates the effectiveness and advancement of our benchmark, evaluation framework, and model design.

📝 Abstract
Progress in AI for automated nutritional analysis is critically hampered by the lack of standardized evaluation methodologies and high-quality, real-world benchmark datasets. To address this, we introduce three primary contributions. First, we present the January Food Benchmark (JFB), a publicly available collection of 1,000 food images with human-validated annotations. Second, we detail a comprehensive benchmarking framework, including robust metrics and a novel, application-oriented overall score designed to assess model performance holistically. Third, we provide baseline results from both general-purpose Vision-Language Models (VLMs) and our own specialized model, january/food-vision-v1. Our evaluation demonstrates that the specialized model achieves an Overall Score of 86.2, a 12.1-point improvement over the best-performing general-purpose configuration. This work offers the research community a valuable new evaluation dataset and a rigorous framework to guide and benchmark future developments in automated nutritional analysis.
Problem

Research questions and friction points this paper is trying to address.

Lack of standardized evaluation for food analysis AI
Need high-quality real-world food benchmark datasets
Absence of holistic metrics for nutritional analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Public dataset with 1,000 human-validated food images
Comprehensive benchmarking framework with metrics
Specialized model improves performance by 12.1 points
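The paper reports a single application-oriented "Overall Score" (86.2 for january/food-vision-v1) that aggregates performance holistically. As a minimal sketch of how such a composite score could be computed, the snippet below takes a weighted average of per-component scores; the component names and weights are illustrative assumptions, not the paper's actual formula.

```python
# Hypothetical composite score in the spirit of JFB's Overall Score.
# Component names and weights below are assumptions for illustration only.

def overall_score(component_scores, weights):
    """Weighted average of per-component scores on a 0-100 scale."""
    assert set(component_scores) == set(weights), "components must match weights"
    total_weight = sum(weights.values())
    return sum(component_scores[k] * weights[k] for k in weights) / total_weight

# Assumed evaluation components for a multimodal food-analysis model.
scores = {"food_identification": 90.0,
          "portion_estimation": 80.0,
          "nutrient_accuracy": 85.0}
weights = {"food_identification": 0.4,
           "portion_estimation": 0.3,
           "nutrient_accuracy": 0.3}

print(round(overall_score(scores, weights), 1))  # → 85.5
```

Any real reproduction would need the weighting scheme and component metrics defined in the JFB evaluation framework itself.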
Amir Hosseinian
AI Lead, January AI
Bioinformatics, Machine Learning, Data Science

Ashkan Dehghani Zahedani
January AI

Umer Mansoor
January AI

Noosheen Hashemi
January AI

Mark Woodward
January AI