FEAST: Retrieval-Augmented Multi-Hierarchical Food Classification for the FoodEx2 System

📅 2026-03-03
🤖 AI Summary
This work addresses the challenges of multi-level food classification in the FoodEx2 system, where complex label hierarchies, data sparsity, and extremely high output dimensionality hinder performance. To this end, the authors propose FEAST, a novel framework featuring the first practical three-stage hierarchical architecture comprising base-term recognition, multi-label facet prediction, and facet descriptor assignment. By integrating retrieval-augmented learning with deep metric learning and leveraging hierarchical structure to guide training, FEAST substantially enhances generalization on sparse and fine-grained labels. Evaluated on multilingual FoodEx2 benchmarks, FEAST achieves a significant 12%–38% improvement in F1 score over existing CNN baselines on rare categories, demonstrating its effectiveness and innovation in high-dimensional multi-label food classification.

📝 Abstract
Hierarchical text classification (HTC) and extreme multi-label classification (XML) tasks face compounded challenges from complex label interdependencies, data sparsity, and extreme output dimensions. These challenges are exemplified in the European Food Safety Authority's FoodEx2 system, a standardized food classification framework essential for food consumption monitoring and contaminant exposure assessment across Europe. FoodEx2 coding transforms natural language food descriptions into a set of codes from multiple standardized hierarchies, but faces implementation barriers due to its complex structure. Given a food description (e.g., "organic yogurt"), the system identifies its base term ("yogurt"), all the applicable facet categories (e.g., "production method"), and then every relevant facet descriptor for each category (e.g., "organic production"). While existing models perform adequately on well-balanced and semantically dense hierarchies, no prior work has addressed the practical constraints imposed by the FoodEx2 system. The limited literature addressing such real-world scenarios further compounds these challenges. We propose FEAST (Food Embedding And Semantic Taxonomy), a novel retrieval-augmented framework that decomposes FoodEx2 classification into a three-stage approach: (1) base term identification, (2) multi-label facet prediction, and (3) facet descriptor assignment. By leveraging the system's hierarchical structure to guide training and performing deep metric learning, FEAST learns discriminative embeddings that mitigate data sparsity and improve generalization on rare and fine-grained labels. Evaluated on the multilingual FoodEx2 benchmark, FEAST outperforms the F1 scores of the prior European CNN baseline by 12-38% on rare classes.
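To make the three-stage decomposition concrete, the following is a minimal sketch of how a retrieval-based pipeline could chain the stages on the abstract's "organic yogurt" example. It is not the authors' implementation: the bag-of-words `embed` function stands in for a learned metric-learning encoder, and the tiny taxonomy, the `MEMORY` of labeled examples, the `THRESHOLD` value, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a learned encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

# Hypothetical miniature taxonomy (illustrative labels, not real FoodEx2 codes).
BASE_TERMS = ["yogurt", "bread", "apple juice"]
FACETS = {
    "production method": ["organic production", "conventional production"],
    "preservation": ["frozen", "smoked"],
}
# Hypothetical labeled examples used as the retrieval memory for stage 2.
MEMORY = [
    ("organic milk", {"production method"}),
    ("frozen peas", {"preservation"}),
    ("smoked salmon", {"preservation"}),
]
THRESHOLD = 0.3  # assumed similarity cut-off

def classify(description):
    query = embed(description)

    # Stage 1: base term identification = nearest base term.
    base_term = max(BASE_TERMS, key=lambda term: cosine(query, embed(term)))

    # Stage 2: multi-label facet prediction = union of the facet sets
    # attached to sufficiently similar retrieved examples.
    facets = set()
    for text, labels in MEMORY:
        if cosine(query, embed(text)) >= THRESHOLD:
            facets |= labels

    # Stage 3: descriptor assignment, restricted to each predicted facet.
    assigned = {}
    for facet in sorted(facets):
        assigned[facet] = [
            d for d in FACETS[facet] if cosine(query, embed(d)) >= THRESHOLD
        ]

    return {"base_term": base_term, "facets": assigned}

result = classify("organic yogurt")
print(result)
```

The point of the sketch is the control flow: each stage narrows the label space for the next, so stage 3 only scores descriptors that belong to facets predicted in stage 2 rather than the full descriptor vocabulary.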
Problem

Research questions and friction points this paper is trying to address.

hierarchical text classification
extreme multi-label classification
FoodEx2
data sparsity
label interdependencies
Innovation

Methods, ideas, or system contributions that make the work stand out.

retrieval-augmented learning
hierarchical text classification
extreme multi-label classification
deep metric learning
FoodEx2
Lorenzo Molfetta
Department of Computer Science and Engineering, University of Bologna, Cesena Campus, Via dell'Università 50, I-47522 Cesena, Italy
Alessio Cocchieri
Department of Computer Science and Engineering, University of Bologna, Cesena Campus, Via dell'Università 50, I-47522 Cesena, Italy
Stefano Fantazzini
Department of Computer Science and Engineering, University of Bologna, Cesena Campus, Via dell'Università 50, I-47522 Cesena, Italy
Giacomo Frisoni
Department of Computer Science and Engineering, University of Bologna, Cesena Campus, Via dell'Università 50, I-47522 Cesena, Italy
Luca Ragazzi
Department of Computer Science and Engineering, University of Bologna, Cesena Campus, Via dell'Università 50, I-47522 Cesena, Italy
Gianluca Moro
Dept. of Computer Science and Engineering - University of Bologna, Cesena
natural language processing, data science, data mining, machine learning, sensor networks agents peer-to-peer systems