🤖 AI Summary
This study addresses the challenge of matching nutritional goals to free-text dietary logs from low-income community users. We propose a domain-knowledge-enhanced machine learning classification framework that integrates dietary ontologies, food entity recognition, ingredient parsing, and macronutrient information into both TF-IDF and BERT-based text representations, followed by classification against multiple nutritional goals using logistic regression and multilayer perceptrons. Evaluated against registered dietitians' judgments, the method outperforms users' own self-assessments on accuracy, precision, recall, and F1-score, and domain enrichment raises accuracy further. The core contribution is an automated analytical framework for real-world dietary text that unifies nutrition science knowledge with deep semantic representations. The framework is intended to provide scalable, interpretable technical support for precision nutrition interventions.
📝 Abstract
This study examined the use of machine learning (ML) and domain-specific enrichment on patient-generated health data, in the form of free-text meal logs, to classify meals by their alignment with different nutritional goals. We used a dataset of over 3,000 meal records collected via a mobile app by 114 individuals from a diverse, low-income community in a major US city. Registered dietitians provided expert judgments of meal-to-goal alignment, which served as the gold standard for evaluation. The inputs were text embeddings, including TF-IDF and BERT, together with domain-specific enrichment information, including ontologies, ingredient parsers, and macronutrient contents. We evaluated logistic regression and multilayer perceptron classifiers on accuracy, precision, recall, and F1 score against the gold standard, and compared them with self-assessment. Even without enrichment, ML outperformed the self-assessments of the individuals who logged the meals, and the best-performing combination of ML classifier and enrichment achieved even higher accuracy. In general, ML classifiers enriched with Parsed Ingredients, Food Entities, and Macronutrients information performed well across multiple nutritional goals, but the effect of enrichment and of the classification algorithm on accuracy varied by nutritional goal. In conclusion, ML can use unstructured free-text meal logs to reliably classify whether meals align with specific nutritional goals, exceeding self-assessments, especially when it incorporates nutrition domain knowledge. Our findings highlight the potential of ML analysis of patient-generated health data to support patient-centered nutrition guidance in precision healthcare.
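
The evaluation described above compares two sets of labels, classifier predictions and users' self-assessments, against the dietitians' gold standard on the same four metrics. The following is a minimal sketch of that comparison for a single nutritional goal, not the authors' code: the function name `binary_metrics` and all label values are hypothetical toy data invented for illustration, and the binary encoding (1 = meal aligns with the goal) is an assumption.

```python
from typing import Dict, Sequence


def binary_metrics(gold: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels (1 = meal aligns with goal)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if len(gold) == 0:
        raise ValueError("need at least one label")

    # Count the four confusion-matrix cells.
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)
    tn = sum(1 for g, p in pairs if g == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no positive predictions or labels).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels for eight meals and one nutritional goal (not study data).
dietitian_gold = [1, 0, 1, 1, 0, 0, 1, 0]   # assumed gold standard
classifier_pred = [1, 0, 1, 0, 0, 0, 1, 0]  # assumed ML classifier output
self_assessment = [1, 1, 1, 1, 1, 0, 1, 1]  # assumed user self-ratings

ml_scores = binary_metrics(dietitian_gold, classifier_pred)
self_scores = binary_metrics(dietitian_gold, self_assessment)

for name, scores in (("ML classifier", ml_scores), ("Self-assessment", self_scores)):
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Scoring both label sets with the same function against the same gold standard keeps the comparison like-for-like; in a multi-goal setting, the same computation would presumably be repeated once per nutritional goal.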