🤖 AI Summary
This study addresses the challenge of estimating the three-dimensional volume of food from a single 2D image for caloric assessment. To this end, it systematically surveys multiple image-to-portion mapping strategies, including deep learning, monocular image analysis, depth map fusion, multi-view geometry, and template matching. By comparing these diverse methods, which require no complex hardware beyond an ordinary camera, the paper identifies how food volume and calorie estimation can be made more accurate. The resulting overview offers practical guidance for everyday dietary monitoring, thereby advancing automated nutritional intake assessment based solely on ordinary images.
📝 Abstract
Reliance on images for dietary assessment is an important strategy for accurately and conveniently monitoring an individual's health, making it a vital mechanism in the prevention and care of chronic diseases and obesity. However, image-based dietary assessment struggles to estimate the three-dimensional size of food from 2D image inputs. Many strategies have been devised to overcome this critical limitation, such as the use of auxiliary inputs like depth maps, multi-view inputs, or model-based approaches such as template matching. Deep learning also helps bridge this gap, using either monocular images alone or combinations of the image and the auxiliary inputs to precisely predict the portion size from the image input. In this paper, we explore the different strategies employed for accurate portion estimation.
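To make the depth-map strategy mentioned above concrete, the following is a minimal, hypothetical sketch (not the paper's method): given a per-pixel depth map and a binary food mask, food volume can be approximated by integrating the height of the food surface above a reference plane (e.g. the plate) over the masked pixels. The function name, the median-based plane fit, and the pixel footprint value are all illustrative assumptions.

```python
import numpy as np

def estimate_volume(depth_map, food_mask, pixel_area_m2):
    """Approximate food volume in cubic metres.

    depth_map     : 2D array, camera-to-surface distance per pixel (metres)
    food_mask     : 2D boolean array, True where food is visible
    pixel_area_m2 : assumed real-world area covered by one pixel at the plate
    """
    # Reference plane: median depth of the background (plate/table) pixels.
    plate_depth = np.median(depth_map[~food_mask])
    # Height of food above the plate at each food pixel (clip negatives).
    heights = np.clip(plate_depth - depth_map[food_mask], 0.0, None)
    # Volume = sum over food pixels of (height * pixel footprint area).
    return float(heights.sum() * pixel_area_m2)

# Toy example: a 4x4 scene where the plate is 0.50 m from the camera
# and a 2x2 food region rises 0.02 m above it.
depth = np.full((4, 4), 0.50)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
depth[mask] = 0.48
vol = estimate_volume(depth, mask, pixel_area_m2=1e-4)  # 1 cm^2 per pixel
print(vol)  # 4 pixels * 0.02 m * 1e-4 m^2 = 8e-6 m^3
```

Real systems must additionally calibrate the pixel footprint from camera intrinsics and handle a tilted, non-planar reference surface; this sketch only illustrates why a depth channel resolves the scale ambiguity that a single RGB image leaves open.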