Size Matters: Reconstructing Real-Scale 3D Models from Monocular Images for Food Portion Estimation

📅 2026-01-27
🤖 AI Summary
Existing monocular 3D food reconstruction methods struggle to recover physically accurate scale, which limits the precision of dietary intake estimation. This work adds real-scale recovery to monocular 3D food reconstruction: visual features extracted from models trained on large-scale datasets are used to estimate the scale of the reconstructed object, turning a single-view reconstruction into a metrically meaningful model for volume estimation. On two publicly available datasets, the proposed method reduces mean absolute volume-estimation error by nearly 30% and consistently outperforms existing techniques, bridging 3D computer vision and precision nutrition.

📝 Abstract
The rise of diet-related chronic diseases, such as obesity and diabetes, emphasizes the need for accurate monitoring of food intake. While AI-driven dietary assessment has made strides in recent years, recovering size (portion) information from monocular images is an ill-posed problem, so accurately answering "how much did you eat?" remains a pressing challenge. Some 3D reconstruction methods achieve impressive geometric reconstruction but fail to recover the crucial real-world scale of the reconstructed object, limiting their use in precision nutrition. In this paper, we bridge the gap between 3D computer vision and digital health by proposing a method that recovers a true-to-scale 3D reconstructed object from a monocular image. Our approach leverages rich visual features extracted from models trained on large-scale datasets to estimate the scale of the reconstructed object. This learned scale enables us to convert single-view 3D reconstructions into true-to-life, physically meaningful models. Extensive experiments and ablation studies on two publicly available datasets show that our method consistently outperforms existing techniques, achieving nearly a 30% reduction in mean absolute volume-estimation error and showcasing its potential to enhance the domain of precision nutrition. Code: https://gitlab.com/viper-purdue/size-matters
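To make concrete why scale matters for portion estimation, the following minimal sketch shows how a single scalar scale factor changes the volume of a reconstructed mesh. It is not the authors' implementation: the function names (`mesh_volume`, `apply_scale`, `mean_absolute_volume_error`), the unit-tetrahedron test mesh, and the use of a plain mean absolute error are assumptions chosen for illustration, and the scale value here is given by hand rather than predicted by a learned model.

```python
import numpy as np


def mesh_volume(vertices: np.ndarray, faces: np.ndarray) -> float:
    """Volume of a closed, consistently oriented triangle mesh.

    Sums the signed volumes of the tetrahedra formed by each face and the origin.
    """
    v0 = vertices[faces[:, 0]]
    v1 = vertices[faces[:, 1]]
    v2 = vertices[faces[:, 2]]
    signed = np.einsum("ij,ij->i", v0, np.cross(v1, v2)) / 6.0
    return float(abs(signed.sum()))


def apply_scale(vertices: np.ndarray, scale: float) -> np.ndarray:
    """Uniformly rescale a unit-less reconstruction by a scalar scale factor."""
    return vertices * scale


def mean_absolute_volume_error(predicted, ground_truth) -> float:
    """Plain mean absolute error between predicted and true volumes (assumed metric form)."""
    predicted = np.asarray(predicted, dtype=float)
    ground_truth = np.asarray(ground_truth, dtype=float)
    return float(np.mean(np.abs(predicted - ground_truth)))


# Hypothetical stand-in for a single-view reconstruction: a unit tetrahedron.
UNIT_TET_VERTICES = np.array(
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
)
# Faces are wound so that every normal points outward.
UNIT_TET_FACES = np.array([[0, 2, 1], [0, 1, 3], [0, 3, 2], [1, 2, 3]])

unscaled_volume = mesh_volume(UNIT_TET_VERTICES, UNIT_TET_FACES)  # 1/6
scaled_volume = mesh_volume(apply_scale(UNIT_TET_VERTICES, 2.0), UNIT_TET_FACES)

# Doubling the linear scale multiplies the volume by 2**3 = 8.
print(scaled_volume / unscaled_volume)
```

Because volume grows with the cube of the linear scale, a 10% error in the estimated scale already produces roughly a 33% error in volume (1.1³ ≈ 1.331), which is why a geometrically accurate but unscaled reconstruction is of limited use for portion estimation.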
Problem

Research questions and friction points this paper is trying to address.

food portion estimation
monocular 3D reconstruction
real-world scale
size recovery
precision nutrition
Innovation

Methods, ideas, or system contributions that make the work stand out.

true-to-scale 3D reconstruction
monocular image
food portion estimation
scale recovery
precision nutrition
Authors

Gautham Vinod
PhD Candidate in ECE, Purdue University
Computer Vision, Smart Health, Image Processing, Deep Learning
Bruce Coburn
Purdue University, West Lafayette, Indiana, U.S.A.
Siddeshwar Raghavan
Purdue University, West Lafayette, Indiana, U.S.A.
Jiangpeng He
Purdue University
Computer Vision, Deep Learning
Fengqing Zhu
Purdue University, West Lafayette, Indiana, U.S.A.