🤖 AI Summary
Traditional BMI measurement is often unavailable or impractical in telehealth and emergency scenarios, and existing image-based methods have been limited to datasets of at most 14,500 images. To address this, we introduce WayBED, a large proprietary dataset of 84,963 real-world smartphone images from 25,353 individuals. We propose an automatic filtering method that uses posture clustering and person detection to remove low-quality images (atypical postures, incomplete views), retaining 71,322 images for training. The full pipeline, including image filtering and BMI estimation, is deployed on Android devices using the CLAID framework. On the WayBED hold-out test set, the model achieves 7.9% MAPE with full-body images. On the VisualBodyToBMI dataset, it achieves 13% MAPE without having seen that data during training and 8.56% MAPE after fine-tuning, the lowest value reported on that dataset so far.
📝 Abstract
Estimating Body Mass Index (BMI) from camera images with machine learning models enables rapid weight assessment when traditional methods are unavailable or impractical, such as in telehealth or emergency scenarios. Existing computer vision approaches have been limited to datasets of up to 14,500 images. In this study, we present a deep learning-based BMI estimation method trained on our WayBED dataset, a large proprietary collection of 84,963 smartphone images from 25,353 individuals. We introduce an automatic filtering method that uses posture clustering and person detection to curate the dataset by removing low-quality images, such as those with atypical postures or incomplete views. This process retained 71,322 high-quality images suitable for training. We achieve a Mean Absolute Percentage Error (MAPE) of 7.9% on our hold-out test set (WayBED data) using full-body images, the lowest value in the published literature to the best of our knowledge. Further, we achieve a MAPE of 13% on the completely unseen (during training) VisualBodyToBMI dataset, comparable with state-of-the-art approaches trained on it, demonstrating robust generalization. Lastly, we fine-tune our model on VisualBodyToBMI and achieve a MAPE of 8.56%, the lowest reported value on this dataset so far. We deploy the full pipeline, including image filtering and BMI estimation, on Android devices using the CLAID framework. We release our complete code for model training, filtering, and the CLAID package for mobile deployment as open-source contributions.
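The abstract reports results as MAPE but does not spell the metric out. The sketch below assumes the standard definitions: BMI as weight in kilograms divided by height in metres squared, and MAPE as the mean of |true − predicted| / |true|, expressed in percent. The function names `bmi` and `mape` and the sample values are illustrative only and are not taken from the authors' released code.

```python
from typing import Sequence


def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight (kg) divided by height (m) squared."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return weight_kg / height_m ** 2


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, returned in percent.

    Assumes the standard definition; ground-truth values must be non-zero,
    which always holds for BMI.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Hypothetical ground-truth and predicted BMI values (not from the paper).
true_bmi = [20.0, 25.0, 40.0]
pred_bmi = [22.0, 25.0, 36.0]
score = mape(true_bmi, pred_bmi)  # per-sample errors: 10%, 0%, 10%
```

Because each error is divided by the true value, the same absolute error of 2 BMI units counts for more at a true BMI of 20 than at 40, which is worth keeping in mind when comparing MAPE across datasets with different BMI distributions.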