🤖 AI Summary
Existing skin cancer detection datasets lack anatomical context and individualized clinical metadata, limiting model robustness in lesion localization and generalizability to real-world clinical settings. To address this, we introduce the first context-aware dataset for early skin cancer screening, comprising 16,954 high-resolution images of skin regions (each roughly 7×9 cm) acquired via 3D total body photography. Each image is annotated with lesion bounding boxes and accompanied by structured, standardized metadata: anatomical location, age group, and sun damage score. Departing from the conventional paradigm of single lesions cropped to the image center, our dataset provides grid-based, anatomy-aligned region-level annotations, enabling end-to-end lesion detection modeling. The dataset is publicly released under an open license. Evaluation demonstrates substantial improvements in localization accuracy and clinical interpretability of AI-powered screening systems, particularly in non-standard, resource-limited settings. This work establishes a new benchmark for context-aware, clinically grounded skin cancer AI research.
📝 Abstract
Artificial intelligence has significantly advanced skin cancer diagnosis by enabling rapid and accurate detection of malignant lesions. In this domain, most publicly available image datasets consist of single, isolated skin lesions positioned at the center of the image. While these lesion-centric datasets have been fundamental for developing diagnostic algorithms, they lack the context of the surrounding skin, which is critical for improving lesion detection. The iToBoS dataset was created to address this challenge. It includes 16,954 images of skin regions from 100 participants, captured using 3D total body photography. Each image roughly corresponds to a $7 \times 9$ cm section of skin with all suspicious lesions annotated using bounding boxes. Additionally, the dataset provides metadata such as anatomical location, age group, and sun damage score for each image. This dataset aims to facilitate training and benchmarking of algorithms, with the goal of enabling early detection of skin cancer and deployment of this technology in non-clinical environments.