🤖 AI Summary
To address insufficient transparency in machine learning training data, this paper proposes and implements the first publicly demonstrable membership inference testing platform, which empirically determines whether a specific sample was included in model training. Methodologically, we design a membership inference framework grounded in statistical significance testing and black-box model behavior analysis, and validate it systematically on facial image data exceeding 22 million samples. By integrating heterogeneous facial data sources and mainstream recognition models, the framework supports cross-model generalization evaluation. Experiments achieve up to 89% membership identification accuracy across multiple publicly available face recognition models. This work is the first engineering realization of membership inference as a reproducible, auditable open platform, and it establishes a new paradigm for traceability and regulatory compliance verification in AI training processes.
📝 Abstract
We present the Membership Inference Test (MINT) Demonstrator to emphasize the need for more transparent machine learning training processes. MINT is a technique for experimentally determining whether certain data has been used during the training of machine learning models. We conduct experiments with popular face recognition models and 5 public databases containing over 22M images. Promising results, with up to 89% accuracy, are achieved, suggesting that it is possible to recognize whether an AI model has been trained with specific data. Finally, we present a MINT platform as a demonstrator of this technology, aimed at promoting transparency in AI training.