🤖 AI Summary
To address the copyright-infringement and transparency concerns raised by the unauthorized use of third-party data in machine-learning model training, this paper proposes the first general-purpose, task-agnostic framework for auditing data use in black-box models. Methodologically, it combines arbitrary black-box membership inference techniques with a custom sequential probability ratio test (SPRT), requiring no assumptions about the downstream task, providing strict control over the false-positive rate (tunable within 0.5%–5%), and generalizing across models. The framework exposes a model-agnostic interface that supports heterogeneous architectures, including image classifiers and multimodal large language models. Extensive experiments on ImageNet classifiers and multimodal foundation models show an average detection accuracy exceeding 92%, with false-positive rates consistently meeting user-specified thresholds. This work substantially improves the quantifiability and reliability of training-data provenance auditing.
📝 Abstract
Auditing the use of data in training machine-learning (ML) models is an increasingly pressing challenge, as myriad ML practitioners routinely leverage the effort of content creators to train models without their permission. In this paper, we propose a general method to audit an ML model for the use of a data-owner's data in training, without prior knowledge of the ML task for which the data might be used. Our method leverages any existing black-box membership inference method, together with a sequential hypothesis test of our own design, to detect data use with a quantifiable, tunable false-detection rate. We show the effectiveness of our proposed framework by applying it to audit data use in two types of ML models, namely image classifiers and foundation models.
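To make the core idea concrete, here is a minimal sketch of how a sequential hypothesis test in the spirit of Wald's SPRT could be driven by binary membership-inference outcomes. This is an illustrative assumption, not the paper's actual construction: the function name, the Bernoulli observation model, and the parameters `p0`, `p1`, `alpha`, and `beta` are all hypothetical, with `alpha` playing the role of the tunable false-detection rate.

```python
import math

def sprt_audit(observations, p0=0.1, p1=0.5, alpha=0.05, beta=0.05):
    """Wald-style SPRT over binary membership-inference outcomes.

    Hypothetical observation model (not from the paper):
      H0: each query flags 'member' with probability p0 (data NOT used)
      H1: each query flags 'member' with probability p1 (data used)
    alpha bounds the false-detection rate; beta bounds the miss rate.
    """
    upper = math.log((1 - beta) / alpha)   # cross this: accept H1 (data use detected)
    lower = math.log(beta / (1 - alpha))   # cross this: accept H0 (no evidence of use)
    llr = 0.0                              # running log-likelihood ratio
    for i, flagged in enumerate(observations, start=1):
        if flagged:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "data use detected", i
        if llr <= lower:
            return "no evidence of use", i
    return "inconclusive", len(observations)
```

The sequential design is what makes the false-detection rate quantifiable and tunable: the test stops as soon as the accumulated evidence crosses either threshold, and the thresholds are set directly from the desired error rates rather than from a fixed sample size.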