🤖 AI Summary
Commercial PII deletion services (e.g., DeleteMe, Incogni) claim to remove users’ personal information from data broker databases, yet their efficacy has not been verified by independent empirical evaluation. Method: This work introduces the first large-scale, user-driven evaluation framework, which integrates real subscriptions, manual annotation, web scraping for ground-truth comparison, and textual analysis of service claims to assess coverage, PII identification accuracy, and deletion effectiveness across major services. Results: Only 41.1% of records flagged as PII by services corresponded to users’ actual identities; only 48.2% of the records the services found were successfully removed; and all services covered far fewer data brokers than advertised. The study uncovers systemic deficiencies across three dimensions (coverage breadth, classification precision, and operational efficacy) and establishes a reproducible methodology for evaluating privacy-enhancing technologies.
📝 Abstract
This paper presents the first large-scale empirical study of commercial personally identifiable information (PII) removal systems: commercial services that claim to improve privacy by automating the removal of PII from data brokers' databases. Popular examples of such services include DeleteMe, Mozilla Monitor, and Incogni, among many others. The claims these services make may be very appealing to privacy-conscious Web users, but how effective the services actually are at improving privacy has not been investigated. This work aims to improve our understanding of commercial PII removal services in two ways. First, we conduct a user study in which participants purchase subscriptions to four popular PII removal services and report (i) what PII the service finds, (ii) from which data brokers, (iii) whether the service is able to have the information removed, and (iv) whether the identified information actually is PII describing the participant. Second, we compare the claims and promises the services make (e.g., which and how many data brokers each service claims to cover) with what participants observed. We find that these services have significant accuracy and coverage issues that limit their usefulness as a privacy-enhancing technology. For example, the measured services were unable to remove the majority of the identified PII records from data brokers (only 48.2% of found records were successfully removed), and most records identified by these services were not PII about the user (study participants found that only 41.1% of records identified by these services were PII about themselves).
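The two headline percentages can be read as simple ratios over the records that participants reported. The following is a minimal sketch of that arithmetic, not the authors' analysis code: the `Record` schema, the function names, and the toy data are all hypothetical, and it assumes the removal rate is taken over all found records (the abstract does not state whether the denominator is all found records or only those confirmed as the participant's PII).

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One record a removal service reported to a participant (hypothetical schema)."""
    service: str       # removal service that flagged the record
    broker: str        # data broker the record was found on
    is_user_pii: bool  # participant's judgment: does the record describe them?
    removed: bool      # was the record reported as successfully removed?


def precision(records):
    """Share of flagged records that participants confirmed as their own PII."""
    if not records:
        return 0.0
    return sum(r.is_user_pii for r in records) / len(records)


def removal_rate(records):
    """Share of flagged records that were successfully removed (assumed denominator: all found records)."""
    if not records:
        return 0.0
    return sum(r.removed for r in records) / len(records)


# Toy data invented for illustration only; these are not results from the study.
toy = [
    Record("service_a", "broker_1", is_user_pii=True, removed=True),
    Record("service_a", "broker_2", is_user_pii=False, removed=False),
    Record("service_b", "broker_1", is_user_pii=True, removed=False),
    Record("service_b", "broker_3", is_user_pii=False, removed=True),
]

print(f"precision: {precision(toy):.1%}, removal rate: {removal_rate(toy):.1%}")
```

Under this reading, the reported 41.1% corresponds to `precision` and the reported 48.2% to `removal_rate`, each computed over the full set of participant reports.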