Longitudinal Analysis of Privacy Labels in the Apple App Store

📅 2022-06-06
🏛️ arXiv.org
📈 Citations: 10
Influential: 3
🤖 AI Summary
This study investigates the adoption dynamics and stability of Apple’s App Store privacy nutrition labels, and questions how well they reflect actual app behavior. Method: The authors ran a longitudinal crawl of over 1.6 million apps, taking near-weekly snapshots for over a year (July 15, 2021 to October 25, 2022), and analyzed how label adoption and label contents changed over time. Results: Nearly two years after the labels were introduced, only 70.1% of apps had adopted them, with new app submissions driving most adoption growth. Of labeled apps, 41.8% claimed “no data collection.” 98% of labels remained unchanged after initial deployment, and the changes that did occur predominantly added new data collection disclosures, indicating widespread “set-and-forget” behavior. Although the study does not analyze app behavior directly, the authors observe that many “no collection” declarations likely stem from the requirement to select some label rather than from actual data practices, which undermines label transparency and users’ capacity for informed decision-making.
📝 Abstract
In December of 2020, Apple started to require app developers to self-report privacy label annotations on their apps indicating what data is collected and how it is used. To understand the adoption and shifts in privacy labels in the App Store, we collected nearly weekly snapshots of over 1.6 million apps for over a year (July 15, 2021 to October 25, 2022), capturing the dynamics of the privacy label ecosystem. Nearly two years after privacy labels launched, only 70.1% of apps have privacy labels, but we observed an increase of 28% during the measurement period. Privacy label adoption rates are mostly driven by new apps rather than older apps coming into compliance. Of apps with labels, 18.1% collect data used to track users, 38.1% collect data that is linked to a user identity, and 42.0% collect data that is not linked. A surprisingly large share (41.8%) of apps with labels indicate that they do not collect any data, and while we do not perform direct analysis of the apps to verify this claim, we observe that it is likely that many of these apps are choosing a Does Not Collect label due to being forced to select a label, rather than this being the true behavior of the app. Moreover, of the apps that were assigned labels during the measurement period, nearly all do not change their labels, and when they do, the new labels indicate more data collection rather than less. This suggests that privacy labels may be a "set once" mechanism for developers that may not actually provide users with the clarity needed to make informed privacy decisions.
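The four label-type shares in the abstract sum to 140%, which is possible because the categories are not mutually exclusive: a single app's label can declare data used for tracking, data linked to the user, and data not linked to the user at the same time. The following Python sketch illustrates how an adoption rate and per-type shares could be computed from one snapshot. The snapshot structure, app identifiers, and type names are hypothetical and are not taken from the paper's pipeline.

```python
from collections import Counter

# Hypothetical snapshot: app id -> set of privacy types on its label,
# or None when the app has no label yet. The data is made up.
snapshot = {
    "app.a": {"tracking", "linked"},
    "app.b": {"not_linked"},
    "app.c": {"not_collected"},
    "app.d": None,
    "app.e": {"linked", "not_linked"},
}


def adoption_rate(snapshot):
    """Fraction of all apps in the snapshot that have any label."""
    labeled = [types for types in snapshot.values() if types is not None]
    return len(labeled) / len(snapshot)


def type_shares(snapshot):
    """Share of labeled apps declaring each privacy type.

    Shares can sum to more than 1 because one app may declare
    several types at once.
    """
    labeled = [types for types in snapshot.values() if types is not None]
    counts = Counter(ptype for types in labeled for ptype in types)
    return {ptype: n / len(labeled) for ptype, n in counts.items()}


rate = adoption_rate(snapshot)
shares = type_shares(snapshot)
print(f"adoption: {rate:.1%}")
for ptype, share in sorted(shares.items()):
    print(f"{ptype}: {share:.1%}")
```

Note that the shares are computed over labeled apps only, matching the abstract's "of apps with labels" phrasing, while the adoption rate is computed over all apps.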
Problem

Research questions and friction points this paper is trying to address.

Analyzing adoption rates of Apple App Store privacy labels
Assessing the plausibility of self-reported data collection claims, without direct analysis of app behavior
Assessing label update frequency and transparency for users
Innovation

Methods, ideas, or system contributions that make the work stand out.

Longitudinal analysis of 1.6M apps' privacy labels
Tracked 28% increase in label adoption rates
Identified 'set once' behavior in label updates
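As an illustration of how "set once" behavior might be detected, the Python sketch below classifies each app's label history across chronological snapshots. It is a minimal sketch under stated assumptions, not the authors' method: the history format, the data category names, and the `classify_app` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical per-app histories: one entry per snapshot, in time order.
# Each entry is the set of data categories the label declares, or None
# when the app had no label in that snapshot. The data is made up.
histories = {
    "app.a": [None, {"location"}, {"location"}],
    "app.b": [{"contact_info"}, {"contact_info", "identifiers"}],
    "app.c": [{"location", "identifiers"}, {"identifiers"}],
    "app.d": [None, None],
}


def classify_app(history):
    """Classify how an app's label changed once it had one."""
    labeled = [entry for entry in history if entry is not None]
    if not labeled:
        return "unlabeled"
    first, last = labeled[0], labeled[-1]
    if all(entry == first for entry in labeled):
        return "unchanged"  # label was set once and never touched
    if first < last:
        return "more"       # final label is a strict superset of the first
    if last < first:
        return "less"       # final label is a strict subset of the first
    return "mixed"          # categories both added and removed, or reverted


summary = Counter(classify_app(h) for h in histories.values())
print(dict(summary))
```

Comparing only the first and last labels is a simplification; a fuller analysis would likely examine every consecutive pair of snapshots.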
David G. Balash
University of Richmond
M. M. Ali
University of Illinois Chicago
Xiaoyuan Wu
Carnegie Mellon University
Chris Kanich
University of Illinois at Chicago
Internet security, Socio-technical cybersecurity, Attacker capabilities and motivations, Internet measurement
Adam J. Aviv
The George Washington University