Tracker Installations Are Not Created Equal: Understanding Tracker Configuration of Form Data Collection

📅 2025-06-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study examines privacy risks that arise when Google and Meta trackers are configured to automatically collect personally identifiable information (PII) from web forms. Method: Using a multi-method approach that includes web frontend reverse engineering, large-scale automated crawling, form interaction detection, policy rule matching, and HTTP attribution measurement, the study analyzes 40,150 websites. Contribution/Results: Google trackers are more prevalent than Meta trackers (72.6% vs. 28.2% of websites), but Meta trackers are configured to collect form data far more often (62.3% vs. 11.6%). Both companies encourage form data collection and make inaccurate statements about hashing PII as a privacy-preserving method, and Meta includes form data collection in its basic setup flow. The study also identifies sensitive finance and health websites whose trackers are likely configured to collect form data PII in violation of Meta and Google policies. Together, these findings suggest that tracker documentation and interfaces can shape end-user privacy through the configuration choices of website administrators, which gives regulators and privacy-focused designers empirical grounding to work from.

📝 Abstract
Targeted advertising is fueled by the comprehensive tracking of users' online activity. As a result, advertising companies, such as Google and Meta, encourage website administrators not only to install tracking scripts on their websites but also to configure them to automatically collect users' Personally Identifying Information (PII). In this study, we aim to characterize how Google and Meta's trackers can be configured to collect PII from web forms. We first perform a qualitative analysis of how these third parties present form data collection to website administrators in their documentation and user interfaces. We then perform a measurement study of 40,150 websites to quantify the prevalence and configuration of Google and Meta trackers. Our results reveal that both Meta and Google encourage the use of form data collection and include inaccurate statements about hashing PII as a privacy-preserving method. Additionally, we find that Meta includes configuring form data collection as part of the basic setup flow. Our large-scale measurement study reveals that while Google trackers are more prevalent than Meta trackers (72.6% vs. 28.2% of websites), Meta trackers are configured to collect form data more frequently (62.3% for Meta vs. 11.6% for Google). Finally, we identify sensitive finance and health websites that have installed trackers that are likely configured to collect form data PII in violation of Meta and Google policies. Our study highlights how tracker documentation and interfaces can potentially play a role in users' privacy through the configuration choices made by the website administrators who install trackers.
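
The abstract notes that the trackers' documentation describes hashing PII as privacy-preserving, and that these statements are inaccurate. The sketch below illustrates one commonly cited reason, under stated assumptions: an unsalted, deterministic hash of a normalized email address acts as a stable identifier, and anyone holding a list of known addresses can match against it. The choice of SHA-256, the trim-and-lowercase normalization, and all function names are assumptions made for illustration, not details taken from the study or from either company's tracker.

```python
import hashlib
from typing import Iterable, Optional


def normalize_email(email: str) -> str:
    """Assumed normalization: trim whitespace and lowercase."""
    return email.strip().lower()


def hash_email(email: str) -> str:
    """Unsalted SHA-256 of the normalized address (illustrative assumption)."""
    normalized = normalize_email(email)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def reidentify(target_hash: str, known_emails: Iterable[str]) -> Optional[str]:
    """Recover the address behind a hash by hashing each known candidate.

    This works because the hash is deterministic and unsalted: the same
    address always produces the same digest.
    """
    for candidate in known_emails:
        if hash_email(candidate) == target_hash:
            return normalize_email(candidate)
    return None


# A hypothetical value captured from a web form, then hashed before sending.
collected_hash = hash_email("  Alice@Example.com ")

# A party that already holds a list of addresses can match the hash to a person.
known_addresses = ["bob@example.com", "alice@example.com"]
match = reidentify(collected_hash, known_addresses)
print(match)
```

Because the digest is the same wherever the same address is entered, it can also link a user's activity across sites without ever being reversed, which is why hashing alone does not make the collected value anonymous.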
Problem

Research questions and friction points this paper is trying to address.

Analyze Google and Meta tracker configurations for PII collection
Measure prevalence of form data collection on 40,150 websites
Identify policy violations in sensitive finance and health sites
Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyze Google and Meta tracker configurations
Measure form data collection prevalence
Identify PII collection policy violations