🤖 AI Summary
Problem: Existing automated auditing tools for GDPR/CCPA compliance do not systematically support the diverse consent forms found on the web beyond cookie banners, so they cannot verify core legal requirements such as freedom of choice, purpose specification, and ease of withdrawal.
Method: This paper introduces Cosmic, the first end-to-end automated framework for detecting GDPR consent violations across heterogeneous web forms. Cosmic integrates DOM parsing, OCR-enhanced form recognition, legal semantic modeling, and structured joint reasoning over form elements to enable interpretable, requirement-specific validation.
Contribution/Results: Evaluated on 5,823 websites and 3,598 forms, Cosmic achieves a true positive rate (TPR) of 98.6% for consent detection and 99.1% for violation detection. It detects 3,384 violations, found on 94.1% of consent forms, covering freely given consent, purpose disclosure, and withdrawal options. Cosmic addresses a gap in form-level consent auditing left open by prior work on cookie banners and mobile app dialogs, and supports automated, explainable privacy compliance assessment.
📝 Abstract
Recent privacy regulations such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) have established legal requirements for obtaining user consent to the collection, use, and sharing of personal data. These regulations emphasize that consent must be informed, freely given, specific, and unambiguous. Violations nevertheless remain common, which highlights a gap between legal expectations and actual implementation. Consent mechanisms embedded in functional web forms play a critical role in ensuring compliance with these regulations and in upholding user autonomy and trust, yet current research has focused primarily on cookie banners and mobile app dialogs. Web forms are diverse in structure, vary in legal basis, and are often difficult to locate or evaluate, which makes automated consent compliance auditing a significant challenge. In this work, we present Cosmic, a novel automated framework for detecting consent-related privacy violations in web forms. We evaluate Cosmic across 5,823 websites and 3,598 forms. It detects 3,384 violations on 94.1% of consent forms, covering key GDPR principles such as freely given consent, purpose disclosure, and withdrawal options. It achieves true positive rates (TPR) of 98.6% for consent detection and 99.1% for violation detection, demonstrating high accuracy and real-world applicability.
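To make the idea of a form-level consent check concrete, the following is a minimal illustrative sketch and not Cosmic's actual implementation. It uses only Python's standard `html.parser` to flag consent-like checkboxes that are pre-checked inside a `<form>`; GDPR Recital 32 states that pre-ticked boxes do not constitute consent. The keyword list `CONSENT_HINTS`, the class `ConsentCheckboxScanner`, and the function `find_prechecked_consent` are hypothetical names introduced here for illustration, and matching on attribute names is a crude stand-in for the semantic analysis a real auditor would need.

```python
from html.parser import HTMLParser

# Hypothetical keyword list (an assumption for this sketch); a real
# auditor would need a much richer model of consent language.
CONSENT_HINTS = ("consent", "agree", "newsletter", "marketing", "privacy")


class ConsentCheckboxScanner(HTMLParser):
    """Records every checkbox found inside a <form> element."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.findings = []  # one dict per checkbox seen inside a form

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "form":
            self.in_form = True
            return
        is_checkbox = (attr.get("type") or "").lower() == "checkbox"
        if tag == "input" and self.in_form and is_checkbox:
            name = (attr.get("name") or attr.get("id") or "").lower()
            self.findings.append({
                "name": name,
                # Crude heuristic: does the field name look consent-related?
                "consent_like": any(hint in name for hint in CONSENT_HINTS),
                # A bare `checked` attribute appears as a key with value None.
                "prechecked": "checked" in attr,
            })

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def find_prechecked_consent(html):
    """Return names of consent-like checkboxes that are checked by default."""
    scanner = ConsentCheckboxScanner()
    scanner.feed(html)
    return [f["name"] for f in scanner.findings
            if f["consent_like"] and f["prechecked"]]


# Made-up sample markup, used only to exercise the sketch.
SAMPLE = """
<form action="/signup">
  <input type="email" name="email">
  <input type="checkbox" name="marketing_consent" checked>
  <input type="checkbox" name="terms_agree">
  <input type="checkbox" name="remember_me" checked>
</form>
"""

if __name__ == "__main__":
    print(find_prechecked_consent(SAMPLE))
```

On the sample markup, only `marketing_consent` is reported: `terms_agree` is consent-like but unchecked, and `remember_me` is pre-checked but does not match any hint. A static check like this covers a single requirement on server-rendered HTML; forms built by scripts, rendered as images, or lacking descriptive attribute names would need the rendering, OCR, and semantic modeling steps that the summary attributes to Cosmic.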