SoK: Analysis of Privacy Risks and Mitigation in Online Propaganda Detection through the PROMPT Framework

📅 2026-04-20
📈 Citations: 0
Influential: 0

🤖 AI Summary
This study addresses significant privacy risks in online propaganda detection pipelines, particularly during data collection, feature extraction, and model inference, where many pipelines fail to comply with regulations such as GDPR and CCPA. Through a structured analysis of 162 peer-reviewed publications, the authors propose PROMPT, a framework that formally maps privacy risks to corresponding mitigation strategies. The mapping is guided by a utility function that weighs privacy gain against performance loss and deployment cost, and a separate compliance score quantifies how well existing methods align with regulatory requirements. Fine-tuning experiments on Transformer-based encoders and decoders under synthetic perturbation show that at perturbation parameter q=0.05, F₁ declines by only 1–2%, whereas at q=0.20 the drop reaches 13–14%. The analysis further finds that many widely used pipelines remain non-compliant, particularly in metadata handling and user-level aggregation.

📝 Abstract
Online propaganda detection pipelines expose measurable privacy risks at multiple stages, including data collection, feature extraction, and model inference. We conduct a structured analysis of $162$ peer-reviewed studies and formalize the problem using the Propaganda Risk Online Mitigation and Privacy-preserving Tactics (PROMPT) framework. PROMPT models risks $R$ and mitigation strategies $S$ through a mapping $M: R\to S$ guided by a utility function $\alpha\cdot \mathrm{PrivacyGain}(s_j) - \beta\cdot \mathrm{PerfLoss}(s_j) - \gamma\cdot \mathrm{Cost}(s_j)$, with tunable weights $(\alpha,\beta,\gamma)$ that let stakeholders balance privacy, accuracy, and deployment costs. To assess practical adoption, we introduce a compliance score that quantifies how well existing methods align with the requirements of GDPR, CCPA, and similar regulations. Our evaluation shows that many widely used pipelines remain non-compliant, particularly in metadata handling and user-level aggregation. We further present empirical fine-tuning experiments on transformer-based encoders and decoders under synthetic perturbation, demonstrating a monotonic privacy-utility trade-off: at $q = 0.05$, F$_1$ decreased by 1-2%, while at $q = 0.20$ the reduction reached 13-14%. These results establish quantitative baselines for privacy costs in propaganda detection. Our contributions include a formal risk-to-defense mapping, a compliance-oriented auditing metric, and experimental evidence of privacy-performance trade-offs, providing a technical foundation for building regulation-compliant and privacy-aware detection systems.
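
To make the role of the weights $(\alpha,\beta,\gamma)$ concrete, the following is a minimal sketch of one way the mapping $M: R\to S$ could be read: for each risk, pick the candidate strategy with the highest utility. The abstract does not say how $M$ is actually computed, so the argmax rule is an assumption, and every risk name, strategy name, and numeric score below is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    """A candidate mitigation s_j with hypothetical, normalized scores."""
    name: str
    privacy_gain: float  # PrivacyGain(s_j)
    perf_loss: float     # PerfLoss(s_j)
    cost: float          # Cost(s_j)


def utility(s: Strategy, alpha: float, beta: float, gamma: float) -> float:
    """alpha * PrivacyGain(s) - beta * PerfLoss(s) - gamma * Cost(s)."""
    return alpha * s.privacy_gain - beta * s.perf_loss - gamma * s.cost


def map_risks(candidates, alpha, beta, gamma):
    """Assumed reading of M: R -> S: choose the max-utility strategy per risk."""
    return {
        risk: max(strategies, key=lambda s: utility(s, alpha, beta, gamma))
        for risk, strategies in candidates.items()
    }


# Hypothetical risks and candidate strategies; none of these values
# come from the paper.
CANDIDATES = {
    "metadata_leakage": [
        Strategy("strip_metadata", privacy_gain=0.8, perf_loss=0.05, cost=0.1),
        Strategy("no_mitigation", privacy_gain=0.0, perf_loss=0.0, cost=0.0),
    ],
    "user_level_aggregation": [
        Strategy("text_perturbation", privacy_gain=0.6, perf_loss=0.13, cost=0.2),
        Strategy("pseudonymize_ids", privacy_gain=0.5, perf_loss=0.02, cost=0.3),
    ],
}

if __name__ == "__main__":
    # Compare a privacy-leaning and an accuracy-leaning weighting.
    for label, weights in [("privacy-first", (1.0, 0.5, 0.5)),
                           ("accuracy-first", (1.0, 3.0, 0.5))]:
        chosen = map_risks(CANDIDATES, *weights)
        print(label, {risk: s.name for risk, s in chosen.items()})
```

With these illustrative numbers, raising $\beta$ shifts the choice for the second risk from the strategy with the larger privacy gain to the one with the smaller performance loss, which is the kind of stakeholder-controlled balance the tunable weights are meant to allow.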
Problem

Research questions and friction points this paper is trying to address.

privacy risks
online propaganda detection
data compliance
privacy-utility trade-off
metadata handling
Innovation

Methods, ideas, or system contributions that make the work stand out.

PROMPT framework
privacy-utility trade-off
compliance score
propaganda detection
privacy-preserving machine learning