SST-Guard: Detecting and Characterizing Server-Side Google Analytics in the Wild

📅 2026-04-30
📈 Citations: 0
Influential: 0

🤖 AI Summary
This study addresses server-side Google Analytics (sGA), which evades existing anti-tracking mechanisms because the browser sends tracking requests to an intermediate endpoint rather than to known tracker endpoints. To counter this, the authors propose SST-Guard, a browser-based system that uses value templates (regular expressions over semantic values) to detect sGA across three modalities: network requests, cookies, and the window object. By matching semantic cues such as identifiers and event metadata sent to any endpoint, SST-Guard detects and blocks sGA without relying on a list of known tracking endpoints, a design intended to withstand endpoint customization and URL/payload obfuscation. Evaluation on the Tranco top-10k websites identifies 403 sGA domains (4.02%) with over 93% accuracy across the three modalities, and deployment on the Tranco top-150k websites finds 6,314 (4.21%) using sGA.
📝 Abstract
As web browsers increasingly restrict client-side tracking, the web tracking ecosystem is shifting from client-side to server-side tracking (SST). In SST, the browser sends tracking requests to an intermediate endpoint, which then forwards them to the tracker's endpoint, eliminating direct client-to-tracker requests. As a result, existing tracking protections that block requests to known tracker endpoints are rendered ineffective. In this paper, we investigate the server-side implementation of Google Analytics, the most widely deployed third-party tracking service on the web today. We also present SST-Guard, a multi-modal, browser-based system for detecting and blocking server-side Google Analytics (sGA). Our key insight is that even when the tracker's endpoints change, sGA must necessarily still collect and share the same semantic information as client-side Google Analytics (e.g., identifiers, event metadata). Therefore, rather than detecting requests to known Google Analytics endpoints, SST-Guard aims to detect underlying artifacts of collection and sharing of these semantic values to any arbitrary endpoint. Operationalizing this insight is challenging because real-world sGA deployments commonly customize endpoints and obfuscate URLs/payloads. SST-Guard addresses this challenge using a value-template approach that employs regular expressions to match semantic value patterns across multiple modalities: network requests, cookies, and the window object. We validate SST-Guard on Tranco top-10k websites, detecting 4.02% (403) sGA domains with over 93% accuracy across three modalities, with the network request classifier demonstrating the highest accuracy (99.8%). By deploying SST-Guard in the wild, we find 4.21% (6,314) of Tranco top-150k websites using sGA.
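The value-template idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the template names, the regular expressions, and the helper functions `match_values`, `scan_request`, and `scan_cookies` are all hypothetical, and the patterns only approximate the commonly seen shapes of Google Analytics identifiers. The point is that matching happens on the values themselves, so the host name and parameter names can be arbitrary.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Hypothetical value templates (assumptions for illustration, not the paper's rules).
VALUE_TEMPLATES = {
    # Client ID shape: <random digits>.<10-digit Unix timestamp>
    "client_id": re.compile(r"\d{6,10}\.\d{10}"),
    # GA4-style measurement ID shape: G-XXXXXXXXXX
    "measurement_id": re.compile(r"G-[A-Z0-9]{6,12}"),
    # _ga-style cookie value shape: GA1.<n>.<client id>
    "ga_cookie": re.compile(r"GA1\.\d\.\d{6,10}\.\d{10}"),
}


def match_values(values):
    """Return the names of all templates fully matched by at least one value."""
    hits = set()
    for value in values:
        for name, pattern in VALUE_TEMPLATES.items():
            if pattern.fullmatch(value):
                hits.add(name)
    return hits


def scan_request(url):
    """Check query-parameter values only; host and parameter names are ignored."""
    query = urlsplit(url).query
    return match_values(value for _, value in parse_qsl(query))


def scan_cookies(cookies):
    """Check cookie values only; cookie names are ignored."""
    return match_values(cookies.values())


# A made-up first-party endpoint with renamed parameters and a renamed cookie.
request_hits = scan_request(
    "https://metrics.example.com/c?a=G-ABC1234567&b=1234567890.1700000000&c=page_view"
)
cookie_hits = scan_cookies({"_x": "GA1.1.1234567890.1700000000"})
```

In this sketch the request is flagged even though neither the host nor the parameter names resemble Google Analytics. A real detector would also need to handle encoded or obfuscated payloads and the window-object modality, which this example leaves out.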
Problem

Research questions and friction points this paper is trying to address.

server-side tracking
Google Analytics
web tracking
tracking detection
privacy
Innovation

Methods, ideas, or system contributions that make the work stand out.

server-side tracking
Google Analytics detection
semantic value patterns
multi-modal analysis
value-template approach
🔎 Similar Papers