🤖 AI Summary
Tor remains vulnerable to website fingerprinting (WF) attacks, which achieve high accuracy even in open-world settings, while existing defenses struggle to balance privacy, performance, and usability; moreover, no comprehensive survey unifies the field. This paper organizes WF research into three core areas: datasets, attack models, and defense mechanisms. Through taxonomic classification and comparative analysis across threat models, feature granularity, and emerging challenges such as multi-tab browsing, it evaluates the trade-offs of state-of-the-art methods. The survey spans the full technical stack, from machine learning-based attacks to defenses such as traffic regularization, traffic morphing, adversarial perturbation, and adaptive padding. The analysis identifies key bottlenecks and practical deployment constraints, and distills open problems and future research directions, offering a foundation and practical guidance for strengthening Tor's privacy protection.
📝 Abstract
The Tor network provides users with strong anonymity by routing their internet traffic through multiple relays. While Tor encrypts traffic and hides IP addresses, it remains vulnerable to traffic analysis attacks such as website fingerprinting (WF), which achieves increasingly high accuracy even under open-world conditions. In response, researchers have proposed a variety of defenses, ranging from adaptive padding, traffic regularization, and traffic morphing to adversarial perturbation, that seek to obfuscate or reshape traffic traces. However, these defenses often entail trade-offs among privacy, usability, and system performance. Despite extensive research, a comprehensive survey unifying WF datasets, attack methodologies, and defense strategies remains absent. This paper fills that gap by systematically categorizing existing WF research into three key domains: datasets, attack models, and defense mechanisms. We provide an in-depth comparative analysis of techniques, highlight their strengths and limitations under diverse threat models, and discuss emerging challenges such as multi-tab browsing and coarse-grained traffic features. By consolidating prior work and identifying open research directions, this survey serves as a foundation for advancing stronger privacy protection in Tor.