🤖 AI Summary
Terms of Service (ToS) are notoriously lengthy and opaque, which creates information asymmetry and legal risk for users of online platforms. To address this, the authors propose TOSense, a Chrome extension that answers natural-language questions about a site's ToS in real time. The system has three components: (i) tos-crawl, a crawler for automated ToS extraction; (ii) MiniLM for semantic retrieval; and (iii) a BART encoder for answer relevance verification. The authors also introduce QEP, a Question Answering Evaluation Pipeline that generates synthetic questions and checks answers using clustered topic matching, avoiding expensive manual annotation. Experiments on five major platforms (Apple, Google, X, Microsoft, and Netflix) report up to 44.5% accuracy across varying numbers of topic clusters. The accompanying demonstration covers extraction, interactive question answering, and instant indexing of new sites.
📝 Abstract
Online services often require users to agree to lengthy and obscure Terms of Service (ToS), leading to information asymmetry and legal risks. This paper proposes TOSense, a Chrome extension that allows users to ask questions about ToS in natural language and get concise answers in real time. The system combines (i) a crawler, "tos-crawl", that automatically extracts ToS content, and (ii) a lightweight language model pipeline: MiniLM for semantic retrieval and a BART encoder for answer relevance verification. To avoid expensive manual annotation, we present a novel Question Answering Evaluation Pipeline (QEP) that generates synthetic questions and verifies the correctness of answers using clustered topic matching. Experiments on five major platforms (Apple, Google, X (formerly Twitter), Microsoft, and Netflix) show the effectiveness of TOSense, with up to 44.5% accuracy across varying numbers of topic clusters. During the demonstration, we will showcase TOSense in action. Attendees will be able to experience seamless extraction, interactive question answering, and instant indexing of new sites.
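The retrieve-then-verify flow described in the abstract can be illustrated with a minimal sketch. This is not the TOSense implementation: a toy bag-of-words vector stands in for the MiniLM embedding, a simple score threshold stands in for the BART-based relevance check, and the names `embed`, `retrieve`, `answer`, `PASSAGES`, and the threshold value are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical ToS passages, as a crawler might return them.
PASSAGES = [
    "You may cancel your subscription at any time from your account settings.",
    "We collect usage data to improve our services.",
    "Disputes are resolved through binding arbitration.",
]

def embed(text):
    """Toy bag-of-words vector (assumption: stand-in for a MiniLM embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(question, passages):
    """Retrieval stage: return the best-matching passage and its score."""
    q = embed(question)
    return max(((p, cosine(q, embed(p))) for p in passages), key=lambda x: x[1])

def answer(question, passages, threshold=0.1):
    """Verification stage (assumption: a threshold replaces the BART check).

    Returns the retrieved passage, or None when it is judged irrelevant.
    """
    passage, score = retrieve(question, passages)
    return passage if score >= threshold else None

print(answer("How can I cancel my subscription?", PASSAGES))
print(answer("What is the weather tomorrow?", PASSAGES))
```

The first question shares terms with the cancellation passage, so that passage is returned; the second shares no terms with any passage, so the verification step rejects it and the function returns `None`. Separating retrieval from verification in this way lets a system decline to answer rather than return the nearest but irrelevant clause.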