AI Summary
This study addresses the difficulty users face in submitting high-quality reports of illegal content under the EU Digital Services Act, which stems from having to interpret complex legal provisions. To support user judgment in ambiguous scenarios, the paper proposes an "Evaluative AI" (EvalAI) interaction paradigm that presents balanced pro and con arguments grounded in multiple legal clauses. A controlled user study on a simulated reporting platform compared three conditions: no assistance, traditional explainable AI (XAI), and EvalAI. The assistants were based on large language models, and AI errors were simulated systematically. Findings indicate that when the AI is incorrect, EvalAI significantly improves the accuracy of legal clause selection and reduces misclassification bias; when the AI is correct, XAI accelerates decision-making. However, neither approach substantially enhances the quality of users' justifications.
Abstract
Illegal content reporting mechanisms are a key technical and organizational measure through which online platforms address illegal content under the European Union Digital Services Act (DSA). Article 16 requires user notices to be sufficiently substantiated and submitted in good faith, placing users in the difficult position of interpreting legal and procedural language and translating ambiguous content into legally meaningful categories and reasons. We investigate how large language model (LLM)-based assistants can support this reporting process. In a controlled user study (N = 450) using an interface modeled on a major platform reporting workflow, we compare three conditions: unaided reporting, a conventional explainable AI assistant (XAI) that suggests a single legal category with a rationale, and an evaluative AI assistant (EvalAI) that presents balanced pro and con arguments across candidate legal provisions. We further examine these assistance forms under systematically varied AI error regimes. Our results show that EvalAI improves provision-level accuracy under AI error and reduces misclassification distance relative to conventional XAI, particularly for near-miss and overbreadth errors. When AI output is correct, conventional XAI enables faster decisions, but neither AI assistance form reliably improves the quality of users' substantiated explanations relative to unaided reporting. We discuss design implications for compliance-oriented reporting interfaces, highlighting trade-offs between accuracy, deliberation, explanation quality, and vulnerability to misleading AI output.