🤖 AI Summary
This study examines how consistently large social media platforms report content moderation data to the EU's Digital Services Act (DSA) Transparency Database. Analyzing 353 million moderation records submitted by the eight largest platforms during the database's first 100 days, and cross-validating them against the platforms' own transparency reports, the authors combine large-scale data cleaning, multi-source consistency checks, and cross-platform comparative analysis. The analysis reveals, for the first time, a systemic misalignment between the database's structural design and platform-level implementation practices: platforms diverge markedly in their moderation actions, a substantial fraction of records is inconsistent, and X (formerly Twitter) exhibits the highest inconsistency rate. These findings provide an empirical benchmark for refining DSA regulation, inform targeted improvements to the Transparency Database architecture, and offer methodological guidance for research on digital platform governance.
📝 Abstract
Since September 2023, the Digital Services Act (DSA) has obliged large online platforms to submit detailed data on each moderation action they take within the European Union (EU) to the DSA Transparency Database. From its inception, this centralized database has sparked scholarly interest as an unprecedented and potentially unique trove of data on real-world online moderation. Here, we thoroughly analyze all 353.12M records submitted by the eight largest social media platforms in the EU during the first 100 days of the database. Specifically, we conduct a platform-wise comparative study of their volume of moderation actions, grounds for decisions, types of applied restrictions, types of moderated content, timeliness in undertaking and submitting moderation actions, and use of automation. Furthermore, we systematically cross-check the contents of the database against the platforms' own transparency reports. Our analyses reveal that (i) the platforms adhered only in part to the philosophy and structure of the database, (ii) the structure of the database is partially inadequate for the platforms' reporting needs, (iii) the platforms exhibited substantial differences in their moderation actions, (iv) a substantial fraction of the database is inconsistent, and (v) the platform X (formerly Twitter) presents the most inconsistencies. Our findings have far-reaching implications for policymakers and scholars across diverse disciplines. They offer guidance for future regulations that cater to the reporting needs of online platforms, and they highlight opportunities to improve and refine the database itself.