The Schwurbelarchiv: a German Language Telegram dataset for the Study of Conspiracy Theories

📅 2025-04-08
🤖 AI Summary
This study addresses the scarcity of high-quality empirical resources for German-language conspiracy discourse on Telegram. We introduce the Schwurbelarchiv, the first large-scale, multimodal, German-specific conspiracy theory dataset, compiled from over 6,000 public Telegram groups and channels. It comprises 40 million text messages and over 3 million transcribed audio files. Leveraging a German-tuned ASR system, language identification, and timestamp validation, we achieve >98% confidence in German-language annotation. The dataset is validated by linguistic and temporal markers and supports joint text–speech analysis with fine-grained temporal provenance. Its primary contributions are threefold: (1) filling a critical gap in empirical research on conspiracy and extremist discourse on Telegram; (2) enabling reproducible, scalable modeling of disinformation diffusion and ideological radicalization trajectories; and (3) providing a benchmark resource for structural analysis of adversarial social networks.

📝 Abstract
Sociality borne by language, the predominant digital trace on text-based social media platforms, harbours the raw material for exploring multiple social phenomena. Distinctively, the messaging service Telegram provides functionalities that allow for socially interactive as well as one-to-many communication. Our Telegram dataset contains over 6,000 groups and channels, 40 million text messages, and over 3 million transcribed audio files, originating from a data-hoarding initiative named the "Schwurbelarchiv" (from German schwurbeln: speaking nonsense). This dataset publication details the structure, scope, and methodological specifics of the Schwurbelarchiv, emphasising its relevance for further research on German-language conspiracy theory discourse. We validate its predominantly German origin by linguistic and temporal markers and situate it within the context of similar datasets. We describe the process and extent of the transcription of multimedia files. Thanks to this effort, the dataset uniquely supports multimodal analysis of online social dynamics and content dissemination. Researchers can employ this resource to explore societal dynamics in misinformation, political extremism, opinion adaptation, and social network structures on Telegram. The Schwurbelarchiv thus offers unprecedented opportunities for investigations into digital communication and its societal implications.
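
The abstract does not spell out how the linguistic markers were computed. As a purely illustrative sketch, and not the authors' procedure, one crude way to estimate the share of German-language messages is to count very frequent German function words per message. The marker list, the `min_hits` threshold, and the function names `looks_german` and `german_share` are all assumptions made for this example; a real validation would more plausibly rely on a trained language-identification model.

```python
import re

# Hypothetical marker list: a handful of very frequent German function words.
# This is an assumption for illustration, not a list taken from the paper.
GERMAN_MARKERS = frozenset({
    "der", "die", "das", "und", "ist", "nicht", "ich", "wir",
    "mit", "für", "auf", "ein", "eine", "sich", "auch",
})

# Lowercase word tokens, including German umlauts and the sharp s.
TOKEN_RE = re.compile(r"[a-zäöüß]+")


def looks_german(text, min_hits=2):
    """Crude heuristic: True if the text contains at least `min_hits` marker tokens."""
    tokens = TOKEN_RE.findall(text.lower())
    hits = sum(1 for token in tokens if token in GERMAN_MARKERS)
    return hits >= min_hits


def german_share(messages, min_hits=2):
    """Fraction of non-empty messages that the heuristic flags as German."""
    texts = [m for m in messages if m and m.strip()]
    if not texts:
        return 0.0
    flagged = sum(looks_german(t, min_hits) for t in texts)
    return flagged / len(texts)
```

Calling `german_share` on a list of message strings returns a value between 0 and 1. Requiring at least two marker hits is a deliberate choice here: a single token such as "die" also occurs in English text, so one hit alone would be a weak signal.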

Problem

Research questions and friction points this paper is trying to address.

Study German-language conspiracy theories on Telegram
Analyze multimodal social dynamics in misinformation
Explore political extremism and opinion adaptation online

Innovation

Methods, ideas, or system contributions that make the work stand out.

Telegram dataset covering over 6,000 groups and channels
Includes 40 million text messages and over 3 million transcribed audio files
Supports multimodal analysis of social dynamics