A Topic-wise Exploration of the Telegram Group-verse

📅 2024-09-04
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenges of topic diversity and user behavioral heterogeneity in Telegram’s public groups. We develop the first open-source, automated data collection framework—built on Telethon—to harvest over 50 million multilingual messages from 669 public groups spanning education, politics, cryptocurrency, and adult content. We propose a fine-grained topical analysis paradigm, systematically uncovering differential propagation patterns of videos and stickers across topics for the first time. Our analysis identifies illicit content distribution pathways and counterintuitive social behaviors—e.g., significantly lower bot prevalence in adult groups than in political ones—and quantifies statistically significant cross-topic variations in linguistic diversity, bot activity, and multimedia usage intensity. The contributions include an open dataset, an analytical toolkit, and a fully reproducible methodology—providing empirical foundations and technical infrastructure for platform governance and cross-topic social behavior research.
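A collection loop of the kind the summary describes could look roughly like the sketch below. It is an illustration only, not the authors' tool: the function names `media_kind` and `collect`, the session name `"collector"`, and the record fields are assumptions made for this example. The Telethon calls (`TelegramClient`, `iter_messages`) need the third-party `telethon` package and valid Telegram API credentials.

```python
from types import SimpleNamespace


def media_kind(message):
    """Return a coarse media label for a Telethon-style message object.

    Hypothetical helper. 'document' is checked last because videos and
    stickers are also exposed as documents on Telethon messages.
    """
    for kind in ("photo", "video", "sticker", "voice", "document"):
        if getattr(message, kind, None):
            return kind
    return "text"


async def collect(group, api_id, api_hash, limit=1000):
    """Fetch up to `limit` recent messages from one public group (sketch)."""
    from telethon import TelegramClient  # third-party: pip install telethon

    records = []
    async with TelegramClient("collector", api_id, api_hash) as client:
        async for message in client.iter_messages(group, limit=limit):
            records.append({
                "id": message.id,
                "date": message.date,
                "sender_id": message.sender_id,
                "text": message.text or "",
                "media": media_kind(message),
            })
    return records


# Offline check of the helper with a stand-in object (no network needed).
fake = SimpleNamespace(photo=None, video=object(), document=object())
print(media_kind(fake))  # video
```

With credentials in hand, the coroutine would be driven with something like `asyncio.run(collect("some_public_group", API_ID, API_HASH))`, where the group name and both credential variables are placeholders.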

📝 Abstract
Although currently one of the most popular instant messaging apps worldwide, Telegram has been largely understudied in the past years. In this paper, we aim to address this gap by presenting an analysis of publicly accessible groups covering discussions encompassing different topics, as diverse as Education, Erotic, Politics, and Cryptocurrencies. We engineer and offer an open-source tool to automate the collection of messages from Telegram groups, a non-straightforward problem. We use it to collect more than 50 million messages from 669 groups. Here, we present a first-of-its-kind, per-topic analysis, contrasting the characteristics of the messages sent on the platform from different angles -- the language, the presence of bots, the type and volume of shared media content. Our results confirm some anecdotal evidence, e.g., clues that Telegram is used to share possibly illicit content, and unveil some unexpected findings, e.g., the different sharing patterns of video and stickers in groups of different topics. While preliminary, we hope that our work paves the road for several avenues of future research on the understudied Telegram platform.
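The per-topic comparison of shared media mentioned in the abstract can be pictured as a simple grouped count. The sketch below is a minimal illustration under stated assumptions, not the paper's analysis code: the function name `media_share_by_topic`, the `topic` and `media` record fields, and the sample records are all invented for this example, and the numbers are toy data rather than results from the paper.

```python
from collections import Counter, defaultdict


def media_share_by_topic(messages):
    """Return, per topic, the fraction of messages of each media kind."""
    counts = defaultdict(Counter)
    for msg in messages:
        counts[msg["topic"]][msg["media"]] += 1

    shares = {}
    for topic, counter in counts.items():
        total = sum(counter.values())
        shares[topic] = {kind: n / total for kind, n in counter.items()}
    return shares


# Toy records, invented for illustration only (not data from the paper).
sample = [
    {"topic": "Education", "media": "text"},
    {"topic": "Education", "media": "video"},
    {"topic": "Politics", "media": "sticker"},
    {"topic": "Politics", "media": "sticker"},
    {"topic": "Politics", "media": "text"},
    {"topic": "Politics", "media": "text"},
]

for topic, share in media_share_by_topic(sample).items():
    print(topic, share)
```

Normalizing by each topic's own message total keeps topics with very different volumes comparable, which matters when the groups collected per topic differ in size.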
Problem

Research questions and friction points this paper is trying to address.

Analyzes user interactions across diverse Telegram group topics
Develops open-source tool for automated Telegram message collection
Explores activity patterns and content sharing in topic-specific groups
Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed open-source tool for Telegram message collection
Analyzed 51M messages across 669 diverse topic groups
Conducted per-topic analysis of user activity patterns
A. Perlo
Politecnico di Torino, Torino, Italy
Giordano Paoletti
Politecnico di Torino, Torino, Italy
Nikhil Jha
Politecnico di Torino, Torino, Italy
L. Vassio
Politecnico di Torino, Torino, Italy
Jussara M. Almeida
Department of Computer Science, Federal University of Minas Gerais, Brazil
Social Computing, Performance Modeling, User Behavior Characterization
Marco Mellia
Politecnico di Torino, Italy
Computer Networks, Machine Learning, Cybersecurity, Data Science