🤖 AI Summary
Centralized social platforms face criticism for excessive control, opaque content moderation, and privacy risks, driving user migration to alternatives such as Mastodon and Threads, which now interoperate through the ActivityPub protocol. However, empirical research on cross-platform interaction remains severely hindered by the absence of large-scale, real-world data. To address this gap, we introduce the first large-scale, longitudinal dataset of cross-federated interactions, capturing the ActivityPub-mediated activities (posts, replies, and likes) of over 20,000 Threads users and over 20,000 Mastodon users across ten months, yielding >10 million structured interaction events. Our methodology features distributed node log collection, cross-instance user alignment, and standardized event annotation. Empirical analysis reveals that federation significantly enhances topical coherence (+32%) and cross-network user engagement (+47%). This open dataset fills a critical empirical void in decentralized social network interoperability research.
📝 Abstract
Traditional social media platforms, once envisioned as digital town squares, face growing criticism over corporate control, content moderation, and privacy concerns. Events such as the acquisition of Twitter (now X) and major policy changes have driven users toward alternative platforms like Mastodon and Threads. However, this diversification has led to user dispersion and fragmented discussions across isolated social media platforms. To address these issues, federation protocols like ActivityPub have been adopted, with Mastodon leading efforts to build decentralized yet interconnected networks. In March 2024, Threads joined this federation by introducing its Fediverse Sharing service, which enables interactions such as posts, replies, and likes between Threads and Mastodon users as if on a unified platform. Building on this development, we introduce FediverseSharing, the first dataset capturing interactions between 20,000+ Threads users and 20,000+ Mastodon users over a ten-month period. This dataset serves as a foundation for studying cross-platform interactions and the impact of federation as two previously separate platforms integrate.