DeTox-Fed: Detecting Toxic Conversations in the Fediverse with Federated Graph Neural Networks

📅 2026-05-20

🤖 AI Summary
This work addresses toxic-content detection in decentralized social networks such as the Fediverse, where data silos across instances, heterogeneous moderation policies, and partial visibility of conversations hinder toxicity identification. The paper proposes a federated graph-learning framework in which each instance builds a local conversation graph and the instances collaboratively train a graph neural network without sharing raw data or labels. The approach combines conversational structure, user-interaction patterns, conversation-level statistics, and aggregate sentiment signals, so detection does not rely on text alone and data stays local. Experiments on a large Pleroma conversation dataset show stable toxic-conversation detection under limited local labels, partial client participation, and varying moderation thresholds.
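The collaborative training step described above can be illustrated with a minimal sketch. The listing does not state which aggregation rule the paper uses, so this example assumes plain federated averaging weighted by local example counts. The names `fed_avg` and `sample_clients` are hypothetical, and model weights are simplified to flat lists of floats.

```python
import random


def sample_clients(client_ids, fraction, rng):
    """Pick a subset of instances for one round (partial client participation).

    Assumption: uniform sampling without replacement, at least one client.
    """
    ordered = sorted(client_ids)
    k = max(1, round(fraction * len(ordered)))
    return rng.sample(ordered, k)


def fed_avg(client_updates):
    """Average client weight vectors, weighted by local example counts.

    client_updates: list of (weights, num_examples) pairs, where weights is
    a flat list of floats. Only weights and counts leave an instance; raw
    conversations and moderation labels are never passed in.
    """
    total = sum(n for _, n in client_updates)
    if total == 0:
        raise ValueError("no training examples reported by any client")
    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, n in client_updates:
        for i, w in enumerate(weights):
            averaged[i] += w * n / total
    return averaged


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical instance names, used only for illustration.
    chosen = sample_clients(["inst-a", "inst-b", "inst-c", "inst-d"], 0.5, rng)
    print("participating this round:", chosen)
    print(fed_avg([([1.0, 2.0], 1), ([3.0, 4.0], 3)]))
```

Weighting by example count means an instance with few labeled conversations still contributes, but does not dominate the shared model.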
📝 Abstract
The rise of decentralized social networks (DSNs), and in particular the rapid uptake of the Fediverse (e.g., Pleroma, Mastodon, Lemmygrad), introduces new challenges in content moderation. Independent instances host their own data, follow different moderation policies, and often observe only partial views of conversations. We present DeTox-Fed, a federated graph-learning framework for detecting toxic conversations in DSNs without requiring instances to share raw conversations or moderation labels. Each instance constructs a local conversation graph, where nodes represent conversation trees and edges capture shared user participation across conversations. A Graph Neural Network (GNN) is then trained in a federated learning setup, allowing instances to collaboratively learn a toxicity classifier while preserving data locality. Unlike text-only moderation approaches, DeTox-Fed combines conversational structure, user-interaction patterns, conversation-level statistics, and aggregate sentiment signals. We evaluate the framework on a large Pleroma conversation dataset and show that it achieves stable toxic conversation detection under limited local labels, partial client participation, and varying moderation thresholds. Our results indicate that federated graph-based moderation is a promising direction for semi-automated moderation in decentralized social networks.
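The local graph construction in the abstract (nodes are conversation trees, edges capture shared user participation) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `build_conversation_graph`, the input layout, and the choice to weight each edge by the number of shared users are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_conversation_graph(participants):
    """Build an undirected conversation graph for one instance.

    participants: dict mapping conversation_id -> set of user ids seen in
    that conversation tree on this instance.
    Returns a dict {(conv_a, conv_b): shared_user_count} with conv_a < conv_b.
    Assumption: edge weight = number of users active in both conversations.
    """
    # Invert the mapping: for each user, which conversations did they join?
    conversations_by_user = defaultdict(set)
    for conv_id, users in participants.items():
        for user in users:
            conversations_by_user[user].add(conv_id)

    # Every pair of conversations sharing a user gets (or strengthens) an edge.
    edges = defaultdict(int)
    for conv_ids in conversations_by_user.values():
        for a, b in combinations(sorted(conv_ids), 2):
            edges[(a, b)] += 1
    return dict(edges)


if __name__ == "__main__":
    # Hypothetical toy data for a single instance.
    local_view = {
        "c1": {"alice", "bob"},
        "c2": {"bob", "carol"},
        "c3": {"dave"},
        "c4": {"alice", "bob"},
    }
    for edge, weight in sorted(build_conversation_graph(local_view).items()):
        print(edge, weight)
```

Conversations with no shared participants, such as `c3` here, remain isolated nodes. Node features (conversation-level statistics, aggregate sentiment) would be attached separately before a GNN is trained on the graph.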
Problem

Research questions and friction points this paper is trying to address.

decentralized social networks
toxic conversation detection
content moderation
Fediverse
data locality
Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated Learning
Graph Neural Networks
Decentralized Social Networks
Toxic Conversation Detection
Content Moderation