GRAPHIA: Harnessing Social Graph Data to Enhance LLM-Based Social Simulation

📅 2025-10-28
📈 Citations: 0
Influential: 0

🤖 AI Summary
To address the underuse of social graphs as supervision signals in LLM-based social simulation, this paper proposes Graphia, which it describes as the first general LLM-based social graph simulation framework. Graphia uses real-world graph data as supervision for reinforcement-learning post-training with GNN-based structural rewards, and it models individual interactions and network evolution jointly through a two-stage generation process: destination node selection followed by edge content generation. Evaluation is aligned at two levels: node-level interaction metrics (including BERTScore for edge content) and network-level metrics for emergent properties such as power-law distributions. On three real-world social networks, Graphia improves over the strongest baseline by 6.1% in the composite destination selection score, 12% in edge classification accuracy, and 27.9% in edge content BERTScore; at the macro level, structural similarity is 41.11% higher and replication of social phenomena improves by 32.98%. The framework also supports counterfactual simulation of behavioral shifts under platform incentives.
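The summary mentions power-law fitting as one of the network-level checks but does not say how the fit is computed. As a rough illustration only, the sketch below estimates a power-law exponent from a degree sequence with the standard continuous-approximation maximum-likelihood estimator, then compares a real and a simulated edge list. The function names (`degree_sequence`, `powerlaw_alpha`, `alpha_gap`) and the choice of `x_min` are assumptions made for this example, not part of Graphia.

```python
import math
from collections import Counter


def degree_sequence(edges):
    """Return the list of node degrees for an undirected edge list."""
    deg = Counter()
    for src, dst in edges:
        deg[src] += 1
        deg[dst] += 1
    return list(deg.values())


def powerlaw_alpha(degrees, x_min=1):
    """Estimate the power-law exponent of the degree tail (degrees >= x_min).

    Uses the approximate MLE for discrete data:
        alpha ~= 1 + n / sum(ln(d_i / (x_min - 0.5)))
    This is an illustrative estimator, not necessarily the paper's procedure.
    """
    tail = [d for d in degrees if d >= x_min]
    if not tail:
        raise ValueError("no degrees at or above x_min")
    log_sum = sum(math.log(d / (x_min - 0.5)) for d in tail)
    return 1.0 + len(tail) / log_sum


def alpha_gap(real_edges, sim_edges, x_min=1):
    """Absolute difference between the fitted exponents of two graphs."""
    real_alpha = powerlaw_alpha(degree_sequence(real_edges), x_min)
    sim_alpha = powerlaw_alpha(degree_sequence(sim_edges), x_min)
    return abs(real_alpha - sim_alpha)


if __name__ == "__main__":
    real = [(0, 1), (0, 2), (0, 3), (1, 2)]
    simulated = [(0, 1), (0, 2), (0, 3)]
    print(f"exponent gap: {alpha_gap(real, simulated):.3f}")
```

A smaller gap suggests the simulated degree distribution has a tail closer to the real one; a fuller check would also choose `x_min` from the data and test goodness of fit.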

📝 Abstract
Large language models (LLMs) have shown promise in simulating human-like social behaviors. Social graphs provide high-quality supervision signals that encode both local interactions and global network structure, yet they remain underutilized for LLM training. To address this gap, we propose Graphia, the first general LLM-based social graph simulation framework that leverages graph data as supervision for LLM post-training via reinforcement learning. With GNN-based structural rewards, Graphia trains specialized agents to predict whom to interact with (destination selection) and how to interact (edge generation), followed by designed graph generation pipelines. We evaluate Graphia under two settings: Transductive Dynamic Graph Generation (TDGG), a micro-level task with our proposed node-wise interaction alignment metrics; and Inductive Dynamic Graph Generation (IDGG), a macro-level task with our proposed metrics for aligning emergent network properties. On three real-world networks, Graphia improves micro-level alignment by 6.1% in the composite destination selection score, 12% in edge classification accuracy, and 27.9% in edge content BERTScore over the strongest baseline. For macro-level alignment, it achieves 41.11% higher structural similarity and 32.98% better replication of social phenomena such as power laws and echo chambers. Graphia also supports counterfactual simulation, generating plausible behavioral shifts under platform incentives. Our results show that social graphs can serve as high-quality supervision signals for LLM post-training, closing the gap between agent behaviors and network dynamics for LLM-based simulation. Code is available at https://github.com/Ji-Cather/Graphia.git.
Problem

Research questions and friction points this paper is trying to address.

Leveraging social graph data to enhance LLM-based social simulation
Training specialized agents for interaction destination selection and generation
Aligning agent behaviors with network dynamics through structural supervision
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses social graph data for LLM supervision
Trains agents via reinforcement learning with GNN rewards
Generates graphs with destination selection and edge generation
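The two-stage process listed above (pick a destination, then generate the edge) can be sketched as a simple loop. This is a minimal illustration under stated assumptions: the two policies are plain Python callables standing in for the trained LLM agents, and `Edge`, `random_destination`, `template_content`, and `generate_edges` are hypothetical names that do not come from the Graphia codebase.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class Edge:
    src: int
    dst: int
    content: str


# Hypothetical interfaces for the two specialized agents.
DestinationPolicy = Callable[[int, Sequence[int], List[Edge]], int]
EdgePolicy = Callable[[int, int, List[Edge]], str]


def random_destination(src: int, candidates: Sequence[int], history: List[Edge]) -> int:
    """Placeholder for a trained destination-selection agent."""
    return random.choice(list(candidates))


def template_content(src: int, dst: int, history: List[Edge]) -> str:
    """Placeholder for a trained edge-generation agent."""
    return f"user {src} interacts with user {dst}"


def generate_edges(
    sources: Sequence[int],
    nodes: Sequence[int],
    choose_dst: DestinationPolicy,
    write_edge: EdgePolicy,
    history: Optional[List[Edge]] = None,
) -> List[Edge]:
    """Stage 1 selects whom to interact with; stage 2 writes the interaction."""
    past = list(history or [])
    new_edges: List[Edge] = []
    for src in sources:
        candidates = [n for n in nodes if n != src]  # no self-loops
        if not candidates:
            continue
        context = past + new_edges  # each step sees edges generated so far
        dst = choose_dst(src, candidates, context)
        content = write_edge(src, dst, context)
        new_edges.append(Edge(src, dst, content))
    return new_edges


if __name__ == "__main__":
    random.seed(0)
    for edge in generate_edges([0, 1, 2], range(4), random_destination, template_content):
        print(edge)
```

Keeping the two stages behind separate callables mirrors the split between destination selection and edge generation, so either placeholder could be replaced by a model call without changing the loop.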
🔎 Similar Papers
2024-10-06
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (System Demonstrations)
Citations: 13