GRAPHIA: Harnessing Social Graph Data to Enhance LLM-Based Social Simulation

📅 2025-10-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the lack of social graph supervision signals in LLM-based social simulation, this paper proposes Graphia, which it presents as the first general-purpose LLM-based social graph simulation framework. It uses real-world graph structure as a reinforcement learning supervision signal and models individual interactions and network evolution jointly through a two-stage generation process: destination node selection followed by edge content generation. Training uses a GNN-based structural reward, and evaluation aligns simulated and real graphs at two levels (node and network), with metrics that include BERTScore for edge content and power-law fitting for structural realism. On three real-world social networks, micro-level results improve over the strongest baseline by 6.1% in the composite destination selection score, 12% in edge classification accuracy, and 27.9% in edge content BERTScore; at the macro level, structural similarity rises by 41.11% and replication of social phenomena by 32.98%. The framework also supports counterfactual simulation of behavior under platform incentives.

📝 Abstract
Large language models (LLMs) have shown promise in simulating human-like social behaviors. Social graphs provide high-quality supervision signals that encode both local interactions and global network structure, yet they remain underutilized for LLM training. To address this gap, we propose Graphia, the first general LLM-based social graph simulation framework that leverages graph data as supervision for LLM post-training via reinforcement learning. With GNN-based structural rewards, Graphia trains specialized agents to predict whom to interact with (destination selection) and how to interact (edge generation), followed by designed graph generation pipelines. We evaluate Graphia under two settings: Transductive Dynamic Graph Generation (TDGG), a micro-level task with our proposed node-wise interaction alignment metrics; and Inductive Dynamic Graph Generation (IDGG), a macro-level task with our proposed metrics for aligning emergent network properties. On three real-world networks, Graphia improves micro-level alignment by 6.1% in the composite destination selection score, 12% in edge classification accuracy, and 27.9% in edge content BERTScore over the strongest baseline. For macro-level alignment, it achieves 41.11% higher structural similarity and 32.98% better replication of social phenomena such as power laws and echo chambers. Graphia also supports counterfactual simulation, generating plausible behavioral shifts under platform incentives. Our results show that social graphs can serve as high-quality supervision signals for LLM post-training, closing the gap between agent behaviors and network dynamics for LLM-based simulation. Code is available at https://github.com/Ji-Cather/Graphia.git.
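The macro-level evaluation mentions replication of power laws. As a rough illustration only, and not the metric defined in the paper, the sketch below estimates a power-law exponent from a list of node degrees using the standard continuous maximum-likelihood estimator, alpha = 1 + n / sum(ln(x_i / x_min)). The function name `powerlaw_alpha` and the default `xmin` are assumptions made for this example.

```python
import math


def powerlaw_alpha(degrees, xmin=1.0):
    """Estimate a power-law exponent with the continuous MLE.

    Hypothetical helper for illustration; the paper's own structural
    metrics may be computed differently.
    """
    # Only the tail at or above xmin is assumed to follow a power law.
    tail = [d for d in degrees if d >= xmin]
    log_sum = sum(math.log(d / xmin) for d in tail)
    if log_sum == 0:
        raise ValueError("need at least one degree strictly above xmin")
    return 1.0 + len(tail) / log_sum
```

Comparing the exponent estimated on a simulated graph with the one estimated on the real graph gives one simple signal of whether the degree distribution was reproduced.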
Problem

Research questions and friction points this paper is trying to address.

Leveraging social graph data to enhance LLM-based social simulation
Training specialized agents for destination selection (whom to interact with) and edge generation (how to interact)
Aligning agent behaviors with network dynamics through structural supervision
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses social graph data for LLM supervision
Trains agents via reinforcement learning with GNN rewards
Generates graphs with destination selection and edge generation
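The two-stage generation described above (pick a destination, then generate the edge) can be sketched as follows. This is a minimal sketch under stated assumptions, not Graphia's implementation: the names `Edge`, `select_destination`, `generate_edge`, and `simulate_step` are hypothetical, and both stages are stubbed with trivial logic where the paper uses RL-trained LLM agents.

```python
import random
from dataclasses import dataclass


@dataclass
class Edge:
    src: str
    dst: str
    content: str


def select_destination(src, candidates, rng):
    """Stage 1 stub: choose whom `src` interacts with.

    A random choice stands in for the trained destination-selection agent.
    """
    options = [c for c in candidates if c != src]  # no self-loops
    return rng.choice(options)


def generate_edge(src, dst):
    """Stage 2 stub: produce the edge content for the chosen pair.

    A template string stands in for the trained edge-generation agent.
    """
    return Edge(src, dst, f"{src} replies to {dst}")


def simulate_step(active_nodes, all_nodes, seed=0):
    """Run both stages once for every active source node."""
    rng = random.Random(seed)
    edges = []
    for src in active_nodes:
        dst = select_destination(src, all_nodes, rng)
        edges.append(generate_edge(src, dst))
    return edges
```

Calling `simulate_step(["a", "b"], ["a", "b", "c"])` returns one `Edge` per active node. Keeping the two stages as separate functions mirrors the split between the two specialized agents, so either stub could be replaced independently.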
Jiarui Ji
Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China
Zehua Zhang
Alimama Tech, Taobao & Tmall Group of Alibaba
Zhewei Wei
Renmin University of China
Graph Algorithms, Streaming Algorithms, AI4Science, AI4DB
Bin Tong
Alimama Tech, Taobao & Tmall Group of Alibaba
Guan Wang
Alimama Tech, Taobao & Tmall Group of Alibaba
Bo Zheng
Alimama Tech, Taobao & Tmall Group of Alibaba