🤖 AI Summary
This work proposes PolicySim, a large language model (LLM)-based social simulation sandbox designed to proactively evaluate the societal risks of platform intervention policies, such as recommendation algorithms and content filtering, that may exacerbate filter bubbles and polarization. Unlike existing approaches that rely on post-hoc A/B testing, PolicySim introduces, for the first time in LLM-driven simulations, a dynamic feedback mechanism that models platform interventions through a contextual bandit with message passing, capturing evolving user networks. User agents are trained via supervised fine-tuning (SFT) and direct preference optimization (DPO) to respond adaptively to interventions. The framework significantly improves simulation fidelity and operational utility at both the micro-behavioral and macro-ecological levels, outperforming current simulation paradigms and enabling prospective assessment and refinement of intervention strategies.
📝 Abstract
Social platforms serve as central hubs for information exchange, where user behaviors and platform interventions jointly shape opinions. However, intervention policies such as recommendation and content filtering can unintentionally amplify echo chambers and polarization, posing significant societal risks. Proactively evaluating the impact of such policies is therefore crucial. Existing approaches rely primarily on reactive online A/B testing, which identifies risks only after deployment, making mitigation delayed and costly. LLM-based social simulations offer a promising pre-deployment alternative, but current methods fall short in realistically modeling platform interventions and in incorporating feedback from the platform. Bridging these gaps is essential for building actionable frameworks to assess and optimize platform policies. To this end, we propose PolicySim, an LLM-based social simulation sandbox for the proactive assessment and optimization of intervention policies. PolicySim models the bidirectional dynamics between user behavior and platform interventions through two key components: (1) a user agent module refined via supervised fine-tuning (SFT) and direct preference optimization (DPO) to achieve platform-specific behavioral realism; and (2) an adaptive intervention module that employs a contextual bandit with message passing to capture dynamic network structures. Experiments show that PolicySim accurately simulates platform ecosystems at both micro and macro levels and supports effective intervention policy optimization.
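The abstract's second component pairs a contextual bandit with message passing over the user graph. As a rough illustration of how these two pieces can fit together, the sketch below uses a LinUCB-style bandit whose per-user context is enriched by one round of mean aggregation over the follower graph. All class names, feature shapes, the choice of LinUCB, and the reward signal are illustrative assumptions, not PolicySim's actual implementation.

```python
import numpy as np

class InterventionBandit:
    """LinUCB-style contextual bandit: one arm per candidate intervention."""
    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha
        # One ridge-regression model (A, b) per arm.
        self.A = [np.eye(dim) for _ in range(n_arms)]
        self.b = [np.zeros(dim) for _ in range(n_arms)]

    def select(self, context):
        # UCB score: estimated reward plus exploration bonus.
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b
            bonus = self.alpha * np.sqrt(context @ A_inv @ context)
            scores.append(theta @ context + bonus)
        return int(np.argmax(scores))

    def update(self, arm, context, reward):
        self.A[arm] += np.outer(context, context)
        self.b[arm] += reward * context

def message_pass(user_feats, adj):
    # One round of mean aggregation over the user graph, so each user's
    # context also reflects its (evolving) neighborhood.
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    neighbor_mean = adj @ user_feats / deg
    return np.concatenate([user_feats, neighbor_mean], axis=1)

# Toy usage: 4 users, 3-dim features, 2 candidate interventions.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 3))
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)
contexts = message_pass(feats, adj)           # shape (4, 6)
bandit = InterventionBandit(n_arms=2, dim=6)
for u in range(4):
    arm = bandit.select(contexts[u])
    reward = rng.random()                     # stand-in for simulated user feedback
    bandit.update(arm, contexts[u], reward)
```

In the full system the reward would come from the simulated user agents' responses, closing the bidirectional feedback loop the abstract describes.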