🤖 AI Summary
This work addresses a limitation of existing LLM-based simulations of community residents: they typically rely solely on coarse-grained demographic data and therefore fail to capture residents' nuanced perspectives. To overcome this, the study introduces a high-fidelity simulation dataset built from in-depth life-history narratives of 92 real residents. It further proposes curriculum-LoRA, a personalized fine-tuning algorithm that integrates curriculum learning with parameter-efficient LoRA adaptation and multi-strategy prompt engineering. The method achieves simulation fidelity comparable or superior to the strongest baseline while reducing per-call inference cost to approximately one-tenth. It Pareto-dominates every configuration evaluated and has been integrated into a closed-loop policy-evaluation system.
📝 Abstract
Effective community governance hinges on understanding what specific residents think and need. Recent work has used large language models (LLMs) to simulate human respondents, offering a scalable, reproducible, and low-cost way to study human attitudes and behaviors. However, these studies typically prompt the model with just a few demographic variables (age, gender, income) and therefore simulate only general role types. This is insufficient for community governance, where decisions depend on the views of specific residents. We bridge this gap with an integrated research framework covering dataset, benchmark, algorithm, and system. The dataset comprises approximately 1.2 million characters of first-person narrative, collected through two-hour semi-structured interviews with each of 92 residents of an urban community and organized around nine community-governance domains. The benchmark probes 18 mainstream LLMs across four prompting strategies and shows that adding rich life-history profiles meaningfully raises fidelity above the no-profile baseline, but the longer prompts these profiles require increase the number of input tokens per call. The algorithm, curriculum-LoRA, is a parameter-efficient personalization framework that closes this fidelity-cost gap: it matches the strongest baseline's fidelity at roughly 10x lower per-call cost and Pareto-dominates every configuration tested. The system integrates curriculum-LoRA into a closed-loop policy-evaluation pipeline. Together, these results bring individual-level LLM-based resident simulation within reach of resource-constrained local administrations, enabling community-governance decisions to be systematically pre-evaluated in silico before real-world deployment.
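To make the Pareto-dominance claim concrete, the following minimal sketch shows how dominance can be checked over two objectives, fidelity (higher is better) and per-call cost (lower is better). The `Config` type, the configuration names, and all numbers are hypothetical placeholders chosen only for illustration; they are not results from the paper, and the paper's actual fidelity metric and cost accounting may differ.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    """One evaluated configuration (hypothetical structure)."""
    name: str
    fidelity: float  # higher is better
    cost: float      # per-call cost, lower is better


def dominates(a: Config, b: Config) -> bool:
    """True if a is at least as good as b on both objectives
    and strictly better on at least one."""
    no_worse = a.fidelity >= b.fidelity and a.cost <= b.cost
    strictly_better = a.fidelity > b.fidelity or a.cost < b.cost
    return no_worse and strictly_better


def pareto_front(configs: List[Config]) -> List[Config]:
    """Return the configurations that no other configuration dominates."""
    return [
        c for c in configs
        if not any(dominates(other, c) for other in configs if other is not c)
    ]


# Illustrative values only (NOT taken from the paper).
no_profile = Config("no-profile prompt", fidelity=0.50, cost=1.0)
full_profile = Config("full-profile prompt", fidelity=0.70, cost=10.0)
candidate = Config("fine-tuned candidate", fidelity=0.70, cost=1.0)

configs = [no_profile, full_profile, candidate]
front = pareto_front(configs)
```

With these placeholder values, the candidate ties the full-profile prompt on fidelity at lower cost and beats the no-profile prompt on fidelity at equal cost, so it is the only point left on the Pareto front. This is the shape of the claim made in the abstract, not a reproduction of its numbers.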