Prompt Optimization Across Multiple Agents for Representing Diverse Human Populations

📅 2025-10-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) often produce homogeneous outputs when simulating human behavior, failing to capture population-level diversity. Method: This paper proposes a submodular optimization-based framework for constructing diverse multi-agent systems. Leveraging a small set of human demonstration data, it employs in-context learning and prompt engineering to elicit heterogeneous responses from LLMs, then efficiently selects the most representative subset of agents from an exponentially large candidate space. Contribution/Results: The core innovation lies in formalizing behavioral diversity as a submodular function maximization problem, enabling polynomial-time algorithms with theoretical approximation guarantees. Experiments in crowdsourcing and educational settings demonstrate that the approach significantly outperforms single-agent baselines and existing methods, achieving higher fidelity in both behavioral pattern reproduction and opinion distribution alignment.
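The selection step can be illustrated with the standard greedy algorithm for maximizing a monotone submodular function under a cardinality constraint, which attains the classic (1 - 1/e) approximation guarantee. This is a sketch: the coverage-style diversity objective and the agent names below are assumptions for illustration, not the paper's actual objective.

```python
# Illustrative sketch of greedy submodular selection of k agents.
# The coverage-based diversity function f and the candidate agents
# are hypothetical, not the paper's objective.

def greedy_select(candidates, f, k):
    """Greedily pick up to k items maximizing marginal gain.
    For monotone submodular f with f(set()) == 0, this achieves
    a (1 - 1/e) approximation to the optimal k-subset."""
    selected = set()
    for _ in range(k):
        gains = {c: f(selected | {c}) - f(selected)
                 for c in candidates - selected}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:       # no agent adds diversity; stop early
            break
        selected.add(best)
    return selected

# Hypothetical example: each candidate agent "covers" a subset of
# human-behavior clusters; diversity = number of clusters covered.
coverage = {
    "agent_a": {1, 2, 3},
    "agent_b": {3, 4},
    "agent_c": {4, 5},
}
f = lambda S: len(set().union(*(coverage[a] for a in S)) if S else set())
chosen = greedy_select(set(coverage), f, k=2)
print(sorted(chosen))  # → ['agent_a', 'agent_c'] (covers all 5 clusters)
```

Coverage functions like `f` are submodular (adding an agent to a larger set yields a smaller marginal gain), which is what licenses the greedy guarantee; the paper's contribution, per the summary, is casting behavioral diversity itself in this form.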

📝 Abstract
The difficulty and expense of obtaining large-scale human responses make Large Language Models (LLMs) an attractive alternative and a promising proxy for human behavior. However, prior work shows that LLMs often produce homogeneous outputs that fail to capture the rich diversity of human perspectives and behaviors. Thus, rather than trying to capture this diversity with a single LLM agent, we propose a novel framework to construct a set of agents that collectively capture the diversity of a given human population. Each agent is an LLM whose behavior is steered by conditioning on a small set of human demonstrations (task-response pairs) through in-context learning. The central challenge is therefore to select a representative set of LLM agents from the exponentially large space of possible agents. We tackle this selection problem through the lens of submodular optimization. In particular, we develop methods that offer different trade-offs between time complexity and performance guarantees. Extensive experiments in crowdsourcing and educational domains demonstrate that our approach constructs agents that more effectively represent human populations than baselines. Moreover, behavioral analyses on new tasks show that these agents reproduce the behavior patterns and perspectives of the students and annotators they are designed to represent.
Problem

Research questions and friction points this paper is trying to address.

Optimizing prompt selection for diverse agent representation
Addressing homogeneous LLM outputs lacking human diversity
Constructing representative agent sets via submodular optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multiple LLM agents represent human diversity
Agents conditioned on human demonstrations via in-context learning
Submodular optimization selects representative agent sets
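The conditioning step in the bullets above can be sketched as few-shot prompt assembly: each candidate agent is defined by the human demonstrations placed in its context. The prompt template, instruction wording, and demonstration format below are hypothetical, not the paper's exact design.

```python
# Hypothetical sketch of conditioning one LLM agent on a human's
# demonstrations (task-response pairs) via in-context learning.
# Template wording and field labels are assumptions.

def build_agent_prompt(demonstrations, new_task):
    """Assemble a few-shot prompt from one person's task-response
    pairs so the LLM imitates that person's behavior on a new task."""
    lines = ["Answer the final task in the style of the person "
             "whose responses are shown below.\n"]
    for task, response in demonstrations:
        lines.append(f"Task: {task}\nResponse: {response}\n")
    lines.append(f"Task: {new_task}\nResponse:")
    return "\n".join(lines)

demos = [
    ("Rate this movie review's sentiment (1-5).", "2"),
    ("Is this headline clickbait? (yes/no)", "yes"),
]
prompt = build_agent_prompt(demos, "Rate this product review's sentiment (1-5).")
```

Under this framing, the exponentially large agent space arises from the choice of which demonstration subset to place in each agent's context, and the submodular selection step picks which of these prompted agents to keep.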