🤖 AI Summary
This work investigates the coordination capabilities of large language model (LLM) agents in multi-user collaborative information gathering, a setting where the information needed to complete a task is inherently distributed across users. The authors introduce PeopleJoin, a benchmark comprising two tasks: table-based question answering (PeopleJoin-QA) and collaborative document generation (PeopleJoin-DocCreation). Both require agents to identify suitable teammates, retrieve fragmented information through cross-user dialogue, and synthesize coherent answers or structured documents. Methodologically, the approach extends ReAct and Plan-and-Execute frameworks with role-aware prompting, multi-step tool invocation, and explicit cross-user message routing. Empirical evaluation reveals a substantial performance drop when existing agents must coordinate across users, and systematically uncovers three fundamental challenges: “team discovery,” “intent alignment,” and “information fusion.” The work establishes the first dedicated benchmark and problem formulation for studying LLM-agent collaboration in distributed-information settings, advancing research on collaborative reasoning and organizational-scale information integration.
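The distributed-information setting can be made concrete with a toy sketch. Below is a minimal, hypothetical simulation of the PeopleJoin-QA-style loop (the user names, fields, and helper functions are illustrative assumptions, not the benchmark's actual API): each user privately holds a fragment of a table, and an agent must discover who holds which field, gather values via simulated messages, and fuse them into an answer.

```python
# Hypothetical sketch of the distributed-information setting (names are
# illustrative assumptions, not the benchmark's actual API): facts needed
# to answer a question are split across users, and an agent must route
# messages to the right teammates to collect them.

# Each user privately holds a fragment of a larger table.
users = {
    "alice": {"project_lead": {"Apollo": "Bo"}},
    "bo":    {"budget":       {"Apollo": 120_000}},
    "cara":  {"deadline":     {"Apollo": "2025-06-01"}},
}

def ask(user, field, key):
    """Simulate one turn of cross-user dialogue: request one field value."""
    return users[user].get(field, {}).get(key)

def answer(question_fields, key):
    """Agent loop: discover which teammate holds each field, then fuse."""
    gathered = {}
    for field in question_fields:
        for user in users:              # naive "team discovery" by polling
            value = ask(user, field, key)
            if value is not None:
                gathered[field] = value  # "information fusion" step
                break
    return gathered

result = answer(["budget", "deadline"], "Apollo")
```

Even this toy version surfaces the three challenges the paper names: polling every user for every field is the worst-case team-discovery strategy, and a real agent must instead reason about who is likely to know what before sending messages.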
📝 Abstract
This paper introduces PeopleJoin, a benchmark for evaluating LM-mediated collaborative problem solving. Given a user request, PeopleJoin agents must identify teammates who might be able to assist, converse with these teammates to gather information, and finally compile a useful answer or summary for the original user. PeopleJoin comprises two evaluation domains: PeopleJoin-QA, focused on questions about tabular data, and PeopleJoin-DocCreation, focused on document creation tasks. The two domains are adapted from existing NLP benchmarks for database question answering and multi-document summarization; here, however, the information needed to complete these tasks is distributed across synthetic “organizations” of 2–20 users, simulating natural multi-user collaboration scenarios. We implement several popular LM agent architectures, evaluate their accuracy and efficiency at completing tasks, and highlight new research questions that can be studied using PeopleJoin.