🤖 AI Summary
This study investigates the multi-agent collaboration capabilities of large language model (LLM) agents under information asymmetry, i.e., heterogeneous distributions of knowledge and skills. To this end, we extend Einstein Puzzles into a tabletop environment in which two agents must jointly satisfy spatial and relational constraints through reasoning, natural-language communication, and physical actions. We propose a fine-tuning-plus-verifier framework that introduces an aligned bidirectional communication protocol and integrates an interpretable, rule-grounded environmental feedback mechanism to improve task-rule comprehension and behavioral verifiability. Experiments show that our framework significantly improves collaborative success rates and human trust: compared with a no-communication baseline, bidirectional communication yields better performance, deeper internalization of task rules, and greater behavioral explainability, while further integrating the environment verifier strengthens safety guarantees and fosters trustworthy, verifiable coordination.
📝 Abstract
While Large Language Model (LLM) agents are often approached from the angle of action planning/generation to accomplish a goal (e.g., given by language descriptions), their abilities to collaborate with each other to achieve a joint goal are not well explored. To address this limitation, this paper studies LLM agents in task collaboration, particularly under the condition of information asymmetry, where agents have disparities in their knowledge and skills and need to work together to complete a shared task. We extend Einstein Puzzles, a classical symbolic puzzle, to a table-top game. In this game, two LLM agents must reason, communicate, and act to satisfy spatial and relational constraints required to solve the puzzle. We apply a fine-tuning-plus-verifier framework in which LLM agents are equipped with various communication strategies and verification signals from the environment. Empirical results highlight the critical importance of aligned communication, especially when agents possess both information-seeking and information-providing capabilities. Interestingly, agents without communication can still achieve high task performance; however, further analysis reveals a lack of true rule understanding and lower trust from human evaluators. In contrast, by integrating an environment-based verifier, we enhance agents' ability to comprehend task rules and complete tasks, promoting both safer and more interpretable collaboration in AI systems. Code: https://github.com/Roihn/EinsteinPuzzles
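To make the idea of rule-grounded environmental feedback concrete, here is a minimal sketch of what an environment-based verifier for an Einstein-puzzle-style tabletop task could look like. The constraint format, relation names, and function signature are illustrative assumptions for this sketch, not the paper's actual implementation.

```python
# Toy sketch of an environment-based verifier for spatial/relational
# constraints on a one-dimensional tabletop. All names and the constraint
# format are hypothetical; they are not taken from the paper's codebase.

def verify(placements, constraints):
    """Check constraints against object positions and return feedback.

    placements: dict mapping object name -> integer slot on the table.
    constraints: list of (relation, a, b) tuples.
    Returns a list of human-readable messages for violated constraints,
    giving agents interpretable, rule-grounded feedback.
    """
    violations = []
    for relation, a, b in constraints:
        pa, pb = placements[a], placements[b]
        if relation == "left_of" and not pa < pb:
            violations.append(f"{a} must be left of {b}")
        elif relation == "adjacent" and abs(pa - pb) != 1:
            violations.append(f"{a} must be adjacent to {b}")
    return violations

placements = {"cup": 0, "book": 2, "lamp": 1}
constraints = [("left_of", "cup", "book"), ("adjacent", "book", "lamp")]
print(verify(placements, constraints))  # → [] (all constraints satisfied)
```

A verifier of this shape returns explanations rather than a bare pass/fail signal, which matches the paper's emphasis on interpretable feedback that helps agents internalize task rules.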