🤖 AI Summary
This study investigates whether large language models (LLMs) can establish human-like shared communicative conventions with humans to enable efficient dialogue. Through a multimodal communication game, the authors compare convention formation across human–human, LLM–LLM, and human–LLM pairings, combining behavioral experiments with prompt manipulations and evaluating performance using natural language metrics such as lexical overlap, turn length, and task accuracy. The results provide systematic evidence that homogeneous pairs, both human–human and LLM–LLM, effectively develop shared conventions, whereas human–LLM pairs exhibit significantly lower accuracy and lexical consistency even when message lengths are matched. These findings suggest that merely mimicking human linguistic patterns is insufficient for achieving true alignment, underscoring the critical role of shared interpretive biases in establishing effective communicative conventions.
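To make the convention-formation metrics concrete, here is a minimal sketch of one way a lexical-overlap measure could be computed: Jaccard overlap between the word sets of consecutive messages, averaged over a dialogue. The function names and whitespace tokenization are illustrative assumptions, not the paper's actual implementation.

```python
from typing import List

def lexical_overlap(prev_msg: str, curr_msg: str) -> float:
    """Jaccard overlap between the word sets of two consecutive messages.

    Values near 1.0 indicate heavy vocabulary reuse (a sign of an
    emerging convention); values near 0.0 indicate little reuse.
    Whitespace tokenization is a simplifying assumption.
    """
    prev_words = set(prev_msg.lower().split())
    curr_words = set(curr_msg.lower().split())
    if not prev_words and not curr_words:
        return 0.0
    return len(prev_words & curr_words) / len(prev_words | curr_words)

def mean_overlap(dialogue: List[str]) -> float:
    """Average lexical overlap across consecutive turns of a dialogue."""
    if len(dialogue) < 2:
        return 0.0
    pairs = zip(dialogue, dialogue[1:])
    return sum(lexical_overlap(a, b) for a, b in pairs) / (len(dialogue) - 1)

# Hypothetical example: a dyad converging on the shorthand "red spiky one"
turns = [
    "the one that looks like a red starburst with spikes",
    "ok the red spiky one",
    "red spiky one again",
]
print(round(mean_overlap(turns), 3))  # 0.375
```

Under this kind of measure, overlap rising and message length falling over successive rounds would together signal convention formation of the sort the study reports for same-type dyads.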
📝 Abstract
Humans align with one another in conversation, adopting shared conventions that ease communication. We test whether LLMs form the same kinds of conventions in a multimodal communication game. Both humans and LLMs display evidence of convention formation (increasing the accuracy and consistency of their turns while decreasing their length) when communicating in same-type dyads (humans with humans, AI with AI). However, heterogeneous human-AI pairs fail, suggesting differences in communicative tendencies. In Experiment 2, we ask whether LLMs can be induced to behave more like human conversants by prompting them to produce superficially humanlike behavior. While the length of their messages matches that of human pairs, accuracy and lexical overlap in human-LLM pairs continue to lag behind those of both human-human and AI-AI pairs. These results suggest that conversational alignment requires more than the ability to mimic previous interactions; it also requires shared interpretative biases toward the meanings that are conveyed.