Success and Cost Elicit Convention Formation for Efficient Communication

📅 2025-10-27
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
This study asks how multimodal models can form stable, efficient communication conventions when interacting with people, something that training on a single objective such as success rate alone does not achieve. The authors train large multimodal models on simulated reference games between models, jointly optimizing communicative success and message cost, with no additional human-produced data. In repeated reference games with photographs and tangram images, the trained model reduces message length by up to 41% and increases success by 15% over the course of an interaction, and human listeners respond faster. The key finding is that success and cost must be optimized together: training on either one alone is insufficient to elicit convention formation.

📝 Abstract
Humans leverage shared conversational context to become increasingly successful and efficient at communicating over time. One manifestation of this is the formation of ad hoc linguistic conventions, which allow people to coordinate on short, less costly utterances that are understood using shared conversational context. We present a method to train large multimodal models to form conventions, enabling efficient communication. Our approach uses simulated reference games between models, and requires no additional human-produced data. In repeated reference games involving photographs and tangram images, our method enables models to communicate efficiently with people: reducing the message length by up to 41% while increasing success by 15% over the course of the interaction. Human listeners respond faster when interacting with our model that forms conventions. We also show that training based on success or cost alone is insufficient; both are necessary to elicit convention formation.
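
The claim that success and cost are both needed can be illustrated with a toy scoring function. This is a minimal sketch, not the paper's training objective: the function `utterance_reward`, the weights `success_weight` and `cost_weight`, the word-count cost, and the candidate messages are all assumptions made up for illustration.

```python
def utterance_reward(success: bool, message: str,
                     success_weight: float = 1.0,
                     cost_weight: float = 0.1) -> float:
    """Hypothetical reward: credit for a correct guess minus a per-word cost."""
    n_words = len(message.split())
    return success_weight * float(success) - cost_weight * n_words


# Made-up candidates: (message, whether the listener picked the right image).
candidates = [
    ("the man in the red jacket holding a small dog", True),  # long, works
    ("red jacket guy", True),                                  # short, works
    ("", False),                                               # free, fails
]


def best_message(success_weight: float, cost_weight: float) -> str:
    """Return the candidate message with the highest hypothetical reward."""
    return max(
        candidates,
        key=lambda c: utterance_reward(c[1], c[0], success_weight, cost_weight),
    )[0]


# Success only: the long and short successful messages score the same,
# so nothing pushes the speaker toward shorter utterances.
long_only = utterance_reward(True, candidates[0][0], cost_weight=0.0)
short_only = utterance_reward(True, candidates[1][0], cost_weight=0.0)

# Cost only: saying nothing is optimal, even though the listener fails.
cost_only_best = best_message(success_weight=0.0, cost_weight=0.1)

# Both terms: the short message that still works wins.
joint_best = best_message(success_weight=1.0, cost_weight=0.1)
```

Under these assumed weights, only the joint setting prefers the short utterance that is still understood, which mirrors the abstract's point that neither signal suffices alone.
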
Problem

Research questions and friction points this paper is trying to address.

Developing efficient communication through convention formation
Training models to reduce message length while increasing success
Using success and cost together to elicit convention formation
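
One way to quantify the two quantities above is the relative change between the first and last repetition of a game. The sketch below uses made-up numbers, not the paper's data, and the helper `relative_change` is hypothetical. The abstract does not say whether the reported 15% success gain is relative or in absolute points, so the relative form here is an assumption.

```python
from statistics import mean


def relative_change(first: float, last: float) -> float:
    """Change from `first` to `last` as a fraction of `first`."""
    return (last - first) / first


# Toy data (assumed): words per message and listener success (1 = correct)
# in the first and last repetition of a repeated reference game.
first_round_lengths = [10, 8, 12]
last_round_lengths = [6, 5, 7]
first_round_success = [1, 0, 1, 1]
last_round_success = [1, 1, 1, 1]

length_change = relative_change(mean(first_round_lengths),
                                mean(last_round_lengths))
success_change = relative_change(mean(first_round_success),
                                 mean(last_round_success))

print(f"message length change: {length_change:+.0%}")
print(f"success change: {success_change:+.0%}")
```
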
Innovation

Methods, ideas, or system contributions that make the work stand out.

Trains models via simulated reference games
Uses success and cost metrics for convention formation
Enables efficient communication without human data
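
The structure of a repeated reference game can be sketched with a rule-based toy. Everything here is assumed for illustration: the paper trains large multimodal models, whereas this speaker just halves a message that worked last time, and the listener matches words against fixed text descriptions. The names `listener_guess`, `play_repeated_game`, and the two image descriptions are hypothetical.

```python
def listener_guess(message: str, descriptions: dict[str, str]) -> str:
    """Pick the image whose description shares the most words with the message."""
    words = set(message.split())
    return max(
        descriptions,
        key=lambda name: len(words & set(descriptions[name].split())),
    )


def play_repeated_game(descriptions: dict[str, str], n_rounds: int = 3):
    """Refer to every image once per round, shortening messages that worked."""
    last_success: dict[str, str] = {}  # target -> message understood last time
    history = []                       # (round, target, message, success)
    for round_idx in range(n_rounds):
        for target in descriptions:
            if target in last_success:
                # Toy convention: keep the first half of the last working message.
                words = last_success[target].split()
                message = " ".join(words[: max(1, len(words) // 2)])
            else:
                # No convention yet (or it failed): fall back to the full description.
                message = descriptions[target]
            success = listener_guess(message, descriptions) == target
            if success:
                last_success[target] = message
            else:
                last_success.pop(target, None)
            history.append((round_idx, target, message, success))
    return history


# Stand-ins for two tangram images, described in words (assumed).
toy_descriptions = {
    "img_a": "dancer kicking one leg up high",
    "img_b": "bird sitting on flat rock",
}
history = play_repeated_game(toy_descriptions)
for round_idx, target, message, success in history:
    print(round_idx, target, repr(message), success)
```

In this toy run the messages shrink from full descriptions to single words while the listener keeps guessing correctly, which is the qualitative pattern the summary describes.
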