Why Keep Your Doubts to Yourself? Trading Visual Uncertainties in Multi-Agent Bandit Systems

📅 2026-01-26
📈 Citations: 2
Influential: 0
🤖 AI Summary
This work addresses the high coordination costs and low efficiency that multi-agent vision systems incur under information asymmetry. Existing approaches worsen these problems because they collapse the structure of uncertainty and ignore cost. The authors propose Agora, a framework that formalizes epistemic uncertainty as a tradable asset and establishes a decentralized uncertainty market in which agents are economically incentivized to exchange uncertainty at the perceptual, semantic, and inferential levels. A market-aware broker, built on multi-armed bandits with Thompson Sampling, initiates collaboration among vision-language model agents. Across five multimodal benchmarks, Agora outperforms strong VLMs and heuristic multi-agent strategies, for example gaining +8.5% accuracy over the best baseline on MMMU while reducing cost by over 3x.
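
The idea of treating structured uncertainty as a tradable asset can be illustrated with a minimal sketch. This is not the paper's implementation: the `Uncertainty` record, the `expected_reduction` helper, and the parameters `value_per_unit` and `call_cost` are all hypothetical names introduced here, and the profitability rule is a simple assumed stand-in for whatever economic rules Agora actually enforces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Uncertainty:
    """Hypothetical three-part uncertainty record, each value in [0, 1]."""
    perceptual: float
    semantic: float
    inferential: float


def expected_reduction(holder: Uncertainty, expert: Uncertainty) -> float:
    """Uncertainty removed if the expert takes over every component
    on which it is less uncertain than the current holder."""
    return (
        max(0.0, holder.perceptual - expert.perceptual)
        + max(0.0, holder.semantic - expert.semantic)
        + max(0.0, holder.inferential - expert.inferential)
    )


def is_profitable(holder: Uncertainty, expert: Uncertainty,
                  value_per_unit: float, call_cost: float) -> bool:
    """Assumed rule: trade only when the value of the removed
    uncertainty exceeds the cost of calling the expert."""
    return value_per_unit * expected_reduction(holder, expert) > call_cost


holder = Uncertainty(perceptual=0.6, semantic=0.2, inferential=0.5)
expert = Uncertainty(perceptual=0.1, semantic=0.3, inferential=0.4)
print(is_profitable(holder, expert, value_per_unit=1.0, call_cost=0.3))
```

Keeping the three components separate, rather than collapsing them into one scalar, lets the rule credit the expert only for the components it actually improves.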

📝 Abstract
Vision-Language Models (VLMs) enable powerful multi-agent systems, but scaling them is economically unsustainable: coordinating heterogeneous agents under information asymmetry often causes costs to spiral. Existing paradigms, such as Mixture-of-Agents and knowledge-based routers, rely on heuristic proxies that ignore costs and collapse uncertainty structure, leading to provably suboptimal coordination. We introduce Agora, a framework that reframes coordination as a decentralized market for uncertainty. Agora formalizes epistemic uncertainty into a structured, tradable asset (perceptual, semantic, inferential), and enforces profitability-driven trading among agents based on rational economic rules. A market-aware broker, extending Thompson Sampling, initiates collaboration and guides the system toward cost-efficient equilibria. Experiments on five multimodal benchmarks (MMMU, MMBench, MathVision, InfoVQA, CC-OCR) show that Agora outperforms strong VLMs and heuristic multi-agent strategies, e.g., achieving +8.5% accuracy over the best baseline on MMMU while reducing cost by over 3x. These results establish market-based coordination as a principled and scalable paradigm for building economically viable multi-agent visual intelligence systems.
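
The abstract does not specify how the broker extends Thompson Sampling. As a rough sketch under stated assumptions, one cost-aware variant samples each agent's success probability from a Beta posterior and subtracts a weighted per-call cost before picking the best agent. The class name `CostAwareThompsonBroker`, the `cost_weight` parameter, and the binary success reward are all hypothetical and are not taken from the paper.

```python
import random


class CostAwareThompsonBroker:
    """Illustrative Beta-Bernoulli Thompson Sampling with a cost penalty.

    An assumed simplification, not Agora's actual broker.
    """

    def __init__(self, costs, cost_weight=1.0, seed=None):
        self.costs = list(costs)              # assumed per-call cost of each agent
        self.cost_weight = cost_weight        # hypothetical accuracy/cost trade-off
        self.successes = [1] * len(self.costs)  # Beta(1, 1) uniform prior
        self.failures = [1] * len(self.costs)
        self.rng = random.Random(seed)

    def select(self):
        """Sample a success rate per agent, penalize by cost, return the best index."""
        scores = []
        for i, cost in enumerate(self.costs):
            sampled = self.rng.betavariate(self.successes[i], self.failures[i])
            scores.append(sampled - self.cost_weight * cost)
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, agent, correct):
        """Update the chosen agent's posterior with a binary outcome."""
        if correct:
            self.successes[agent] += 1
        else:
            self.failures[agent] += 1


broker = CostAwareThompsonBroker(costs=[0.05, 0.30], cost_weight=1.0, seed=0)
chosen = broker.select()
broker.update(chosen, correct=True)
```

Because the penalty is applied to the sampled value rather than to the posterior itself, exploration still comes from posterior sampling, while an expensive agent is chosen only when its sampled accuracy advantage outweighs its cost.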
Problem

Research questions and friction points this paper is trying to address.

multi-agent systems
visual uncertainty
information asymmetry
cost-efficient coordination
vision-language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

uncertainty trading
multi-agent coordination
vision-language models
market-based AI
epistemic uncertainty
Authors
Jusheng Zhang
Sun Yat-sen University
Yijia Fan
Sun Yat-sen University
Kaitong Cai
Sun Yat-sen University
Jing Yang
Sun Yat-sen University
Jiawei Yao
Ph.D. Student, University of Washington
Machine Learning, Large Language Model
Jian Wang
Snap Inc.
Computer vision, signal processing
Guanlong Qu
Syracuse University
Ziliang Chen
AP, Pengcheng Lab
Machine learning, Foundation Models, Multimodal Embodied Intelligence
Keze Wang
Sun Yat-sen University