🤖 AI Summary
This study addresses the frequent disputes in Web3 prediction markets, which stem from imperfect market mechanisms, and examines large language models (LLMs) as an efficient aid to decentralized arbitration. For the first time, it introduces LLMs with web retrieval capabilities into on-chain dispute resolution, using real-world data from Polymarket and the UMA protocol. The work evaluates whether LLMs can reproduce the outcomes of UMA’s on-chain voting and whether they can predict in advance which events will become disputed. Experimental results show that LLMs cannot reliably anticipate disputes beforehand, but in post-hoc adjudication they reach a stable 89.58% agreement rate with UMA’s final rulings. These findings suggest that integrating LLMs as decision-support tools in decentralized governance frameworks is feasible.
📝 Abstract
Web3 prediction markets, exemplified by Polymarket, have gained prominence for leveraging collective intelligence to forecast a wide range of social, political, and sports events. However, across thousands of prediction market events, consensus disputes still arise because of imperfections in market mechanisms. On Polymarket alone, the trading volume involving disputed events has reached $972,370,804.71, underscoring the critical need for objective and efficient dispute resolution. In this study, we use large language models (LLMs) for two tasks: (1) evaluating whether web-enabled LLMs can reproduce the decision quality of UMA's on-chain voting process once a dispute has been raised, and (2) predicting, based on event rules, which market events are likely to face disputes before those disputes occur. Our findings show that LLMs cannot reliably predict in advance which events will become disputed; however, once a dispute is initiated, web-enabled LLMs achieve 89.58% agreement with UMA's final resolutions and demonstrate strong stability.