🤖 AI Summary
This study addresses the fragmentation of prediction markets caused by the absence of a unified definition of event identity, which disperses liquidity, hinders arbitrage, and produces systematic violations of the law of one price, thereby impairing information aggregation. The work identifies “semantic non-fungibility” as a fundamental barrier to price convergence and constructs the first human-validated, cross-platform alignment dataset of prediction market events, covering over 100,000 events across ten major platforms from 2018 to 2025. Its semantic alignment framework makes event identity explicit by jointly analyzing natural-language event descriptions, resolution semantics, and temporal scope. Empirical analysis shows that roughly 6% of events are listed concurrently on multiple platforms and that semantically equivalent markets exhibit persistent execution-aware price deviations of 2–4% on average. The resulting arbitrage opportunities are driven by structural frictions rather than informational disagreement.
📝 Abstract
Prediction markets are designed to aggregate dispersed information about future events, yet today's ecosystem is fragmented across heterogeneous operator-run platforms and blockchain-based protocols that independently list economically identical events. In the absence of a shared notion of event identity, liquidity fails to pool across venues, arbitrage becomes capital-intensive or unenforceable, and prices systematically violate the Law of One Price. As a result, market prices reflect platform-local beliefs rather than a single, globally aggregated probability, undermining the core information-aggregation function of prediction markets. We address this gap by introducing a semantic alignment framework that makes cross-platform event identity explicit through joint analysis of natural-language descriptions, resolution semantics, and temporal scope. Applying this framework, we construct the first human-validated, cross-platform dataset of aligned prediction markets, covering over 100,000 events across ten major venues from 2018 to 2025. Using this dataset, we show that roughly 6% of all events are concurrently listed across platforms and that semantically equivalent markets exhibit persistent execution-aware price deviations of 2–4% on average, even in highly liquid and information-rich settings. These mispricings give rise to persistent cross-platform arbitrage opportunities driven by structural frictions rather than informational disagreement. Overall, our results demonstrate that semantic non-fungibility is a fundamental barrier to price convergence, and that resolving event identity is a prerequisite for prediction markets to aggregate information at a global scale.
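To make the Law of One Price violation concrete, the following is a minimal sketch of how a cross-venue gap could be measured for a binary market, assuming the two listings have already been judged semantically equivalent. It is not the paper's metric: the `Quote` type, the proportional `fee_rate`, and the `arbitrage_edge` function are all hypothetical names introduced for illustration, and the paper's "execution-aware" deviation may account for frictions that this sketch ignores (order-book depth, settlement delay, capital lock-up).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    """Best executable prices for one binary market, per $1 of payout.

    All fields are illustrative assumptions, not a schema from the paper.
    """
    venue: str
    yes_ask: float   # cost to buy one YES share
    no_ask: float    # cost to buy one NO share
    fee_rate: float  # assumed proportional fee on the purchase cost


def cost_with_fee(price: float, fee_rate: float) -> float:
    """Total cost of buying one share, including the assumed fee."""
    return price * (1.0 + fee_rate)


def arbitrage_edge(a: Quote, b: Quote) -> float:
    """Profit per $1 payout from holding YES on one venue and NO on the other.

    If the two markets resolve identically, a YES share plus a NO share
    pays exactly $1. A positive return value therefore means the pair can
    be bought for less than its guaranteed payout, which is a violation of
    the Law of One Price after fees.
    """
    cost_yes_a_no_b = cost_with_fee(a.yes_ask, a.fee_rate) + cost_with_fee(b.no_ask, b.fee_rate)
    cost_yes_b_no_a = cost_with_fee(b.yes_ask, b.fee_rate) + cost_with_fee(a.no_ask, a.fee_rate)
    return 1.0 - min(cost_yes_a_no_b, cost_yes_b_no_a)


# Made-up quotes for the same event on two hypothetical venues.
venue_a = Quote(venue="A", yes_ask=0.60, no_ask=0.42, fee_rate=0.0)
venue_b = Quote(venue="B", yes_ask=0.65, no_ask=0.37, fee_rate=0.0)
edge = arbitrage_edge(venue_a, venue_b)
print(f"edge per $1 payout: {edge:.4f}")
```

With these made-up quotes, buying YES on venue A and NO on venue B costs 0.97 for a guaranteed payout of 1, so the edge is about 3 cents per dollar. The calculation is only meaningful when both markets truly resolve on the same outcome, which is why event identity has to be settled before prices can be compared at all.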