🤖 AI Summary
Problem: The dynamic interactions among participants, data, and regulations in data marketplaces are not yet systematically understood.
Method: This paper proposes a large language model (LLM)-based multi-agent simulation framework (LLM-MAS) that goes beyond traditional rule-based modeling. Buyer and seller agents pursue explicit objectives and autonomously plan, search, purchase, price, and update data. Through natural language reasoning, the agents reason about market dynamics, forecast future demand, and adjust their strategies accordingly, so that realistic trading patterns emerge from their interactions rather than from predefined rules.
Contribution/Results: Simulation experiments evaluate the framework on three distribution-based metrics: the number of purchases per dataset, the number of purchases per buyer, and the number of repeated purchases of the same dataset. On these metrics, the framework reproduces trading patterns observed in real data marketplaces more faithfully than traditional approaches, and it also captures the emergence and evolution of market trends.
📝 Abstract
Data marketplaces, which mediate the purchase and exchange of data from third parties, have attracted growing attention for reducing the cost and effort of data collection while enabling the trading of diverse datasets. However, a systematic understanding of the interactions between market participants, data, and regulations remains limited. To address this gap, we propose a Large Language Model-based Multi-Agent System (LLM-MAS) for data marketplaces. In our framework, buyer and seller agents powered by LLMs operate with explicit objectives and autonomously perform strategic actions, such as planning, searching, purchasing, pricing, and updating data. These agents can reason about market dynamics, forecast future demand, and adjust strategies accordingly. Unlike conventional model-based simulations, which are typically constrained to predefined rules, LLM-MAS supports broader and more adaptive behavior selection through natural language reasoning. We evaluated the framework via simulation experiments using three distribution-based metrics: (1) the number of purchases per dataset, (2) the number of purchases per buyer, and (3) the number of repeated purchases of the same dataset. The results demonstrate that LLM-MAS more faithfully reproduces trading patterns observed in real data marketplaces compared to traditional approaches, and further captures the emergence and evolution of market trends.
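The three distribution-based metrics named in the abstract can be illustrated with a small sketch. It is not the authors' evaluation code: the purchase-log format, the sample data, and all function names are hypothetical, and "repeated purchases" is assumed here to mean purchases of a dataset by the same buyer beyond the first.

```python
from collections import Counter

# Hypothetical purchase log: one (buyer_id, dataset_id) tuple per transaction.
purchases = [
    ("b1", "d1"), ("b1", "d1"), ("b1", "d2"),
    ("b2", "d1"),
    ("b3", "d3"), ("b3", "d3"), ("b3", "d3"),
]

def purchases_per_dataset(log):
    """Metric (1): number of purchases of each dataset."""
    return Counter(dataset for _, dataset in log)

def purchases_per_buyer(log):
    """Metric (2): number of purchases made by each buyer."""
    return Counter(buyer for buyer, _ in log)

def repeat_purchases(log):
    """Metric (3), under the assumption stated above: for each
    (buyer, dataset) pair, the number of purchases beyond the first."""
    pair_counts = Counter(log)
    return {pair: n - 1 for pair, n in pair_counts.items() if n > 1}

def distribution(counts):
    """Histogram of count values: how many entities share each count."""
    return dict(sorted(Counter(counts.values()).items()))

per_dataset = purchases_per_dataset(purchases)
per_buyer = purchases_per_buyer(purchases)
repeats = repeat_purchases(purchases)

print(distribution(per_dataset))
print(distribution(per_buyer))
print(distribution(repeats))
```

A simulated market and a real one could then be compared by computing these histograms on both logs and measuring how far apart they are; the choice of distance measure is left open here because the abstract does not specify one.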