Revenue-Optimal Pricing for Budget-Constrained Buyers in Data Markets

📅 2026-02-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses revenue-optimal pricing in data markets with rational, budget-constrained buyers. Buyers acquire data bundles to improve the accuracy of their prediction tasks, and for each dataset the market sets a pricing function that maps the number of records purchased to a non-negative price. When pricing functions are only required to be monotone and lower-continuous, the optimal pricing is piecewise linear and convex, can be computed in polynomial time via an LP, and has a total number of kinks across all pricing functions bounded by the number of buyers. Linear pricing, despite its apparent simplicity, is shown to be APX-hard, so a striking computational dichotomy emerges between the general and the linear case. To address this hardness, the paper develops a 2-approximation algorithm for the setting where datasets arrive online and a $(1-1/e)^{-1}$-approximation algorithm for the offline setting, laying groundwork for pricing theory in data markets.
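
For a sense of scale, the two approximation factors quoted above can be compared numerically. The reading below assumes the usual convention for maximization problems, in which an $\alpha$-approximation with $\alpha \ge 1$ returns revenue at least $1/\alpha$ of the optimum; the summary itself does not spell out the convention.

$$(1-1/e)^{-1} = \frac{e}{e-1} \approx 1.582 < 2, \qquad 1 - \frac{1}{e} \approx 0.632 > \frac{1}{2}.$$

Under that assumption, the offline algorithm guarantees roughly 63.2% of the optimal linear-pricing revenue, while the online algorithm guarantees at least 50%.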

📝 Abstract

We study revenue-optimal pricing in data markets with rational, budget-constrained buyers. Such a market offers multiple datasets for sale, and buyers aim to improve the accuracy of their prediction tasks by acquiring data bundles. For each dataset, the market sets a pricing function, which maps the number of records purchased from the dataset to a non-negative price. The market's objective is to set these pricing functions to maximize total revenue, considering that buyers with quasi-linear utilities choose their bundles optimally under budget constraints. We analyze optimal pricing when each dataset's pricing function is only required to be monotone and lower-continuous. Surprisingly, even with this generality, optimal pricing has a highly structured form: it is piecewise linear and convex (PLC) and can be computed efficiently via an LP. Moreover, the total number of kinks across all pricing functions is bounded by the number of buyers. Thus, when datasets far outnumber buyers, most pricing functions are effectively linear. This motivates studying linear pricing, where each record in a dataset is priced uniformly. Although competitive equilibrium gives revenue-optimal linear prices in rivalrous markets with quasi-linear buyers, we show that revenue maximization under linear pricing in data markets is APX-hard. Hence, a striking computational dichotomy emerges: fully general (nonlinear) pricing admits a polynomial-time algorithm, while the simpler linear scheme is APX-hard. Despite the hardness, we design a 2-approximation algorithm when datasets arrive online, and a $(1-1/e)^{-1}$-approximation algorithm for the offline setting. Our framework lays the groundwork for exploring more general pricing schemes, richer utility models, and a deeper understanding of how market structure -- rivalrous versus non-rivalrous -- shapes revenue-optimal pricing.
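
To make the piecewise linear and convex (PLC) structure concrete, here is a minimal Python sketch of one dataset's pricing function. It is an illustration under stated assumptions, not the paper's LP or algorithm: the class name `PLCPrice`, its fields `breakpoints` and `slopes`, the choice of price 0 for zero records, and all numbers are hypothetical. It shows how such a function is evaluated, how its kinks are counted, and how linear pricing appears as the special case with no kinks.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PLCPrice:
    """Hypothetical PLC price schedule for one dataset, with price(0) = 0.

    breakpoints: record counts at which the per-record price changes.
    slopes: per-record price on each segment (one more than breakpoints).
    """
    breakpoints: Tuple[float, ...]
    slopes: Tuple[float, ...]

    def __post_init__(self) -> None:
        if len(self.slopes) != len(self.breakpoints) + 1:
            raise ValueError("need exactly one more slope than breakpoints")
        if any(s < 0 for s in self.slopes):
            raise ValueError("a negative slope would break monotonicity")
        if any(a > b for a, b in zip(self.slopes, self.slopes[1:])):
            raise ValueError("decreasing slopes would break convexity")
        prev = 0.0
        for b in self.breakpoints:
            if b <= prev:
                raise ValueError("breakpoints must be positive and increasing")
            prev = b

    def price(self, records: float) -> float:
        """Total price for buying `records` records from this dataset."""
        if records < 0:
            raise ValueError("records must be non-negative")
        total, start = 0.0, 0.0
        # Walk the segments left to right, paying each segment's slope.
        for end, slope in zip(self.breakpoints, self.slopes):
            if records <= end:
                return total + slope * (records - start)
            total += slope * (end - start)
            start = end
        # Past the last breakpoint: the final slope applies.
        return total + self.slopes[-1] * (records - start)

    def kinks(self) -> int:
        """Number of points where the per-record price strictly increases."""
        return sum(1 for a, b in zip(self.slopes, self.slopes[1:]) if b > a)


# Illustrative schedules (made-up numbers).
tiered = PLCPrice(breakpoints=(10.0, 20.0), slopes=(1.0, 2.0, 4.0))
linear = PLCPrice(breakpoints=(), slopes=(3.0,))  # uniform per-record price
```

In this representation, the abstract's bound says that the sum of `kinks()` over all datasets' optimal pricing functions is at most the number of buyers, so with many more datasets than buyers most schedules would look like `linear`.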

Problem

Research questions and friction points this paper is trying to address.

revenue-optimal pricing

budget-constrained buyers

data markets

pricing functions

quasi-linear utilities

Innovation

Methods, ideas, or system contributions that make the work stand out.

revenue-optimal pricing

budget-constrained buyers

piecewise linear convex pricing