Trading Vector Data in Vector Databases

📅 2025-11-10

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This paper addresses three key challenges in cross-domain learning for vector databases: heterogeneous and partial feedback in retrieval configuration, variable and complex feedback in pricing learning, and strong coupling between configuration and pricing decisions. The authors propose the first hierarchical bandit framework to jointly optimize both tasks. The method uses a two-stage learning mechanism: (i) a context-aware clustering and confidence-driven exploration stage for low-regret configuration learning; and (ii) an interval-based pricing strategy coupled with local Taylor approximation to model buyer response, which decouples the joint decision-making process. Theoretically, the algorithm achieves polynomial time complexity and a sublinear regret bound. Experiments on four real-world datasets show improvements in cumulative revenue and regret reduction over state-of-the-art baselines.

📝 Abstract

Vector data trading is essential for cross-domain learning with vector databases, yet it remains largely unexplored. We study this problem under online learning, where sellers face uncertain retrieval costs and buyers provide stochastic feedback to posted prices. Three main challenges arise: (1) heterogeneous and partial feedback in configuration learning, (2) variable and complex feedback in pricing learning, and (3) inherent coupling between configuration and pricing decisions. We propose a hierarchical bandit framework that jointly optimizes retrieval configurations and pricing. Stage I employs contextual clustering with confidence-based exploration to learn effective configurations with logarithmic regret. Stage II adopts interval-based price selection with local Taylor approximation to estimate buyer responses and achieve sublinear regret. We establish theoretical guarantees with polynomial time complexity and validate the framework on four real-world datasets, demonstrating consistent improvements in cumulative reward and regret reduction compared with existing methods.
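To make "confidence-based exploration" concrete, here is a minimal sketch using the standard UCB1 rule over a small set of candidate retrieval configurations. It is a generic textbook bandit, not the paper's Stage I algorithm: the contextual clustering step is omitted, and the function names, the Bernoulli reward model, and the `true_means` values are all assumptions made for illustration.

```python
import math
import random

def ucb1_select(counts, means, t, c=2.0):
    """Return the index of the arm with the highest upper confidence bound."""
    # Try every configuration once before trusting the confidence bounds.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [m + math.sqrt(c * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)

def run_ucb1(true_means, rounds, seed=0):
    """Simulate UCB1 on hypothetical Bernoulli rewards; return pull counts."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k   # times each configuration was chosen
    means = [0.0] * k  # running average reward per configuration
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, means, t)
        # Hypothetical feedback: 1 if the configuration served the query well.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts

if __name__ == "__main__":
    print(run_ucb1([0.2, 0.5, 0.8], rounds=5000))
```

The exploration bonus shrinks as a configuration is sampled more often, so pulls concentrate on the best configuration while weaker ones are still revisited occasionally. This is the mechanism behind logarithmic regret in the classical UCB1 analysis.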

Problem

Research questions and friction points this paper is trying to address.

Optimizing retrieval configurations and pricing in vector data trading

Addressing uncertain retrieval costs and stochastic buyer price feedback

Solving coupling between configuration and pricing decisions efficiently

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical bandit framework for joint optimization

Contextual clustering with confidence-based exploration

Interval-based price selection with Taylor approximation
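One way to picture interval-based price selection with a local Taylor approximation is sketched below. Within a single price interval, the buyer's acceptance probability is approximated to first order around a reference price `p0` as `a + b * (p - p0)`, and the price maximizing the resulting expected revenue is picked inside that interval. This is an illustrative reading of the idea, not the paper's Stage II procedure; the helper names, the least-squares fit, and the linear demand used in the example are assumptions.

```python
def fit_local_linear(prices, accepts, p0):
    """Least-squares fit of accept ≈ a + b * (price - p0) inside one interval."""
    n = len(prices)
    xs = [p - p0 for p in prices]
    x_bar = sum(xs) / n
    y_bar = sum(accepts) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        # All observations at one price: the slope cannot be estimated.
        return y_bar, 0.0
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, accepts)) / sxx
    a = y_bar - b * x_bar
    return a, b

def best_price_in_interval(lo, hi, p0, a, b):
    """Maximize p * (a + b * (p - p0)) over the interval [lo, hi]."""
    def revenue(p):
        return p * (a + b * (p - p0))
    candidates = [lo, hi]
    if b < 0:
        # Revenue is concave in p; its stationary point is (b*p0 - a) / (2*b).
        stationary = (b * p0 - a) / (2 * b)
        if lo <= stationary <= hi:
            candidates.append(stationary)
    return max(candidates, key=revenue)

if __name__ == "__main__":
    # Hypothetical observations from the demand curve d(p) = 1 - 0.1 * p.
    a, b = fit_local_linear([4.0, 5.0, 6.0], [0.6, 0.5, 0.4], p0=5.0)
    print(best_price_in_interval(4.0, 6.0, 5.0, a, b))
```

Restricting the fit to one interval keeps the linear model local, so a curved demand function can still be handled piece by piece across several intervals.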

🔎 Similar Papers

When Large Language Models Meet Vector Databases: A Survey