🤖 AI Summary
This paper addresses two key challenges in data selection: (1) the lack of a unified theoretical foundation for data value modeling, and (2) high computational complexity. To tackle these, we propose a sequential decision-making framework that formally casts data selection as a dynamic programming problem—yielding a principled definition of optimal data value. This formulation unifies and interprets existing methods (e.g., Data Shapley) and establishes theoretical optimality guarantees for greedy selection under monotone submodular utility. Furthermore, we design a bipartite graph neural network to learn a surrogate utility function and integrate it with approximate dynamic programming for scalable inference. Extensive experiments across diverse datasets demonstrate substantial improvements in selection quality. Our approach bridges rigorous theoretical guarantees with practical scalability, offering a novel paradigm for quantifying data value and enabling efficient, principled data selection.
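As a rough illustration of the dynamic-programming view described above (notation ours, not necessarily the paper's): with state $S$ (the currently selected subset), ground-truth utility $U$, and selection budget $k$, the optimal value of a partial selection satisfies a Bellman recursion,

$$
V(S) = \max_{i \notin S} \Big[ \underbrace{U(S \cup \{i\}) - U(S)}_{\text{immediate reward}} + V(S \cup \{i\}) \Big], \qquad V(S) = 0 \ \text{ if } |S| = k.
$$

A myopic approximation drops the continuation term $V(S \cup \{i\})$ and scores each point by its marginal contribution $U(S \cup \{i\}) - U(S)$ alone, which is the sense in which marginal-contribution methods such as Data Shapley arise as reward-function approximations to the sequential problem.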
📝 Abstract
Data selection has emerged as a crucial downstream application of data valuation. While existing data valuation methods have shown promise in selection tasks, the theoretical foundations and full potential of using data values for selection remain largely unexplored. In this work, we first demonstrate that data selection guided by data values can be naturally reformulated as a sequential decision-making problem, where the optimal data value can be derived through dynamic programming. We show that this framework unifies and reinterprets existing methods such as Data Shapley through the lens of approximate dynamic programming, specifically as myopic reward-function approximations to this sequential problem. Furthermore, we analyze how sequential data selection optimality is affected when the ground-truth utility function exhibits monotone submodularity with curvature. To address the computational challenges in obtaining optimal data values, we propose an efficient approximation scheme using learned bipartite graphs as surrogate utility models, ensuring that greedy selection remains optimal when the surrogate utility is correctly specified and learned. Extensive experiments demonstrate the effectiveness of our approach across diverse datasets.
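To make the greedy-selection setting concrete, here is a minimal sketch (our toy example, not the paper's learned surrogate): a coverage function is monotone submodular, and for cardinality-constrained maximization of such utilities, greedy selection of the largest marginal gain enjoys the classic $(1 - 1/e)$ approximation guarantee.

```python
def coverage_utility(selected, items):
    """Toy monotone submodular utility: number of distinct labels covered."""
    covered = set()
    for i in selected:
        covered |= items[i]
    return len(covered)

def greedy_select(items, budget):
    """Repeatedly add the item with the largest marginal utility gain."""
    selected = []
    for _ in range(budget):
        best, best_gain = None, -1
        for i in range(len(items)):
            if i in selected:
                continue
            gain = (coverage_utility(selected + [i], items)
                    - coverage_utility(selected, items))
            if gain > best_gain:
                best, best_gain = i, gain
        selected.append(best)
    return selected

# Each "data point" covers a set of labels; budget of 2 suffices here.
items = [{1, 2}, {2, 3}, {4}, {1, 2, 3}]
print(greedy_select(items, 2))  # -> [3, 2], covering all labels {1, 2, 3, 4}
```

The paper's scheme replaces a hand-specified utility like this with a learned bipartite-graph surrogate, under which the same greedy procedure is shown to remain optimal when the surrogate is correctly specified and learned.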