The Importance of Parameters in Ranking Functions

📅 2026-01-09
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates how to quantify the influence of column weights in tabular data on tuple ranking outcomes to explain the behavior of ranking functions. Building upon the SHAP explanation framework and leveraging Shapley value theory alongside computational complexity analysis, the study systematically examines ranking functions based on lexicographic order or aggregate scores (e.g., sum, min, max), combined with global, top-k, and local effect functions. The primary contribution lies in the first complete characterization of the computational complexity of exactly computing SHAP scores across these settings: some cases admit polynomial-time algorithms, while others are #P-hard. Furthermore, the paper establishes that an additive fully polynomial-time randomized approximation scheme (FPRAS) exists for all cases, providing theoretical guarantees for scalable approximate explanations.
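For context, the SHAP scores studied here are instances of the classical Shapley value. In standard form, for a set $N$ of $n$ players and a cooperative game $v$, player $i$ receives

$$
\varphi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n-|S|-1)!}{n!}\,\bigl(v(S \cup \{i\}) - v(S)\bigr).
$$

In this paper's setting (as described in the abstract), the players are the table's columns, and $v(S)$ captures the expected value of the chosen effect function when the weights of columns in $S$ are fixed to their actual values while the remaining weights are drawn from the underlying distribution.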

📝 Abstract
How important is the weight of a given column in determining the ranking of tuples in a table? To address such an explanation question about a ranking function, we investigate the computation of SHAP scores for column weights, adopting a recent framework by Grohe et al. [ICDT'24]. The exact definition of this score depends on three key components: (1) the ranking function in use, (2) an effect function that quantifies the impact of using alternative weights on the ranking, and (3) an underlying weight distribution. We analyze the computational complexity of different instantiations of this framework for a range of fundamental ranking and effect functions, focusing on probabilistically independent finite distributions for individual columns. For the ranking functions, we examine lexicographic orders and score-based orders defined by the summation, minimum, and maximum functions. For the effect functions, we consider global, top-k, and local perspectives: global measures quantify the divergence between the perturbed and original rankings, top-k measures inspect the change in the set of top-k answers, and local measures capture the impact on an individual tuple of interest. Although all cases admit an additive fully polynomial-time randomized approximation scheme (FPRAS), we establish the complexity of exact computation, identifying which cases are solvable in polynomial time and which are #P-hard. We further show that all complexity results, both lower and upper bounds, extend to a related task of computing the Shapley value of whole columns (regardless of their weight).
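To make the FPRAS claim concrete, the sketch below estimates the SHAP score of one column weight by permutation sampling, for a sum-based ranking and a (normalized) top-k effect function. This is an illustrative Monte Carlo sketch, not the paper's algorithm: the choice of `topk_effect` normalization, the representation of each column's finite weight distribution as a plain list, and the single-draw estimate of the expected effect are all simplifying assumptions.

```python
import random

def sum_score_ranking(table, weights):
    # Rank tuple indices by weighted sum of attributes, descending.
    return sorted(range(len(table)),
                  key=lambda t: -sum(w * x for w, x in zip(weights, table[t])))

def topk_effect(table, weights, perturbed, k):
    # Top-k effect: normalized symmetric difference between the original
    # and perturbed top-k answer sets (one of several reasonable choices).
    orig = set(sum_score_ranking(table, weights)[:k])
    new = set(sum_score_ranking(table, perturbed)[:k])
    return len(orig ^ new) / (2 * k)

def shap_estimate(table, weights, dists, i, k, n_samples=2000, rng=None):
    # Monte Carlo estimate of the SHAP score of column i's weight.
    # dists[j] is the finite support of column j's independent weight
    # distribution (uniform over the list, for simplicity).
    rng = rng or random.Random(0)
    n = len(weights)

    def one_draw_effect(fixed):
        # Fix weights in `fixed` to their actual values; draw the rest.
        w = [weights[j] if j in fixed else rng.choice(dists[j])
             for j in range(n)]
        return topk_effect(table, weights, w, k)

    total = 0.0
    for _ in range(n_samples):
        perm = list(range(n))
        rng.shuffle(perm)
        before = set(perm[:perm.index(i)])
        # Marginal contribution of column i in a random permutation.
        total += one_draw_effect(before | {i}) - one_draw_effect(before)
    return total / n_samples
```

Permutation sampling gives an additive approximation whose error shrinks with the number of samples, which matches the flavor of the FPRAS guarantee; exact computation, as the paper shows, can be #P-hard depending on the ranking and effect functions.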
Problem

Research questions and friction points this paper is trying to address.

ranking functions
SHAP scores
computational complexity
column weights
Shapley value
Innovation

Methods, ideas, or system contributions that make the work stand out.

SHAP scores
ranking functions
computational complexity
Shapley values
effect functions