Weighted Set Multi-Cover on Bounded Universe and Applications in Package Recommendation

📅 2026-03-12
🤖 AI Summary
This work studies the Weighted Set Multi-Cover problem with a Bounded Universe (WSMC-BU), which models the selection of a small, low-cost subset from a large collection of candidates under coverage constraints. The paper presents an exact algorithm that runs in polynomial time when the universe size is constant, together with two approximation algorithms that achieve approximation ratios of 2 and (2+ε), respectively. Using dynamic programming, linear programming, and rounding techniques, the proposed methods consistently outperform greedy and standard LP-rounding baselines in both solution quality and runtime on real and synthetic datasets, which supports their use in large-scale data selection tasks such as recommendation and decision support.

📝 Abstract
The weighted set multi-cover problem is a fundamental generalization of set cover that arises in data-driven applications where one must select a small, low-cost subset from a large collection of candidates under coverage constraints. In data management settings, such problems arise naturally either as expressive database queries or as post-processing steps over query results, for example, when selecting representative or diverse subsets from large relations returned by database queries for decision support, recommendation, fairness-aware data selection, or crowd-sourcing. While the general weighted set multi-cover problem is NP-complete, many practical workloads involve a bounded universe of items that must be covered, leading to the Weighted Set Multi-Cover with Bounded Universe (WSMC-BU) problem, where the universe size is constant. In this paper, we develop exact and approximation algorithms for WSMC-BU. We first discuss a dynamic programming algorithm that solves WSMC-BU exactly in $O(n^{\ell+1})$ time, where $n$ is the number of input sets and $\ell=O(1)$ is the universe size. We then present a $2$-approximation algorithm based on linear programming and rounding, running in $O(\mathcal{L}(n))$ time, where $\mathcal{L}(n)$ denotes the complexity of solving a linear program with $O(n)$ variables. To further improve efficiency for large datasets, we propose a faster $(2+\varepsilon)$-approximation algorithm with running time $O(n \log n + \mathcal{L}(\log W))$, where $W$ is the ratio of the total weight to the minimum weight, and $\varepsilon$ is an arbitrary constant specified by the user. Extensive experiments on real and synthetic datasets demonstrate that our methods consistently outperform greedy and standard LP-rounding baselines in both solution quality and runtime, making them suitable for data-intensive selection tasks over large query outputs.
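
To make the exact approach concrete, the following is a minimal sketch of a dynamic program over residual demand vectors. It is an illustration written for this summary, not the paper's algorithm as published. It assumes a formulation the abstract does not spell out: each universe element `e` has an integer demand `demand[e]`, each input set may be chosen at most once, and a solution is feasible when every element is contained in at least `demand[e]` chosen sets. The function name `wsmc_bu_exact` and its parameters are hypothetical.

```python
from math import inf
from typing import Dict, Optional, Sequence, Set, Tuple


def wsmc_bu_exact(
    sets: Sequence[Set[int]],
    weights: Sequence[float],
    demand: Sequence[int],
) -> Optional[float]:
    """Return the minimum total weight of a feasible multi-cover, or None.

    Assumed formulation (not taken from the paper): elements are indexed
    0..l-1, element e must appear in at least demand[e] chosen sets, and
    each input set can be chosen at most once.
    """
    # A state is the vector of demands still unmet; best[state] is the
    # cheapest cost found so far that reaches that state.
    best: Dict[Tuple[int, ...], float] = {tuple(demand): 0.0}

    for members, w in zip(sets, weights):
        # Start from a copy: every old state survives if we skip this set.
        nxt = dict(best)
        for state, cost in best.items():
            # Taking the set lowers the residual demand of each element
            # it contains, never below zero.
            new_state = tuple(
                max(0, r - 1) if e in members else r
                for e, r in enumerate(state)
            )
            new_cost = cost + w
            if new_cost < nxt.get(new_state, inf):
                nxt[new_state] = new_cost
        best = nxt

    # Feasible exactly when the all-zero residual vector was reached.
    return best.get(tuple(0 for _ in demand))
```

For example, with sets `[{0}, {0, 1}, {0}, {1}]`, weights `[2, 6, 3, 2]`, and demands `[2, 1]`, the sketch returns `7`, picking the two singleton sets on element 0 and the singleton set on element 1. The number of states is at most the product of `demand[e] + 1` over all elements, so with a constant-size universe and demands no larger than the number of sets, the table stays polynomial in the number of sets.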
Problem

Research questions and friction points this paper is trying to address.

Weighted Set Multi-Cover
Bounded Universe
Package Recommendation
Coverage Constraints
Data-Driven Selection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Weighted Set Multi-Cover
Bounded Universe
Dynamic Programming
Linear Programming Rounding
Approximation Algorithm