Infrequent Resolving Algorithm for Online Linear Programming

📅 2024-08-01
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Online linear programming (OLP) faces a fundamental trade-off between computational efficiency and theoretical performance: frequent LP re-solving ensures accuracy but incurs high overhead, while infrequent updates risk suboptimal decisions. Method: We propose a sparse resolving framework that solves the primal LP exactly only at $O(\log\log T)$ critical time points and performs lightweight first-order updates elsewhere. The approach integrates stochastic programming modeling, piecewise adaptive decision rules, and online learning under unknown finite-support distributions. Contribution/Results: This is the first OLP algorithm achieving a constant regret bound under unknown finite-support distributions, while reducing LP solves to $O(\log\log T)$. We further introduce an adjustable-frequency infrequent-resolving framework: with $M$ total LP solves, it attains asymptotically optimal regret $O\big(T^{(1/2)^{M-1}}\big)$. Experiments demonstrate consistent superiority over state-of-the-art LP-based and LP-free baselines across diverse problem instances.
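A schedule with only $O(\log\log T)$ exact solves can be built by repeated square-rooting of the remaining horizon, so that each resolve halves the exponent of the leftover gap. The sketch below is a hypothetical illustration of such a doubling-style schedule, not the paper's exact construction; `resolving_schedule` and its quantities are names chosen here for illustration.

```python
from math import isqrt

def resolving_schedule(T):
    """Illustrative geometric resolving schedule for horizon T.

    Resolve at t_k = T - g_k, where g_1 = isqrt(T) and each
    subsequent gap is the integer square root of the previous one.
    The gap shrinks as T^(1/2)^k, so the loop runs O(log log T) times.
    """
    points, gap = [], isqrt(T)
    while gap >= 2:
        points.append(T - gap)  # resolve when `gap` periods remain
        gap = isqrt(gap)        # next gap: square root of the current one
    return points
```

For `T = 10000` this yields resolves at `[9900, 9990, 9997]` — three exact solves, clustered near the end of the horizon, consistent with the paper's observation that resolving near the end of the selling horizon is valuable.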

📝 Abstract
Online linear programming (OLP) has gained significant attention from both researchers and practitioners due to its extensive applications, such as online auctions, network revenue management, order fulfillment, and advertising. Existing OLP algorithms fall into two categories: LP-based algorithms and LP-free algorithms. The former typically guarantees better performance, even offering a constant regret, but requires solving a large number of LPs, which can be computationally expensive. In contrast, LP-free algorithms require only first-order computations but yield worse performance, lacking a constant regret bound. In this work, we bridge the gap between these two extremes by proposing a well-performing algorithm that solves LPs at a few selected time points and conducts first-order computations at all other time points. Specifically, for the case where the inputs are drawn from an unknown finite-support distribution, the proposed algorithm achieves a constant regret (even for the hard "degenerate" case) while solving LPs only $\mathcal{O}(\log\log T)$ times over the time horizon $T$. Moreover, when we are allowed to solve LPs only $M$ times, we design the corresponding schedule such that the proposed algorithm can guarantee a nearly $\mathcal{O}\left(T^{(1/2)^{M-1}}\right)$ regret. Our work highlights the value of resolving both at the beginning and at the end of the selling horizon, and provides a novel framework to prove the performance guarantee of the proposed policy under different infrequent resolving schedules. Furthermore, when the arrival probabilities are known at the beginning, our algorithm can guarantee a constant regret by solving LPs $\mathcal{O}(\log\log T)$ times, and a nearly $\mathcal{O}\left(T^{(1/2)^{M}}\right)$ regret by solving LPs only $M$ times. Numerical experiments are conducted to demonstrate the efficiency of the proposed algorithms.
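The first-order computations that the abstract contrasts with LP solves can be illustrated by a standard dual-price (subgradient) decision rule for a single-resource OLP: accept a request when its reward beats the shadow price of the resources it consumes, then nudge the price toward the target consumption rate. This is a minimal sketch of the generic LP-free side only, with names and the step size chosen here for illustration; it is not the paper's algorithm, which additionally re-solves the LP exactly at a few scheduled time points.

```python
def dual_price_olp(rewards, sizes, budget, eta=0.5):
    """First-order OLP sketch: threshold decisions via a dual price.

    rewards[t], sizes[t]: reward and resource use of request t.
    budget: total resource available over the horizon.
    eta: subgradient step size (illustrative choice).
    """
    T = len(rewards)
    rho = budget / T                 # target per-period consumption
    p, used, value, decisions = 0.0, 0.0, 0.0, []
    for r, a in zip(rewards, sizes):
        # Accept iff reward exceeds the priced-out resource cost
        # and the hard budget constraint is not violated.
        x = 1 if (r > p * a and used + a <= budget) else 0
        used += a * x
        value += r * x
        decisions.append(x)
        # Projected subgradient step: raise the price when we spend
        # faster than rho per period, lower it (toward 0) otherwise.
        p = max(0.0, p + eta * (a * x - rho))
    return value, decisions
```

On the toy instance `rewards=[3, 1, 4, 1, 5]`, `sizes=[1, 1, 1, 1, 1]`, `budget=3`, this greedy-with-prices rule spends the budget on the first three arrivals and rejects the rest once the budget is exhausted — exactly the kind of myopic behavior that occasional exact LP re-solves are meant to correct.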
Problem

Research questions and friction points this paper is trying to address.

Linear Programming
Network Problems
Efficient Algorithms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Online Linear Programming
Regret Minimization
Efficiency Optimization
Guokai Li
School of Data Science, The Chinese University of Hong Kong, Shenzhen, Guangdong, 518172, P.R. China
Zizhuo Wang
The Chinese University of Hong Kong, Shenzhen / Cardinal Operations
Operations Research · Optimization · Operations Management · Revenue Management
Jingwei Zhang
School of Data Science, The Chinese University of Hong Kong, Shenzhen, Guangdong, 518172, P.R. China