🤖 AI Summary
Online linear programming (OLP) faces a fundamental trade-off between computational efficiency and regret minimization. Method: We propose a synergistic framework combining periodic LP re-solving with parallel first-order optimization. Our "wait-less" architecture periodically invokes an offline LP solver to obtain high-precision dual prices, while concurrently performing parallel gradient updates to smooth resource consumption and mitigate decision latency; dual price migration and periodic synchronization ensure theoretical consistency. Contribution/Results: We establish the first regret bound of $O(\log(T/f) + \sqrt{f})$ achievable in nearly linear time, significantly improving upon traditional LP-based methods (high computational overhead) and pure first-order approaches (large regret). Experiments demonstrate strong scalability and efficacy in large-scale, real-time revenue management and dynamic resource allocation tasks.
📝 Abstract
Online linear programming (OLP) has found broad applications in revenue management and resource allocation. State-of-the-art OLP algorithms achieve low regret by repeatedly solving linear programming (LP) subproblems that incorporate updated resource information. However, LP-based methods are computationally expensive and often inefficient for large-scale applications. In contrast, recent first-order OLP algorithms are more computationally efficient but typically suffer from worse regret guarantees. To address these shortcomings, we propose a new algorithm that combines the strengths of LP-based and first-order OLP methods. The algorithm re-solves the LP subproblems periodically at a predefined frequency $f$ and uses the latest dual prices to guide online decision-making. In addition, a first-order method runs in parallel during each interval between LP re-solves, smoothing resource consumption. Our algorithm achieves $\mathscr{O}(\log(T/f) + \sqrt{f})$ regret, delivering a "wait-less" online decision-making process that balances the computational efficiency of first-order methods and the superior regret guarantee of LP-based methods.
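The hybrid scheme described above can be illustrated with a minimal single-resource sketch: every $f$ rounds the dual price is recomputed from the data observed so far (playing the role of the LP re-solve, which for one resource reduces to a ratio threshold), and between re-solves a projected-subgradient step updates the price to smooth consumption. All function names, the step size `eta`, and the data model below are illustrative assumptions, not the paper's actual algorithm or notation.

```python
import random

def resolve_dual_price(rewards, costs, budget):
    """Assumed stand-in for the offline LP re-solve: with a single
    resource, the optimal dual price is the reward/cost ratio at which
    greedily accepting the best ratios exhausts the (scaled) budget."""
    order = sorted(range(len(rewards)),
                   key=lambda i: rewards[i] / costs[i], reverse=True)
    used = 0.0
    for i in order:
        used += costs[i]
        if used > budget:
            return rewards[i] / costs[i]
    return 0.0  # budget never binds: shadow price is zero

def hybrid_olp(rewards, costs, budget, f, eta=0.05):
    """Hedged sketch of the LP/first-order hybrid: periodic re-solve at
    frequency f, projected-subgradient dual updates in between."""
    T = len(rewards)
    p = 0.0            # current dual price
    remaining = budget
    revenue = 0.0
    for t in range(T):
        if t > 0 and t % f == 0:
            # Periodic "LP re-solve" on observed data, budget scaled
            # proportionally to the elapsed fraction of the horizon.
            p = resolve_dual_price(rewards[:t], costs[:t], budget * t / T)
        # Dual-price decision rule: accept iff reward beats priced cost.
        x = 1 if rewards[t] > p * costs[t] and costs[t] <= remaining else 0
        revenue += rewards[t] * x
        remaining -= costs[t] * x
        # First-order update between re-solves: move the price toward
        # the target per-round consumption rate, projected onto p >= 0.
        p = max(0.0, p + eta * (costs[t] * x - budget / T))
    return revenue, remaining
```

In the real algorithm the re-solve handles many resources via an LP solver and runs concurrently with decision-making; here it is inlined sequentially only to keep the sketch self-contained.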