🤖 AI Summary
This work addresses online sequential subsidy allocation in large-scale ride-hailing markets with a hierarchical diffusion-based framework. A prefix-conditioned diffusion model generates future supply-demand trajectories, and a context-conditioned inverse decoding module turns them into city-level subsidy signals. A Lagrangian-dual mapping embeds subsidy-rate caps directly into order-driver incentives, enabling low-latency, fine-grained decisions that respect the caps without iterative optimization. The prefix-conditioning mechanism aligns training with the fixed-history setting of online deployment, and multi-city pretraining with parameter-efficient fine-tuning supports transfer across heterogeneous cities. Offline evaluations show improved completed ride volume and gross merchandise value (GMV) with better cap compliance, and a real-world A/B test confirms significant uplift while keeping budget-related violation metrics within operational thresholds.
📝 Abstract
Ride-hailing platforms like DiDi Chuxing operate in highly dynamic environments where balancing driver supply and passenger demand is critical. Although driver-side subsidies serve as a primary lever to align these forces and improve KPIs such as completed rides (\texttt{Rides}) and gross merchandise value (\texttt{GMV}), optimizing them in production requires simultaneously meeting three requirements: (i) responsiveness to stochastic shocks, (ii) strict subsidy-rate caps, and (iii) low-latency execution at city scale. These requirements rule out expensive per-order optimization and call for a forward-looking, constraint-aware city-level controller for online sequential decision-making. To this end, we introduce D$^3$-Subsidy (Dynamic Driver-side Diffusion-based Subsidy), a hierarchical diffusion-based framework for deployable city-wide subsidy control. To bridge the train-inference gap, D$^3$-Subsidy employs a prefix-conditioned diffusion model that samples plausible future trajectories from immutable historical observations, ensuring the training protocol aligns with the fixed-history nature of online deployment. These generated plans are then decoded by a context-conditioned inverse module into low-dimensional city-level control signals. For scalable execution, we connect city-level planning to fine-grained dispatch via a Lagrangian-dual-derived mapping, which embeds subsidy-rate caps directly into order-driver incentives without iterative optimization. Additionally, a multi-city pretraining strategy with parameter-efficient fine-tuning enables robust transfer across heterogeneous cities. Extensive offline evaluations demonstrate that D$^3$-Subsidy improves \texttt{Rides} and \texttt{GMV} while enhancing cap compliance, and a real-world A/B test confirms significant uplift while keeping budget-related violation metrics within operational thresholds.
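To illustrate why a Lagrangian-dual mapping can avoid iterative optimization, the following is a minimal sketch of generic Lagrangian decomposition, not the formulation used in D$^3$-Subsidy. It assumes each order-driver pair has a small set of candidate subsidy levels with an estimated value and cost, and that a single city-level multiplier `lam` is supplied by the upstream controller. The function name `dual_mapped_subsidies`, the arrays `value` and `cost`, and all numbers are hypothetical. Once `lam` is fixed, the relaxed objective separates across orders, so each order picks the level maximizing `value - lam * cost` independently.

```python
import numpy as np


def dual_mapped_subsidies(value, cost, lam):
    """Choose one candidate subsidy level per order for a fixed multiplier.

    value, cost: arrays of shape (n_orders, n_levels); hypothetical per-level
                 estimates of benefit and subsidy spend.
    lam: non-negative scalar price on subsidy spend (assumed to come from a
         city-level controller).
    Returns the index of the chosen level for each order.
    """
    value = np.asarray(value, dtype=float)
    cost = np.asarray(cost, dtype=float)
    # With lam fixed, the Lagrangian separates across orders, so a row-wise
    # argmax solves the relaxed problem in a single pass.
    score = value - lam * cost
    return np.argmax(score, axis=1)


# Toy data (made up): two orders, three candidate subsidy levels each.
value = np.array([[0.0, 1.0, 1.5],
                  [0.0, 0.3, 0.35]])
cost = np.array([[0.0, 1.0, 2.0],
                 [0.0, 1.0, 2.0]])

choice_low = dual_mapped_subsidies(value, cost, lam=0.1)
choice_high = dual_mapped_subsidies(value, cost, lam=0.8)

rows = np.arange(len(cost))
spend_low = cost[rows, choice_low].sum()
spend_high = cost[rows, choice_high].sum()
```

In this toy setting, raising `lam` lowers total spend, which is how a single low-dimensional city-level signal can steer many per-order decisions toward a spending cap.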