Generative Bid Shading in Real-Time Bidding Advertising

📅 2025-08-05
🤖 AI Summary
In real-time bidding (RTB), existing two-stage bid shading methods suffer from restrictive unimodal assumptions, error propagation across stages, and sample selection bias. Generative Bid Shading (GBS) addresses these issues with an end-to-end generative model that produces shading ratios autoregressively through stepwise residuals, capturing complex value dependencies without predefined priors. A reward preference alignment system uses a Channel-aware Hierarchical Dynamic Network (CHNet) as the reward model, together with surplus optimization and exploration utility reward alignment modules, and optimizes both short-term and long-term surplus with Group Relative Policy Optimization (GRPO). Offline experiments and online A/B tests validate GBS's effectiveness. GBS is deployed on the Meituan demand-side platform (DSP), where it serves billions of bid requests daily.

📝 Abstract
Bid shading plays a crucial role in Real-Time Bidding (RTB) by adaptively adjusting the bid to keep advertisers from overspending. Existing mainstream two-stage methods, which first model bid landscapes and then optimize surplus using operations research techniques, are constrained by unimodal assumptions that fail to adapt to non-convex surplus curves, and they are vulnerable to cascading errors in sequential workflows. Additionally, existing models that discretize continuous values ignore the dependence between discrete intervals, which reduces the model's error correction ability, while sample selection bias in bidding scenarios presents further challenges for prediction. To address these issues, this paper introduces Generative Bid Shading (GBS), which comprises two primary components: (1) an end-to-end generative model that uses an autoregressive approach to generate shading ratios by stepwise residuals, capturing complex value dependencies without relying on predefined priors; and (2) a reward preference alignment system, which incorporates a channel-aware hierarchical dynamic network (CHNet) as the reward model to extract fine-grained features, along with modules for surplus optimization and exploration utility reward alignment, ultimately optimizing both short-term and long-term surplus using group relative policy optimization (GRPO). Extensive offline experiments and online A/B tests validate GBS's effectiveness. Moreover, GBS has been deployed on the Meituan DSP platform, serving billions of bid requests daily.
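
To make the "non-convex surplus curve" point concrete, here is a minimal sketch, not taken from the paper. It assumes a first-price auction in which the expected surplus of a shaded bid is (value - bid) × P(win | bid), and it models the win probability as a mixture of two logistic CDFs. The mode parameters, the function names, and the grid search are all illustrative assumptions.

```python
import math

# Hypothetical competing-bid landscape: two clusters of market prices.
# Each mode is (center, scale, weight); the numbers are made up for illustration.
MODES = ((2.0, 0.15, 0.5), (6.0, 0.30, 0.5))


def win_prob(bid, modes=MODES):
    """P(win | bid), modeled here as a mixture of logistic CDFs (an assumption)."""
    return sum(w / (1.0 + math.exp(-(bid - c) / s)) for c, s, w in modes)


def expected_surplus(value, ratio):
    """Expected surplus when bidding value * ratio in a first-price auction."""
    bid = value * ratio
    return (value - bid) * win_prob(bid)


def surplus_curve(value, steps=1000):
    """Surplus evaluated on an evenly spaced grid of shading ratios in [0, 1]."""
    return [expected_surplus(value, i / steps) for i in range(steps + 1)]


def best_ratio(value, steps=1000):
    """Grid-search the shading ratio with the highest expected surplus."""
    curve = surplus_curve(value, steps)
    return max(range(steps + 1), key=curve.__getitem__) / steps


def count_local_maxima(value, steps=1000):
    """Count strict interior local maxima of the surplus curve on the grid."""
    s = surplus_curve(value, steps)
    return sum(1 for i in range(1, steps) if s[i - 1] < s[i] > s[i + 1])


if __name__ == "__main__":
    v = 10.0
    print("local maxima:", count_local_maxima(v))
    print("best shading ratio:", best_ratio(v))
```

With this made-up bimodal landscape the surplus curve has more than one local maximum, so a method that assumes a single peak can settle on the wrong one. That is the failure mode the abstract attributes to unimodal two-stage approaches.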
Problem

Research questions and friction points this paper is trying to address.

Overcoming unimodal assumptions in bid shading models
Addressing discretization errors in continuous value modeling
Mitigating sample selection bias in bidding predictions
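
As an illustration of the discretization point above, the following sketch shows one possible reading of "generating a shading ratio by stepwise residuals": a coarse token picks a wide interval, and each later token refines what the earlier ones left over. The base-10 layout, the depth of 3, and the function names are assumptions made for this example. The paper's actual tokenization and its learned, autoregressive model are not reproduced here.

```python
def encode_ratio(ratio, depth=3, base=10):
    """Split a ratio in [0, 1) into coarse-to-fine tokens (hypothetical scheme)."""
    grid = base ** depth
    # Snap to an integer grid first so floating-point noise cannot flip a token.
    residual = min(max(round(ratio * grid), 0), grid - 1)
    tokens = []
    for k in range(depth - 1, -1, -1):
        token, residual = divmod(residual, base ** k)
        tokens.append(token)
    return tokens


def decode_ratio(tokens, base=10):
    """Rebuild the ratio by adding each token's contribution at a finer step."""
    ratio, step = 0.0, 1.0
    for token in tokens:
        if not 0 <= token < base:
            raise ValueError(f"token {token} outside [0, {base})")
        step /= base              # each step covers a 1/base slice of the last one
        ratio += token * step     # refine the running estimate by the residual
    return ratio


if __name__ == "__main__":
    tokens = encode_ratio(0.637)
    print(tokens, decode_ratio(tokens))
```

In a generative model each token would be predicted conditioned on the tokens before it, which is how dependence between intervals can be captured; here the encode and decode steps are deterministic and only show how the pieces compose into a continuous value.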
Innovation

Methods, ideas, or system contributions that make the work stand out.

End-to-end generative model for shading ratios
Channel-aware hierarchical dynamic reward model
Group relative policy optimization (GRPO) for short-term and long-term surplus
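
For readers unfamiliar with GRPO, the sketch below shows the group-relative advantage step as it is commonly described: several candidates are sampled for the same input, each is scored by a reward model, and each reward is normalized against the group's mean and standard deviation. The reward values and the function name are hypothetical, and the reward design used in GBS (CHNet scores, surplus, and exploration utility) is not reproduced here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    `rewards` holds one score per candidate generated for the same request.
    `eps` guards against division by zero when all rewards are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical rewards for four candidate shading ratios on one bid request.
    rewards = [0.8, 1.4, 0.2, 1.0]
    for r, a in zip(rewards, group_relative_advantages(rewards)):
        print(f"reward={r:.1f} advantage={a:+.3f}")
```

Because the baseline is the group mean rather than a learned value function, candidates are only compared with alternatives for the same request, so above-average candidates receive positive advantages and below-average ones negative advantages.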
Authors

Yinqiu Huang
Meituan, Chengdu, China
Hao Ma
Chongqing University, Chongqing, China
Wenshuai Chen
Meituan, Chengdu, China
Shuli Wang
Meituan, Chengdu, China
Yongqiang Zhang
Distinguished Professor, Institute of Geographic Sciences and Natural Resources Research, CAS
Xue Wei
Meituan, Chengdu, China
Yinhua Zhu
Meituan, Chengdu, China
Haitao Wang
Meituan, Chengdu, China
Xingxing Wang
Meituan, Beijing, China