Generative Bid Shading in Real-Time Bidding Advertising

📅 2025-08-05

🤖 AI Summary

In real-time bidding (RTB), existing two-stage bid shading methods suffer from restrictive unimodal assumptions, error propagation across stages, and sample selection bias. To address these issues, the paper proposes Generative Bid Shading (GBS), an end-to-end framework that generates shading ratios autoregressively through stepwise residuals, so it can capture complex value dependencies without a predefined prior such as unimodality. GBS pairs this generator with a reward preference alignment system: a channel-aware hierarchical dynamic network (CHNet) serves as the reward model, and group relative policy optimization (GRPO) optimizes the policy. Surplus optimization and exploration utility reward alignment modules target both short-term and long-term surplus. Offline experiments and online A/B tests validate GBS's effectiveness. GBS is deployed on Meituan's demand-side platform (DSP), where it serves billions of bid requests daily and helps advertisers avoid overspending.

📝 Abstract

Bid shading plays a crucial role in Real-Time Bidding (RTB) by adaptively adjusting the bid so that advertisers avoid overspending. Existing mainstream two-stage methods, which first model bid landscapes and then optimize surplus using operations research techniques, are constrained by unimodal assumptions that fail to adapt to non-convex surplus curves, and they are vulnerable to cascading errors in sequential workflows. Additionally, existing models that discretize continuous values ignore the dependence between discrete intervals, which reduces the model's error correction ability, while sample selection bias in bidding scenarios presents further challenges for prediction. To address these issues, this paper introduces Generative Bid Shading (GBS), which comprises two primary components: (1) an end-to-end generative model that uses an autoregressive approach to generate shading ratios by stepwise residuals, capturing complex value dependencies without relying on predefined priors; and (2) a reward preference alignment system, which incorporates a channel-aware hierarchical dynamic network (CHNet) as the reward model to extract fine-grained features, along with modules for surplus optimization and exploration utility reward alignment, ultimately optimizing both short-term and long-term surplus using group relative policy optimization (GRPO). Extensive offline experiments and online A/B tests validate GBS's effectiveness. Moreover, GBS has been deployed on the Meituan DSP, serving billions of bid requests daily.
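To make "generating shading ratios by stepwise residuals" concrete, here is a minimal sketch in Python. The abstract does not describe the tokenization or the network, so everything below is an assumption for illustration: the base-10 coarse-to-fine scheme, the names `encode_residuals`, `decode_residuals`, and `generate_ratio`, and the `replay_policy` stand-in for a trained model are all hypothetical. The idea shown is only that each step emits a token refining what the previous steps left over, and that each token is chosen conditioned on the prefix.

```python
from typing import Callable, List, Sequence


def encode_residuals(ratio: float, bins: int = 10, steps: int = 3) -> List[int]:
    """Tokenize a shading ratio in [0, 1] coarse-to-fine (assumed scheme).

    Each token quantizes the residual left over by the previous tokens,
    using a bin width that shrinks by a factor of `bins` per step.
    """
    residual = min(max(ratio, 0.0), 1.0)
    width = 1.0
    tokens: List[int] = []
    for _ in range(steps):
        width /= bins
        token = min(int(residual / width), bins - 1)
        tokens.append(token)
        residual -= token * width
    return tokens


def decode_residuals(tokens: Sequence[int], bins: int = 10) -> float:
    """Sum the per-step contributions back into a shading ratio."""
    ratio, width = 0.0, 1.0
    for token in tokens:
        width /= bins
        ratio += token * width
    return ratio


def generate_ratio(
    next_token: Callable[[Sequence[float], Sequence[int]], int],
    features: Sequence[float],
    bins: int = 10,
    steps: int = 3,
) -> float:
    """Autoregressive decoding: each token is chosen given the prefix so far."""
    tokens: List[int] = []
    for _ in range(steps):
        tokens.append(next_token(features, tokens))
    return decode_residuals(tokens, bins)


def replay_policy(features: Sequence[float], prefix: Sequence[int]) -> int:
    """Hypothetical stand-in for a trained model: replays the encoding of features[0]."""
    return encode_residuals(features[0])[len(prefix)]


ratio = generate_ratio(replay_policy, [0.7345])
```

Because later tokens depend on earlier ones, a fine-grained step can correct a coarse one, which is the dependence between discrete intervals that the abstract says independent discretization ignores. A real model would replace `replay_policy` with a learned conditional distribution over tokens.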

Problem

Research questions and friction points this paper is trying to address.

Overcoming unimodal assumptions in bid shading models

Addressing discretization errors in continuous value modeling

Mitigating sample selection bias in bidding predictions
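The first friction point can be illustrated with a small numeric sketch. It uses the standard first-price definition of expected surplus, (value - bid) × P(win | bid), and assumes, purely for illustration, that the highest competing price follows a mixture of logistic distributions. The function names and the mixture parameters are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple

Mode = Tuple[float, float, float]  # (weight, location, scale)


def win_probability(bid: float, modes: Sequence[Mode]) -> float:
    """P(win | bid) under an assumed logistic-mixture model of the competing price."""
    return sum(w / (1.0 + math.exp(-(bid - loc) / scale)) for w, loc, scale in modes)


def expected_surplus(bid: float, value: float, modes: Sequence[Mode]) -> float:
    """First-price surplus: the margin kept if the bid wins, times the chance of winning."""
    return (value - bid) * win_probability(bid, modes)


def local_maxima(value: float, modes: Sequence[Mode], steps: int = 1000) -> List[float]:
    """Grid-search bids in [0, value] and return every strict local maximum."""
    grid = [value * i / steps for i in range(steps + 1)]
    surplus = [expected_surplus(b, value, modes) for b in grid]
    return [
        grid[i]
        for i in range(1, steps)
        if surplus[i] > surplus[i - 1] and surplus[i] > surplus[i + 1]
    ]


# Two clusters of competing prices, e.g. two traffic sources with different floors.
BIMODAL: List[Mode] = [(0.5, 2.0, 0.2), (0.5, 6.0, 0.2)]
UNIMODAL: List[Mode] = [(1.0, 4.0, 0.2)]

bimodal_peaks = local_maxima(10.0, BIMODAL)
unimodal_peaks = local_maxima(10.0, UNIMODAL)
```

With the bimodal landscape the surplus curve has two separate peaks, one just above each price cluster, so it is not concave and a local search can settle on the wrong one. A model that forces a unimodal landscape merges the two clusters into one and would place the bid near neither peak.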

Innovation

Methods, ideas, or system contributions that make the work stand out.

End-to-end generative model for shading ratios

Channel-aware hierarchical dynamic reward model

Group relative policy optimization for surplus
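As a rough illustration of the "group relative" part of GRPO: the policy samples several candidates for the same input, each candidate is scored by a reward model, and each reward is standardized against the other rewards in its group instead of against a learned value baseline. The sketch below shows only that advantage computation. The function name, the use of population standard deviation, and the example rewards are assumptions, and how GBS defines its rewards is not specified here.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Standardize rewards within one group of candidates for the same bid request.

    Candidates that score above the group mean get a positive advantage and
    are reinforced; candidates below the mean are discouraged. `eps` guards
    against division by zero when all rewards in the group are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use the sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for three sampled shading ratios on one bid request.
advantages = group_relative_advantages([1.0, 2.0, 3.0])
```

Because the baseline is the group mean, no separate critic network is needed, and the scale of the reward model's output matters less than the ranking of candidates within each group.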
