Constraint-Aware Generative Auto-bidding via Pareto-Prioritized Regret Optimization

📅 2026-02-09
🤖 AI Summary
This work addresses why existing auto-bidding methods struggle to maximize marketing value under strict efficiency constraints such as Target Cost-Per-Action (CPA): Return-to-Go conditioning ignores the cost dimension, which causes state aliasing, and standard regression over-relies on imitating historical average behaviors. To overcome these limitations, the authors propose PRO-Bid, a framework built on the Decision Transformer architecture that decouples constraint satisfaction from utility optimization through two mechanisms. Constraint-Decoupled Pareto Representation (CDPR) decomposes the global constraint into recursive cost and value contexts to recover resource awareness, and reweights trajectories according to the Pareto frontier so that training focuses on high-efficiency data. Counterfactual Regret Optimization (CRO) uses a global outcome predictor to identify superior counterfactual actions and treats them as weighted regression targets, steering the policy toward the optimal constraint boundary. On two public benchmarks and in online A/B tests, PRO-Bid is reported to achieve better constraint satisfaction and marketing value than state-of-the-art baselines.
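
The summary does not say how trajectories are reweighted by the Pareto frontier, so the following is only a minimal sketch of one way the idea could look. It assumes each logged trajectory is summarized by a total cost and a total value, marks the non-dominated ones, and gives them a larger sampling weight. The function names `pareto_mask` and `trajectory_weights` and the `boost` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def pareto_mask(cost, value):
    """Return True for trajectories that no other trajectory dominates.

    Trajectory j dominates i when it costs no more, yields no less value,
    and is strictly better in at least one of the two.
    """
    cost = np.asarray(cost, dtype=float)
    value = np.asarray(value, dtype=float)
    mask = np.ones(len(cost), dtype=bool)
    for i in range(len(cost)):
        no_worse = (cost <= cost[i]) & (value >= value[i])
        strictly_better = (cost < cost[i]) | (value > value[i])
        mask[i] = not np.any(no_worse & strictly_better)
    return mask


def trajectory_weights(cost, value, boost=2.0):
    """Assumed weighting rule: frontier trajectories get `boost`, others 1.

    The weights are normalized to sum to 1 so they can be used as
    sampling probabilities or as per-trajectory loss weights.
    """
    weights = np.where(pareto_mask(cost, value), boost, 1.0)
    return weights / weights.sum()


if __name__ == "__main__":
    # Toy data: the last trajectory spends as much as the second
    # but earns less value, so it is dominated.
    total_cost = [1.0, 2.0, 3.0, 2.0]
    total_value = [1.0, 3.0, 4.0, 2.0]
    print(pareto_mask(total_cost, total_value))
    print(trajectory_weights(total_cost, total_value))
```

A binary boost is the simplest possible choice. A smoother variant could weight each trajectory by its distance to the frontier, but nothing in the summary indicates which form the authors use.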

📝 Abstract
Auto-bidding systems aim to maximize marketing value while satisfying strict efficiency constraints such as Target Cost-Per-Action (CPA). Although Decision Transformers provide powerful sequence modeling capabilities, applying them to this constrained setting encounters two challenges: 1) standard Return-to-Go conditioning causes state aliasing by neglecting the cost dimension, preventing precise resource pacing; and 2) standard regression forces the policy to mimic average historical behaviors, thereby limiting the capacity to optimize performance toward the constraint boundary. To address these challenges, we propose PRO-Bid, a constraint-aware generative auto-bidding framework based on two synergistic mechanisms: 1) Constraint-Decoupled Pareto Representation (CDPR) decomposes global constraints into recursive cost and value contexts to restore resource perception, while reweighting trajectories based on the Pareto frontier to focus on high-efficiency data; and 2) Counterfactual Regret Optimization (CRO) facilitates active improvement by utilizing a global outcome predictor to identify superior counterfactual actions. By treating these high-utility outcomes as weighted regression targets, the model transcends historical averages to approach the optimal constraint boundary. Extensive experiments on two public benchmarks and online A/B tests demonstrate that PRO-Bid achieves superior constraint satisfaction and value acquisition compared to state-of-the-art baselines.
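
The abstract describes decomposing the global constraint into recursive cost and value contexts but gives no formulas. As an illustration only, the sketch below assumes the cost context is the remaining budget and the value context is the value-to-go, each updated by subtracting the current step's cost or value. The function name `recursive_contexts` and this exact recursion are assumptions, not the paper's definition.

```python
import numpy as np


def recursive_contexts(step_cost, step_value, budget):
    """Build per-step (cost, value) conditioning signals for one trajectory.

    Assumed recursion (hypothetical, for illustration):
        remaining_budget[t+1] = remaining_budget[t] - step_cost[t]
        value_to_go[t+1]      = value_to_go[t]      - step_value[t]
    """
    step_cost = np.asarray(step_cost, dtype=float)
    step_value = np.asarray(step_value, dtype=float)
    horizon = len(step_cost)

    cost_ctx = np.empty(horizon)
    value_ctx = np.empty(horizon)
    remaining_budget = float(budget)
    value_to_go = float(step_value.sum())

    for t in range(horizon):
        # Record the context the policy would see before acting at step t.
        cost_ctx[t] = remaining_budget
        value_ctx[t] = value_to_go
        remaining_budget -= step_cost[t]
        value_to_go -= step_value[t]
    return cost_ctx, value_ctx


if __name__ == "__main__":
    cost_ctx, value_ctx = recursive_contexts(
        step_cost=[2.0, 3.0, 1.0], step_value=[1.0, 4.0, 2.0], budget=10.0
    )
    print(cost_ctx)   # remaining budget before each step
    print(value_ctx)  # value still to be collected before each step
```

Under this assumption, two trajectories with the same value-to-go but different remaining budgets produce different conditioning inputs, which is the kind of state aliasing that value-only Return-to-Go conditioning cannot resolve.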
Problem

Research questions and friction points this paper is trying to address.

auto-bidding
constraint satisfaction
Target CPA
resource pacing
performance optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Constraint-Aware Bidding
Pareto Representation
Counterfactual Regret Optimization
Decision Transformer
Auto-bidding
Authors
Binglin Wu
Dalian University of Technology
Yingyi Zhang
ByteDance
Content Understanding, MLLM, Computer Vision, Palmprint Recognition, Pose Estimation
Xianneng Li
Dalian University of Technology
Ruyue Deng
Alibaba International Digital Commerce Group
Chuan Yue
Professor of Computer Science, Colorado School of Mines
Web Security, Web Browsers, Usable Security and Privacy, Web-based Systems, AI System Security
Weiru Zhang
Alibaba International Digital Commerce Group
Xiaoyi Zeng
Alibaba International Digital Commerce Group