LBM: Hierarchical Large Auto-Bidding Model via Reasoning and Acting

📅 2026-03-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work targets two problems in automatic bidding. Existing methods generalize poorly in dynamic advertising environments because of black-box training and limited data coverage, while large language models deployed directly tend to hallucinate and make suboptimal decisions. The authors propose LBM, a hierarchical large model made up of LBM-Think for reasoning and LBM-Act for action generation. A dual-embedding mechanism fuses language and numerical inputs so that LBM-Act can be trained with language guidance, and GQPO, an offline reinforcement fine-tuning algorithm that needs neither simulation nor real-world rollouts, reduces LBM-Think's hallucinations and improves decision quality. Experiments show that a generative backbone built on LBM outperforms existing approaches, particularly in training efficiency and generalization.

📝 Abstract

The growing scale of ad auctions on online advertising platforms has intensified competition, making manual bidding impractical and necessitating auto-bidding to help advertisers achieve their economic goals. Current auto-bidding methods use offline reinforcement learning or generative methods to optimize bidding strategies, but they sometimes behave counterintuitively because of black-box training and the limited mode coverage of datasets, which makes it hard for them to understand task status and to generalize in dynamic ad environments. Large language models (LLMs) offer a promising solution by leveraging prior human knowledge and reasoning abilities to improve auto-bidding performance. However, directly applying LLMs to auto-bidding is difficult: competitive auctions require precise actions, and LLMs lack specialized auto-bidding knowledge, which can lead to hallucinations and suboptimal decisions. To address these challenges, we propose a hierarchical Large autoBidding Model (LBM) that leverages the reasoning capabilities of LLMs to develop a superior auto-bidding strategy. It consists of a high-level LBM-Think model for reasoning and a low-level LBM-Act model for action generation. Specifically, we propose a dual embedding mechanism that efficiently fuses two modalities, language and numerical inputs, for language-guided training of LBM-Act; we then propose an offline reinforcement fine-tuning technique, termed GQPO, that mitigates LBM-Think's hallucinations and enhances decision-making performance without the simulation or real-world rollouts required by previous multi-turn LLM-based methods. Experiments demonstrate the superiority of a generative backbone based on our LBM, especially in training efficiency and generalization ability.
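
The abstract does not specify how the dual embedding mechanism is built, so the following is only an illustrative sketch of one common way to fuse language and numerical inputs into a single sequence: tokens go through a lookup table, numeric features go through a linear projection, and the results are concatenated. Every name here (`DIM`, `make_token_table`, `make_projection`, `fuse`), the toy vocabulary, and the three numeric features are hypothetical and not taken from the paper; the random weights stand in for learned parameters.

```python
import random

DIM = 4  # hypothetical embedding width, not taken from the paper


def make_token_table(vocab, dim=DIM, seed=0):
    """Random lookup table standing in for a learned language embedding."""
    rng = random.Random(seed)
    return {tok: [rng.uniform(-1, 1) for _ in range(dim)] for tok in vocab}


def make_projection(n_features, dim=DIM, seed=1):
    """Random dim x n_features matrix standing in for a learned linear layer."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(dim)]


def embed_text(tokens, table):
    """Map each guidance token to its embedding vector."""
    return [table[tok] for tok in tokens]


def embed_numeric(features, weights):
    """Project a numeric state vector into the same embedding space."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def fuse(tokens, features, table, weights):
    """Concatenate token embeddings and one numeric embedding into a sequence."""
    return embed_text(tokens, table) + [embed_numeric(features, weights)]


# Hypothetical inputs: a short textual guidance and three numeric state features.
vocab = ["raise", "lower", "hold", "bid"]
table = make_token_table(vocab)
weights = make_projection(n_features=3)
sequence = fuse(["raise", "bid"], [0.4, 0.7, 1.2], table, weights)
print(len(sequence), "vectors of width", len(sequence[0]))
```

In this sketch both modalities end up as vectors of the same width, so a downstream sequence model could attend over them together. Whether LBM-Act concatenates, adds, or otherwise combines the two embeddings is not stated in the abstract.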

Problem

Research questions and friction points this paper is trying to address.

auto-bidding

large language models

ad auctions

hallucination

generalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models

Auto-bidding

Hierarchical Architecture

Offline Reinforcement Learning

Multimodal Fusion
