Combee: Scaling Prompt Learning for Self-Improving Language Model Agents

📅 2026-04-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing prompt learning methods struggle to learn efficiently from many agent trajectories in highly parallel settings, and their quality degrades as parallelism increases. This work proposes Combee, a framework that combines parallel scans, an augmented shuffle mechanism, and a dynamic batch size controller to preserve prompt learning quality while substantially improving training efficiency. Combee targets large-scale parallel prompt learning for self-improving agents. On AppWorld, Terminal-Bench, Formula, and FiNER, it achieves up to 17× speedup over prior methods with comparable or better accuracy at equivalent cost.

📝 Abstract

Recent advances in prompt learning allow large language model agents to acquire task-relevant knowledge from inference-time context without parameter changes. For example, existing methods (like ACE or GEPA) can learn system prompts to improve accuracy based on previous agent runs. However, these methods primarily focus on single-agent or low-parallelism settings. This fundamentally limits their ability to efficiently learn from a large set of collected agentic traces. It would be efficient and beneficial to run prompt learning in parallel to accommodate the growing trend of learning from many agentic traces or parallel agent executions. Yet without a principled strategy for scaling, current methods suffer from quality degradation with high parallelism. To improve both the efficiency and quality of prompt learning, we propose Combee, a novel framework to scale parallel prompt learning for self-improving agents. Combee speeds up learning and enables running many agents in parallel while learning from their aggregate traces without quality degradation. To achieve this, Combee leverages parallel scans and employs an augmented shuffle mechanism; Combee also introduces a dynamic batch size controller to balance quality and delay. Evaluations on AppWorld, Terminal-Bench, Formula, and FiNER demonstrate that Combee achieves up to 17x speedup over previous methods with comparable or better accuracy and equivalent cost.
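
The abstract names parallel scans as one ingredient but does not say how Combee applies them. As a generic illustration only, and not the paper's algorithm, the sketch below shows a standard Hillis-Steele inclusive scan: given an associative merge operator, all prefix merges are computed in about log2(n) rounds, and the merges inside each round are independent of one another. The names `inclusive_scan` and `merge_lessons` are hypothetical, and representing each trace's learned context as a tuple of "lessons" is an assumption made purely for the example.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def inclusive_scan(items: List[T], combine: Callable[[T, T], T]) -> List[T]:
    """Hillis-Steele inclusive scan over an associative `combine`.

    Runs in ceil(log2(n)) rounds. Within a round, every combine reads only
    values from the previous round, so the combines could run concurrently.
    This sketch executes them sequentially for clarity.
    """
    result = list(items)
    step = 1
    while step < len(result):
        prev = result
        result = [
            prev[i] if i < step else combine(prev[i - step], prev[i])
            for i in range(len(prev))
        ]
        step *= 2
    return result


def merge_lessons(left: tuple, right: tuple) -> tuple:
    """Hypothetical associative merge: append lessons not already present."""
    return left + tuple(x for x in right if x not in left)


if __name__ == "__main__":
    # Assumed toy data: one tuple of lessons per agent trace.
    per_trace = [("a",), ("b", "a"), ("c",)]
    prefixes = inclusive_scan(per_trace, merge_lessons)
    print(prefixes[-1])  # merged lessons across all traces
```

The scan only gives correct prefixes when the merge operator is associative; whether and how Combee's prompt updates satisfy that property is not stated in this summary.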

Problem

Research questions and friction points this paper is trying to address.

prompt learning

language model agents

parallelism

self-improving agents

agentic traces

Innovation

Methods, ideas, or system contributions that make the work stand out.

parallel prompt learning

self-improving agents

agentic traces