Adaptive Confidence Gating in Multi-Agent Collaboration for Efficient and Optimized Code Generation

📅 2026-01-29

🤖 AI Summary

This work addresses the limitations of small language models in code generation tasks with complex logic, where constrained reasoning capabilities often lead to error loops and force a trade-off between efficiency and accuracy. The authors propose DebateCoder, a multi-agent collaborative framework that coordinates three specialized agents (User, Technical, and Quality Assurance) through three components: an adaptive confidence gating mechanism with a 95% threshold, a pre-generation debate, and a post-generation debugging loop. The debate and the debugging loop are orthogonal: one operates before code is written, the other after. Evaluated on HumanEval, DebateCoder achieves a 70.12% Pass@1 score, outperforming MapCoder while reducing API costs by approximately 35%, which makes it suited to resource-constrained settings.
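
For readers unfamiliar with the metric, the sketch below shows how a Pass@k score is commonly computed with the standard unbiased estimator, 1 - C(n-c, k) / C(n, k), averaged over problems. The summary does not say which sampling setup the authors used, so the function names and the one-sample-per-problem setting here are assumptions made for illustration only.

```python
from math import comb
from typing import List, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that pass all tests, k: budget.
    """
    if n - c < k:
        # Every size-k subset must contain at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: List[Tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over (n, c) pairs, one pair per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Assumed setting: one sample per problem on HumanEval's 164 problems.
# 115 solved problems then corresponds to the reported 70.12% Pass@1.
results = [(1, 1)] * 115 + [(1, 0)] * 49
score = benchmark_pass_at_k(results, k=1)
print(f"{score * 100:.2f}%")
```

With one sample per problem, Pass@1 reduces to the fraction of problems solved, so under that assumption the reported figure is consistent with 115 of 164 problems passing.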

📝 Abstract

While Large Language Models (LLMs) have catalyzed breakthroughs in automated code generation, Small Language Models (SLMs) often encounter reasoning bottlenecks and failure loops when addressing complex logical requirements. To overcome these challenges, we propose DebateCoder, a multi-agent collaborative framework designed to improve the reasoning ability of SLMs (e.g., Pangu-1B) in resource-constrained environments. DebateCoder uses a structured role-playing protocol with three agents: User Agent (A_UA), Technical Agent (A_TA), and Quality Assurance Agent (A_QA). It also includes an Adaptive Confidence Gating mechanism with a 95% threshold to balance accuracy and inference efficiency. In addition, we introduce a multi-turn deliberation module and a reviewer-guided analytical debugging loop for orthogonal pre-generation debate and post-generation refinement. Experiments on HumanEval and MBPP show that DebateCoder achieves 70.12% Pass@1 on HumanEval, outperforming MapCoder while reducing API overhead by about 35%. These results indicate that collaborative protocols can mitigate limitations of small-parameter models and provide a scalable, efficient approach to high-quality automated software engineering.
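
The gating idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract gives only the three agent roles and the 95% threshold, so the `Draft` type, the `gated_generate` function, the callback signatures, the point at which the gate is applied, and the way confidence is obtained are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

CONFIDENCE_THRESHOLD = 0.95  # the 95% threshold stated in the abstract
ROLES = ["A_UA", "A_TA", "A_QA"]  # User, Technical, Quality Assurance agents


@dataclass
class Draft:
    code: str
    confidence: float  # hypothetical self-reported score in [0, 1]


def gated_generate(
    task: str,
    draft_fn: Callable[[str], Draft],
    debate_fn: Callable[[str, str, Draft], str],
    debug_fn: Callable[[str, Draft, List[str]], str],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Tuple[str, int]:
    """Return (code, number_of_model_calls) for one task.

    Assumption: the gate skips the multi-agent path when the first
    draft is confident enough, which is where call savings would arise.
    """
    calls = 1
    draft = draft_fn(task)
    if draft.confidence >= threshold:
        return draft.code, calls  # confident: accept without debate

    # Low confidence: collect one note per agent role, then refine.
    notes = [debate_fn(role, task, draft) for role in ROLES]
    calls += len(ROLES)
    refined = debug_fn(task, draft, notes)
    calls += 1
    return refined, calls
```

In this sketch a confident draft costs one model call, while a low-confidence draft costs five, which illustrates how a gate of this kind could trade a small accuracy risk for fewer calls. The actual call counts and control flow in DebateCoder may differ.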

Problem

Research questions and friction points this paper is trying to address.

Small Language Models

reasoning bottlenecks

code generation

multi-agent collaboration

resource-constrained environments

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive Confidence Gating

Multi-Agent Collaboration

Small Language Models