🤖 AI Summary
This paper addresses two key challenges that large language models (LLMs) face in code generation, a prototypical System 2 reasoning task: (1) difficulty in modeling implicit, complex reasoning chains, and (2) poor generalization and robustness stemming from heterogeneous data distributions. To this end, we propose the BDC framework, featuring three novel components: (1) MC-Tree-Of-Agents, which integrates Monte Carlo Tree Search, reflective pruning, and multi-model mutual enhancement to enable verifiable, collaborative reasoning; (2) DisenLora, a disentanglement-based method that decomposes heterogeneous training data and constructs a composable LoRA expert library; and (3) an input-aware hypernetwork that dynamically weights and ensembles the LoRA experts into a customized solver for each input. Evaluated on HumanEval, MBPP, and cross-domain transfer benchmarks, BDC achieves state-of-the-art performance, significantly improving accuracy and robustness under few-shot, multi-distribution, and adversarial perturbation settings.
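To make the MC-Tree-Of-Agents idea concrete, the following is a minimal, self-contained sketch of Monte Carlo Tree Search with reflection-based pruning. It is not the paper's implementation: the `reflect` verifier and `rollout` reward are toy stand-ins for what would, per the summary, be LLM agents cross-checking partial programs; the state encoding and branching factor are arbitrary assumptions for illustration.

```python
import math

class Node:
    """One node in the search tree over partial solutions (state is a toy integer)."""
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0
        self.pruned = False  # set by the reflection step

def ucb(child, parent_visits, c=1.4):
    """Upper-confidence bound used during selection."""
    if child.visits == 0:
        return float("inf")
    return child.value / child.visits + c * math.sqrt(
        math.log(parent_visits) / child.visits
    )

def reflect(state):
    """Hypothetical reflection check: a verifier agent would flag unpromising
    partial programs; stubbed here with a toy divisibility rule."""
    return state % 7 != 0

def rollout(state):
    """Hypothetical mutual-verification reward: agents would cross-check a
    completed program against generated tests; stubbed with a toy score."""
    return 1.0 if state % 2 == 0 else 0.0

def mcts(root, iters=200):
    for _ in range(iters):
        node = root
        # Selection: descend via UCB, skipping children pruned by reflection.
        while node.children:
            live = [c for c in node.children if not c.pruned]
            if not live:
                break
            node = max(live, key=lambda c: ucb(c, node.visits + 1))
        # Expansion, with reflection-based pruning of new candidates.
        for action in range(3):
            child = Node(node.state * 3 + action, parent=node)
            child.pruned = not reflect(child.state)
            node.children.append(child)
        # Simulation + backpropagation.
        reward = rollout(node.state)
        while node:
            node.visits += 1
            node.value += reward
            node = node.parent
    return max(root.children, key=lambda c: c.visits)

best = mcts(Node(1))
```

The pruning step is the point of departure from vanilla MCTS: branches the verifier rejects never receive visits, so the search budget concentrates on candidates that survive reflection.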
📝 Abstract
Large language models (LLMs) have demonstrated remarkable capabilities across various domains, particularly in System 1 tasks, yet their problem-solving mechanisms in System 2 tasks remain insufficiently explored. Research on System2-to-System1 methods has recently surged, exploring System 2 reasoning knowledge via inference-time computation and compressing the explored knowledge into the System 1 process. In this paper, we focus on code generation, a representative System 2 task, and identify two primary challenges: (1) the complex hidden reasoning processes and (2) the heterogeneous data distributions that complicate the exploration and training of robust LLM solvers. To tackle these issues, we propose a novel BDC framework that explores insightful System 2 knowledge of LLMs using an MC-Tree-Of-Agents algorithm with mutual Boosting, Disentangles the heterogeneous training data into composable LoRA experts, and obtains a Customized problem solver for each data instance via an input-aware hypernetwork that weights over the LoRA experts, offering effectiveness, flexibility, and robustness. The framework leverages multiple LLMs through mutual verification and boosting, integrated into a Monte Carlo Tree Search process enhanced by reflection-based pruning and refinement. Additionally, we introduce the DisenLora algorithm, which clusters heterogeneous data to fine-tune LLMs into composable LoRA experts, enabling the adaptive generation of customized problem solvers through an input-aware hypernetwork. This work lays the groundwork for advancing LLM capabilities in complex reasoning tasks, offering a novel System2-to-System1 solution.
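The input-aware composition of LoRA experts described above can be sketched as follows. This is a minimal illustration, not the paper's architecture: the expert library, feature dimensions, and the single-linear-layer hypernetwork are all assumptions; a real system would attach the composed low-rank update to transformer weight matrices rather than a standalone matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_OUT, RANK, N_EXPERTS, D_FEAT = 16, 16, 4, 3, 8

# Hypothetical composable LoRA expert library: each expert is a low-rank
# (A_i, B_i) pair, so its weight update is B_i @ A_i.
experts = [
    (rng.standard_normal((RANK, D_IN)) * 0.01,   # A_i: (rank, d_in)
     rng.standard_normal((D_OUT, RANK)) * 0.01)  # B_i: (d_out, rank)
    for _ in range(N_EXPERTS)
]

# Hypothetical input-aware hypernetwork: a single linear map from an
# instance feature vector to softmax weights over the expert library.
W_hyper = rng.standard_normal((N_EXPERTS, D_FEAT))

def softmax(z):
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def customized_delta(x_feat):
    """Compose a per-instance LoRA update: delta_W = sum_i w_i(x) * B_i @ A_i."""
    w = softmax(W_hyper @ x_feat)
    delta = sum(wi * (B @ A) for wi, (A, B) in zip(w, experts))
    return w, delta

feat = rng.standard_normal(D_FEAT)          # stand-in for an input embedding
w, delta = customized_delta(feat)           # w: (N_EXPERTS,), delta: (D_OUT, D_IN)
```

Because each expert contributes only a rank-`RANK` term, the composed update stays cheap to form per instance, which is what makes the "customized solver per input" design practical.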