Uncertainty-Guided Chain-of-Thought for Code Generation with LLMs

📅 2025-03-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing large language models (LLMs) suffer from "overthinking" when applying Chain-of-Thought (CoT) reasoning to code generation, introducing computational redundancy and degrading accuracy on simple tasks. To address this, we propose an uncertainty-aware dynamic CoT inference mechanism that jointly quantifies uncertainty using both entropy and probability-gap metrics, embedding them directly into the decoding process to enable conditional CoT triggering and adaptive skipping. Our method eliminates unnecessary reasoning on low-difficulty tasks, improving PassRate by up to 6.1% on the MHPP benchmark, while significantly enhancing correctness on high-difficulty problems and reducing average token consumption. The core contribution is an uncertainty-driven CoT scheduling paradigm: the first end-to-end integration of uncertainty modeling with code-generation decoding.
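The two confidence signals mentioned in the summary can be computed directly from the model's next-token probability distribution. The sketch below is illustrative only; the function names and the exact normalization are assumptions, not the paper's implementation:

```python
import math

def entropy_uncertainty(probs):
    """Shannon entropy of the next-token distribution: higher = less confident."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def probability_gap_uncertainty(probs):
    """One minus the gap between the top-two token probabilities:
    a small gap (a near-tie) means high uncertainty."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    return 1.0 - (top1 - top2)

# A peaked distribution (confident) vs. a flat one (uncertain).
confident = [0.90, 0.05, 0.03, 0.02]
uncertain = [0.30, 0.28, 0.22, 0.20]

assert entropy_uncertainty(confident) < entropy_uncertainty(uncertain)
assert probability_gap_uncertainty(confident) < probability_gap_uncertainty(uncertain)
```

Both signals rise as the distribution flattens, which is what lets either one serve as a trigger for CoT-decoding.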

📝 Abstract
Chain-of-Thought (CoT) reasoning has been demonstrated as an effective technique for improving the problem-solving capabilities of large language models (LLMs) in the context of code generation. However, existing CoT methods often exhibit a tendency toward "overthinking", where the LLM consistently applies reasoning strategies without adequately considering the task's underlying complexity. This results in LLMs allocating excessive computational resources, in terms of tokens, to relatively simple tasks or problems where the correct answer is already evident. Additionally, this overthinking may lead LLMs down incorrect reasoning paths, resulting in incorrect code generation. In this paper, we introduce UnCertainty-Aware Chain-of-Thought (UnCert-CoT), an LLM-based approach designed to enhance code generation by incorporating an uncertainty-aware CoT reasoning mechanism, which focuses computational resources on points where LLMs are more prone to error. We propose two confidence-based uncertainty measures: Entropy-based and Probability Differential-based methods. When uncertainty is high, UnCert-CoT activates CoT-decoding to generate multiple reasoning paths and selects the final code that exhibits the highest likelihood of correctness. In contrast, when uncertainty is low, the LLM generates the code directly. This uncertainty judgment mechanism allows LLMs to prioritize complex tasks and avoid unnecessary steps in simpler cases, thereby improving overall efficiency and accuracy in code generation. Our experimental results demonstrate that UnCert-CoT significantly enhances code generation accuracy on the challenging MHPP (Mostly Hard Python Problems) benchmark, achieving improvements of up to 6.1% in PassRate, particularly in situations where traditional LLMs are prone to errors.
Problem

Research questions and friction points this paper is trying to address.

Addresses overthinking in LLMs during code generation
Introduces uncertainty-aware reasoning to improve accuracy
Enhances efficiency by focusing on error-prone tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uncertainty-aware Chain-of-Thought reasoning mechanism
Entropy-based and Probability Differential uncertainty measures
Dynamic CoT-decoding activation based on uncertainty levels
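The dynamic activation in the last bullet can be sketched as a simple routing rule: trigger multi-path CoT-decoding only when the top-two probability gap is small. The generator callables and the threshold below are toy stand-ins under assumed names, not the paper's decoding implementation:

```python
def probability_gap(probs):
    """Gap between the top-two next-token probabilities (large gap = confident)."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    return top1 - top2

def uncert_cot_generate(prompt, next_token_probs, direct_fn, cot_fn,
                        gap_threshold=0.3, n_paths=4):
    """If the model is unsure (small gap), sample n_paths CoT reasoning paths
    and keep the candidate code with the highest likelihood; otherwise
    generate the code directly without reasoning."""
    if probability_gap(next_token_probs) < gap_threshold:
        candidates = [cot_fn(prompt) for _ in range(n_paths)]
        # Each candidate is a (code, log_likelihood) pair; keep the most likely.
        return max(candidates, key=lambda c: c[1])[0]
    return direct_fn(prompt)

# Toy stand-ins for the two generation modes (hypothetical):
direct = lambda p: "direct-code"
cot = lambda p: ("cot-code", 0.0)

print(uncert_cot_generate("task", [0.90, 0.05, 0.05], direct, cot))  # direct-code
print(uncert_cot_generate("task", [0.40, 0.35, 0.25], direct, cot))  # cot-code
```

Routing on a cheap per-token statistic is what lets the method skip CoT entirely on easy prompts, which is where the token savings come from.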