Learning the Boundary of Solvability: Aligning LLMs to Detect Unsolvable Problems

📅 2025-12-01
📈 Citations: 0 (influential: 0)
🤖 AI Summary
Large language models (LLMs) frequently conflate logically unsolvable problems with limitations in their own capabilities, leading to hallucinations and unwarranted overconfidence. Method: We introduce UnsolvableQA, a benchmark of programmatically generated and reverse-engineered logical-contradiction instances, and UnsolvableRL, a reinforcement learning framework with a tripartite reward signal covering answer accuracy, unsolvability identification, and difficulty-aware calibration. Contribution/Results: We empirically uncover the "Capability Collapse" phenomenon: models trained without unsolvable instances become systematically overconfident, while explicit exposure to such instances mitigates overconfidence without compromising performance on solvable tasks. Experiments demonstrate near-perfect unsolvability detection (F1 > 0.99) and a +2.3% accuracy gain on solvable questions. The approach jointly strengthens model reliability and epistemic caution, enabling robust, self-aware decision-making under uncertainty.
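The summary names the three reward components but not their functional form. The minimal Python sketch below shows one plausible way such a tripartite reward could be combined; the weights (`w_acc`, `w_unsolv`, `w_diff`) and the difficulty-scaling term are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a tripartite reward in the spirit of UnsolvableRL.
# Weights and the calibration term below are assumptions for illustration.

def tripartite_reward(
    predicted_answer: str,
    gold_answer: str,
    predicted_unsolvable: bool,
    is_unsolvable: bool,
    difficulty: float,          # assumed normalized to [0, 1]
    w_acc: float = 1.0,         # assumed weighting
    w_unsolv: float = 1.0,
    w_diff: float = 0.5,
) -> float:
    # 1) Answer accuracy: reward correct answers on solvable instances.
    r_acc = float(not is_unsolvable
                  and not predicted_unsolvable
                  and predicted_answer == gold_answer)

    # 2) Unsolvability identification: reward flagging true contradictions,
    #    penalize refusing solvable problems or answering unsolvable ones.
    r_unsolv = 1.0 if predicted_unsolvable == is_unsolvable else -1.0

    # 3) Difficulty-aware calibration (assumed form): scale the payoff with
    #    difficulty, discouraging blanket refusals on merely hard problems.
    r_diff = difficulty if predicted_unsolvable == is_unsolvable else -difficulty

    return w_acc * r_acc + w_unsolv * r_unsolv + w_diff * r_diff
```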

📝 Abstract
Ensuring LLM reliability requires not only solving complex problems but also recognizing when a problem is unsolvable. Current models often struggle to distinguish objective unsolvability (inherent contradictions in the problem) from subjective capability limitations (problems beyond the model's competence), which leads to hallucinations and overconfidence. To address this, we propose UnsolvableQA and UnsolvableRL, which train models to solve feasible problems, detect inherent contradictions, and prudently refuse tasks beyond their capability. Specifically, we construct UnsolvableQA, a dataset of paired solvable and unsolvable instances derived via a dual-track methodology: programmatic generation for logic puzzles and a novel "Reverse Construction" method that injects contradictions into valid reasoning chains for mathematics. Building on this dataset, we introduce UnsolvableRL, a reinforcement learning framework with three reward components jointly accounting for accuracy, unsolvability, and difficulty. Empirical results show that our approach achieves near-perfect unsolvability detection while also improving accuracy on solvable tasks. Crucially, we identify Capability Collapse, demonstrating that explicit exposure to unsolvable data is indispensable for preventing models from becoming systematically overconfident. Our code and data are available at https://github.com/sfasfaffa/unsolvableQA.
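To make the "Reverse Construction" idea concrete, here is a toy sketch that turns a solvable system of equations into an unsolvable one by injecting a constraint that contradicts a fact entailed by the valid reasoning chain (here, y = 4). The template and injection rule are hypothetical; the paper applies the method to full mathematical reasoning chains.

```python
# Toy illustration of Reverse Construction: start from a solvable problem
# with a known solution, then inject a constraint that contradicts an
# intermediate fact of the valid reasoning chain. Hypothetical example only.

def make_pair() -> dict:
    # Solvable instance: x + y = 10 and x - y = 2  =>  x = 6, y = 4.
    solvable = "Find x and y such that x + y = 10 and x - y = 2."

    # The valid chain entails y = 4, so appending "y > 5" contradicts it
    # and makes the instance objectively unsolvable.
    unsolvable = solvable.rstrip(".") + " and y > 5."

    return {"solvable": solvable, "unsolvable": unsolvable}

print(make_pair())
```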
Problem

Research questions and friction points this paper is trying to address.

Align LLMs to detect unsolvable problems and distinguish contradictions from capability limits
Propose UnsolvableQA dataset and UnsolvableRL framework to improve detection and accuracy
Prevent systematic overconfidence in models by exposing them to unsolvable data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dataset with paired solvable and unsolvable instances (schema sketch after this list)
Reinforcement learning framework with three reward components
Explicit exposure to unsolvable data prevents overconfidence
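As referenced above, the following is a hypothetical sketch of what a paired UnsolvableQA record could look like. The field names and the difficulty scale are assumptions for illustration; only the solvable/unsolvable pairing and the unsolvability label come from the abstract.

```python
# Hypothetical record layout for one paired UnsolvableQA instance.
record = {
    "source": "math_reverse_construction",  # or "logic_programmatic" (assumed)
    "difficulty": 0.7,                       # assumed normalized difficulty
    "solvable": {
        "question": "Find x and y such that x + y = 10 and x - y = 2.",
        "answer": "x = 6, y = 4",
        "unsolvable": False,
    },
    "unsolvable": {
        "question": "Find x and y such that x + y = 10, x - y = 2, and y > 5.",
        "answer": None,  # correct behavior: flag the contradiction, not answer
        "unsolvable": True,
    },
}
```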
Authors

Dengyun Peng, Harbin Institute of Technology
Qiguang Chen, Harbin Institute of Technology (Chain-of-Thought, Reasoning, Multilingual LLM, Multi-modal LLM)
Bofei Liu, LARG, Research Center for Social Computing and Interactive Robotics, HIT
Jiannan Guan, LARG, Research Center for Social Computing and Interactive Robotics, HIT
Libo Qin, School of Computer Science and Engineering, Central South University
Zheng Yan, LARG, Research Center for Social Computing and Interactive Robotics, HIT
Jinhao Liu, Harbin Institute of Technology (Chain-of-Thought, Reasoning, Natural Language Processing)
Jianshu Zhang, iFLYTEK
Wanxiang Che, Professor, Harbin Institute of Technology (Natural Language Processing)