🤖 AI Summary
Ambiguous task formulations hinder the reasoning performance of large language models (LLMs) in Coq theorem proving. Method: To enhance task clarity, we propose (i) structured semantic context modeling, (ii) a selective concept expansion mechanism, and (iii) a Planner–Executor two-stage reasoning architecture; we further design a concept-level clarity metric for quantitative evaluation. Using DeepSeek-V3 as the base model, we integrate context augmentation and lightweight fine-tuning. Results: Evaluated on 1,386 theorems, our approach increases task clarity by 1.85× and boosts proof success rate from 21.8% to 45.8%, surpassing the prior SOTA Graph2Tac (33.2%). Our core contribution is the first systematic quantification and optimization of task clarity in formal theorem proving, establishing an interpretable, optimization-friendly paradigm for LLM-based formal reasoning.
📝 Abstract
In this work, we investigate whether improving task clarity can enhance the reasoning ability of large language models, focusing on theorem proving in Coq. We introduce a concept-level metric to evaluate task clarity and show that adding structured semantic context to the standard input used by modern LLMs leads to a 1.85× improvement in clarity score (44.5% → 82.3%). Using the general-purpose model DeepSeek-V3, our approach yields a 2.1× improvement in proof success (21.8% → 45.8%) and outperforms the previous state-of-the-art Graph2Tac (33.2%). We evaluate this on 1,386 theorems randomly sampled from 15 standard Coq packages, following the same evaluation protocol as Graph2Tac. Furthermore, fine-tuning smaller models on our structured data achieves even higher performance (48.6%). Our method uses selective concept unfolding to enrich task descriptions and employs a Planner–Executor architecture. These findings highlight the value of structured task representations in bridging the gap between understanding and reasoning.
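To make the two key ingredients concrete, the following is a minimal, hypothetical sketch of a concept-level clarity score and of selective concept unfolding. All names, the scoring rule (fraction of statement concepts whose definitions appear in the context), and the toy Coq definitions are illustrative assumptions, not the paper's actual metric or API.

```python
# Hypothetical sketch: concept-level clarity + selective concept unfolding.
# The scoring rule and all identifiers are illustrative assumptions.

def clarity(statement_concepts, context_definitions):
    """Fraction of concepts in the theorem statement whose definitions
    appear in the supplied context (0.0 = opaque, 1.0 = fully clear)."""
    if not statement_concepts:
        return 1.0
    defined = sum(1 for c in statement_concepts if c in context_definitions)
    return defined / len(statement_concepts)

def unfold_selectively(statement_concepts, global_definitions, context):
    """Add definitions only for concepts still missing from the context,
    keeping the prompt compact (selective rather than exhaustive unfolding)."""
    enriched = dict(context)
    for c in statement_concepts:
        if c not in enriched and c in global_definitions:
            enriched[c] = global_definitions[c]
    return enriched

# Toy example: a Coq-style theorem statement mentioning two concepts,
# only one of which is defined in the initial context.
concepts = ["even", "double"]
global_defs = {
    "even": "Fixpoint even (n : nat) : bool := ...",
    "double": "Definition double (n : nat) := n + n.",
}
ctx = {"even": global_defs["even"]}

print(clarity(concepts, ctx))            # 0.5 before unfolding
ctx = unfold_selectively(concepts, global_defs, ctx)
print(clarity(concepts, ctx))            # 1.0 after unfolding
```

The enriched context would then be serialized into the prompt, with the Planner proposing a proof outline and the Executor emitting concrete Coq tactics; that two-stage split is described only at a high level in the source.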