🤖 AI Summary
Manually writing program postconditions is time-consuming and error-prone, and existing approaches that prompt a large language model (LLM) once often yield inaccurate results. This work proposes SpecMind, a framework that treats the LLM as an interactive reasoner: it iteratively refines postconditions through a cognitively inspired, multi-turn feedback mechanism that integrates both explicit and implicit correctness signals. SpecMind also decides autonomously when to terminate refinement, based on its own assessment of optimality. Experimental evaluation shows that this approach significantly outperforms state-of-the-art methods in both the accuracy and completeness of the generated postconditions.
📝 Abstract
Specifications are vital for ensuring program correctness, yet writing them manually remains challenging and time-intensive. Recent large language model (LLM)-based methods have shown success in generating specifications such as postconditions, but existing single-pass prompting often yields inaccurate results. In this paper, we present SpecMind, a novel framework for postcondition generation that treats LLMs as interactive and exploratory reasoners rather than one-shot generators. SpecMind employs feedback-driven, multi-turn prompting, enabling the model to iteratively refine candidate postconditions by incorporating implicit and explicit correctness feedback, while autonomously deciding when to stop. Through these exploratory attempts, the process fosters deeper code comprehension and improves alignment with true program behavior. Our empirical evaluation shows that SpecMind significantly outperforms state-of-the-art approaches in both accuracy and completeness of generated postconditions.
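To make the idea of feedback-driven refinement concrete, here is a minimal Python sketch of such a loop. It is an illustration under stated assumptions, not SpecMind's implementation: the abstract does not describe the prompts, the feedback format, or the stopping criterion. The LLM is replaced by a fixed list of hypothetical candidates (`CANDIDATES`), explicit feedback is modeled as running each candidate against observed executions of a toy `program`, completeness is approximated by a crude probe that checks whether a candidate rejects a deliberately wrong output, and the stopping rule (stop at the first candidate passing both checks) stands in for the model's own decision to stop. Implicit feedback is not modeled.

```python
from typing import Callable, List, Optional, Tuple

# A postcondition is modeled as a predicate over (input, output).
Postcondition = Callable[[List[int], List[int]], bool]


def program(xs: List[int]) -> List[int]:
    """Toy program under specification: returns a sorted copy of xs."""
    return sorted(xs)


def explicit_feedback(post: Postcondition, inputs: List[List[int]]) -> Optional[str]:
    """Check the candidate against real executions of the program.

    Returns an error message for the first execution the candidate rejects,
    or None if the candidate holds on every observed execution.
    """
    for xs in inputs:
        if not post(xs, program(xs)):
            return f"candidate rejected a real execution: input={xs}"
    return None


def rejects_wrong_outputs(post: Postcondition, inputs: List[List[int]]) -> bool:
    """Crude completeness probe (an assumption of this sketch).

    A strong postcondition should reject an output that differs from the
    real one; here the wrong output is the real output reversed.
    """
    for xs in inputs:
        right = program(xs)
        wrong = list(reversed(right))
        if wrong != right and post(xs, wrong):
            return False
    return True


def refine(
    candidates: List[Tuple[str, Postcondition]],
    inputs: List[List[int]],
    max_turns: int = 5,
) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Try candidates turn by turn, recording the feedback each one receives.

    In an LLM-driven loop the recorded feedback would be sent back to the
    model to produce the next candidate; here the candidates are fixed.
    """
    history: List[Tuple[str, str]] = []
    for name, post in candidates[:max_turns]:
        error = explicit_feedback(post, inputs)
        if error is not None:
            history.append((name, error))
            continue
        if not rejects_wrong_outputs(post, inputs):
            history.append((name, "too weak: accepted a wrong output"))
            continue
        history.append((name, "accepted"))
        return name, history  # stand-in for the model deciding to stop
    return None, history


# Hypothetical candidates, ordered as a model might propose them.
CANDIDATES: List[Tuple[str, Postcondition]] = [
    ("unchanged", lambda xs, out: out == xs),                 # incorrect
    ("same_length", lambda xs, out: len(out) == len(xs)),     # correct but weak
    ("sorted_permutation",
     lambda xs, out: out == sorted(out) and sorted(out) == sorted(xs)),
]

INPUTS: List[List[int]] = [[3, 1, 2], [1, 2], [5, 4]]

if __name__ == "__main__":
    accepted, history = refine(CANDIDATES, INPUTS)
    for name, feedback in history:
        print(f"{name}: {feedback}")
    print("accepted:", accepted)
```

In this sketch the first candidate is rejected for accuracy (it fails on a real execution), the second for completeness (it accepts a wrong output), and the third passes both checks, which loosely mirrors the two evaluation dimensions named in the abstract.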