🤖 AI Summary
Language models frequently generate syntactically valid but semantically incorrect code; existing approaches offer only shallow syntactic constraints or brittle semantic encodings. Method: We propose the first systematic semantic-constrained decoding framework, modeling deep program properties (such as type safety and functional equivalence) as coinductive realizability problems solved uniformly over regular codata. By combining abstract program structure analysis with token-level constrained decoding, the framework tightly integrates large language model outputs with formal verification. Contribution/Results: This work establishes semantic-constrained decoding as a principled, programmable extension of language models. Evaluated across diverse code generation tasks, it achieves substantial improvements in functional correctness while preserving practical decoding efficiency.
📝 Abstract
Language models (LMs) can generate code, but cannot guarantee its correctness: their outputs often violate type safety, program invariants, or semantic equivalence. Constrained decoding offers a solution by restricting generation to programs that satisfy desired properties. Yet existing methods are limited to shallow syntactic constraints or rely on brittle, ad hoc encodings of semantics over token sequences.
We present ChopChop, the first programmable framework for semantic-constrained decoding, enabling LMs to generate code that provably satisfies rich semantic properties. ChopChop connects token-level generation with reasoning over abstract program structures via a coinduction-based formalism, reducing constraint enforcement to a realizability problem over regular codata. We demonstrate ChopChop's generality through generation constrained by type safety and program equivalence, showing how formal methods can be seamlessly integrated into LM-driven code generation. ChopChop turns semantic-constrained decoding from a niche technique into a systematic, principled extension of LMs, improving success rates across models and tasks while maintaining practical decoding latency.
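To make the core mechanism concrete, the sketch below shows what token-level constrained decoding looks like in the simplest case. This is not ChopChop's actual API or formalism: the "semantic" property is stubbed by a balanced-parentheses check standing in for a real prefix-viability oracle, the vocabulary is a toy, and `scores` is an assumed placeholder for LM logits. The idea it illustrates is the one from the abstract: at each step, tokens whose extension cannot be completed to a valid program are masked out before sampling.

```python
# Illustrative sketch of token-level constrained decoding (NOT ChopChop's API).
# A constraint is modeled as a viability predicate over token prefixes; the
# decoder masks any token whose extension could never reach a valid program.
# Balanced parentheses stand in for a richer semantic property here.

VOCAB = ["(", ")", "x", "<eos>"]

def viable_prefix(tokens):
    """True if the prefix can still be extended to a balanced string."""
    depth = 0
    for t in tokens:
        if t == "(":
            depth += 1
        elif t == ")":
            depth -= 1
            if depth < 0:  # closed more than opened: no completion exists
                return False
    return True

def complete(tokens):
    """True if the sequence is already a fully balanced string."""
    depth = sum(1 if t == "(" else -1 if t == ")" else 0 for t in tokens)
    return depth == 0

def constrained_greedy_decode(scores, max_len=8):
    """Greedy decoding under the constraint.

    `scores(prefix, token) -> float` is a hypothetical stand-in for LM
    logits. Each step keeps only tokens that preserve prefix viability
    (and allows <eos> only once the output is complete), then picks the
    highest-scoring survivor.
    """
    out = []
    for _ in range(max_len):
        allowed = []
        for t in VOCAB:
            if t == "<eos>":
                if complete(out):
                    allowed.append(t)
            elif viable_prefix(out + [t]):
                allowed.append(t)
        best = max(allowed, key=lambda t: scores(out, t))
        if best == "<eos>":
            break
        out.append(best)
    return out
```

Even with a scorer that strongly prefers an ill-formed continuation (e.g. a stray `)` at the start), the mask guarantees every emitted prefix remains viable and the final output satisfies the property, which is exactly the guarantee that shallow post-hoc filtering cannot give.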