🤖 AI Summary
Existing large language models show weak type reasoning during code generation, which compromises the type correctness of the programs they synthesize. To address this, we propose TyFlow, a type-guided program synthesis framework that establishes, for the first time, a structural isomorphism between type derivation trees and synthesis derivation trees, thereby internalizing the complexity of the type system into the code representation itself. TyFlow replaces conventional token-level autoregressive generation with sequences of synthesis decisions, freeing the model to focus on high-level semantics and type constraints. The approach eliminates type errors entirely and significantly improves functional correctness across multiple benchmarks, demonstrating the value of a deep synergy between formal type systems and neural language models. The core innovations are (1) modeling the type-synthesis structural isomorphism and (2) a flow-based, type-aware generation paradigm.
📝 Abstract
Language models have shown remarkable proficiency in code generation; nevertheless, ensuring type correctness remains a challenge. Although traditional methods such as constrained decoding alleviate this problem by externally rejecting untypable code, the model itself never learns type reasoning, which ultimately limits its performance. This paper introduces TyFlow, a novel system that internalizes type reasoning within code generation, guiding the model to learn the type system. The core of our approach is a novel type-guided program synthesis system that maintains an isomorphism between type derivation trees and synthesis derivation trees, enabling a new code representation based on sequences of synthesis decisions rather than traditional text-based token sequences. By offloading the complexity of learning the type system onto the representation itself, models can redirect their capacity toward higher-level program semantics. Our evaluation shows that TyFlow not only eliminates type errors but also significantly improves functional correctness, highlighting the importance of internally aligning LMs with type systems.
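To make the core idea concrete, the sketch below illustrates (in Python, with a hypothetical typing context and a toy expression language that are not from the paper) what "synthesis decision sequences" mean: instead of emitting free-form tokens, the generator picks typing rules top-down, so every choice point mirrors a node in a type derivation tree and every completed program is well-typed by construction. TyFlow's actual system and language are more elaborate; this only shows the enumeration skeleton.

```python
# Hypothetical sketch: type-guided enumeration of expressions.
# Each yield corresponds to a "synthesis decision" (which typing
# rule to apply), mirroring the shape of a type derivation tree.

CONTEXT = {                     # hypothetical typing context: name -> type
    "x": "int",
    "s": "str",
    "len": ("str", "int"),      # function type str -> int
    "inc": ("int", "int"),      # function type int -> int
}

def synthesize(goal, depth=2):
    """Yield expressions (as strings) whose type is `goal`.

    Decision 1 (Var rule): pick a variable whose type matches the goal.
    Decision 2 (App rule): pick a function whose result type matches,
    then recursively synthesize an argument of its parameter type.
    Programs with the wrong type are never even representable here.
    """
    for name, ty in CONTEXT.items():
        if ty == goal:                                  # Var rule
            yield name
    if depth == 0:
        return
    for name, ty in CONTEXT.items():
        if isinstance(ty, tuple) and ty[1] == goal:     # App rule
            for arg in synthesize(ty[0], depth - 1):
                yield f"{name}({arg})"

print(sorted(synthesize("int")))
# Every result is an int-typed expression; the str-typed "s" alone
# can only appear as the argument of "len".
```

In a neural setting, the model would score the candidate rules at each choice point rather than enumerate them exhaustively, but the guarantee is the same: the representation makes ill-typed output unreachable.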