Learning to Guarantee Type Correctness in Code Generation through Type-Guided Program Synthesis

📅 2025-10-11
🤖 AI Summary
Existing large language models reason poorly about types during code generation, which compromises the type correctness of the programs they produce. To address this, we propose TyFlow, a type-guided program synthesis framework that maintains a structural isomorphism between type derivation trees and synthesis derivation trees, thereby moving the complexity of the type system into the code representation. TyFlow replaces conventional token-level autoregressive generation with sequences of synthesis decisions, so the model can focus on higher-level program semantics. In the reported evaluation, the approach eliminates type errors and significantly improves functional correctness across multiple benchmarks, which suggests that aligning language models with formal type systems internally is beneficial. The core contributions are (1) a synthesis system whose derivations are isomorphic to type derivations and (2) a code representation based on synthesis decision sequences.

📝 Abstract
Language models have shown remarkable proficiency in code generation; nevertheless, ensuring type correctness remains a challenge. Although traditional methods, such as constrained decoding, alleviate this problem by externally rejecting untypable code, the model itself does not effectively learn type reasoning internally, which ultimately limits its overall performance. This paper introduces TyFlow, a novel system that internalizes type reasoning within code generation to guide the model to learn the type system. The core of our approach is a novel type-guided program synthesis system that maintains an isomorphism between type derivation trees and synthesis derivation trees, enabling a new code representation based on synthesis decision sequences rather than traditional text-based token sequences. By offloading the complexity of type system learning to the representation itself, models can redirect their computational resources toward higher-level program semantics. Our evaluation shows that TyFlow not only eliminates type errors but also significantly improves functional correctness, highlighting the importance of aligning LMs with type systems internally.
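
The idea of representing a program as a sequence of synthesis decisions can be illustrated with a toy example. The sketch below is not TyFlow's actual system: the two-type expression language, the `RULES` table, and the `decode`, `type_of`, and `render` helpers are all hypothetical names chosen for illustration. It assumes that each synthesis rule mirrors exactly one typing rule, so every node built by `decode` corresponds to one node of the typing derivation, and any decision sequence that decodes completely yields a well-typed term.

```python
from typing import Iterator

# Hypothetical toy rules (an assumption, not taken from the paper):
# goal type -> list of (rule name, goal types of the premises).
RULES = {
    "int": [("0", []), ("1", []), ("add", ["int", "int"]),
            ("if", ["bool", "int", "int"])],
    "bool": [("true", []), ("false", []), ("lt", ["int", "int"]),
             ("if", ["bool", "bool", "bool"])],
}


def decode(goal: str, decisions: Iterator[int]):
    """Build a term of type `goal`, consuming one decision per derivation node."""
    name, premises = RULES[goal][next(decisions)]
    return (name, [decode(p, decisions) for p in premises])


def type_of(term) -> str:
    """Independent type checker, used only to confirm the decoded term's type."""
    name, args = term
    kinds = [type_of(a) for a in args]
    if name in ("0", "1"):
        return "int"
    if name in ("true", "false"):
        return "bool"
    if name in ("add", "lt") and kinds == ["int", "int"]:
        return "int" if name == "add" else "bool"
    if name == "if" and kinds[0] == "bool" and kinds[1] == kinds[2]:
        return kinds[1]
    raise TypeError(f"ill-typed term: {term}")


def render(term) -> str:
    """Pretty-print a term as text."""
    name, args = term
    parts = [render(a) for a in args]
    if name == "add":
        return f"({parts[0]} + {parts[1]})"
    if name == "lt":
        return f"({parts[0]} < {parts[1]})"
    if name == "if":
        return f"(if {parts[0]} then {parts[1]} else {parts[2]})"
    return name


# A decision sequence for goal type "int": each number indexes RULES[goal].
term = decode("int", iter([3, 2, 0, 1, 2, 1, 1, 0]))
print(render(term))   # (if (0 < 1) then (1 + 1) else 0)
print(type_of(term))  # int
```

Because each decision indexes only the rules applicable to the current goal type, an ill-typed term such as `(true + 1)` has no encoding in this representation at all, in contrast to generating it as text and rejecting it afterwards.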
Problem

Research questions and friction points this paper is trying to address.

Guaranteeing type correctness in neural code generation systems
Internalizing type reasoning to replace external constraint checks
Improving functional correctness through type-guided program synthesis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Type-guided program synthesis for code generation
Isomorphism between type and synthesis derivation trees
Code representation using synthesis decision sequences
Authors
Zhechong Huang
Peking University, China
Zhao Zhang
Peking University, China
Ruyi Ji
University of Michigan
Program Synthesis
Tingxuan Xia
Peking University, China
Qihao Zhu
Peking University
Software Engineering
Qinxiang Cao
Shanghai Jiao Tong University, China
Zeyu Sun
Institute of Software, Chinese Academy of Sciences, China
Yingfei Xiong
Associate Professor, Peking University
Software Engineering, Programming Languages, Program Repair, Program Synthesis, Program Analysis