TypePro: Boosting LLM-Based Type Inference via Inter-Procedural Slicing

2026-04-02
AI Summary
This work addresses the challenge of limited type inference accuracy in dynamically typed languages due to the absence of type annotations. The authors propose a novel approach that integrates large language models with interprocedural program slicing: by extracting cross-function contextual information through slicing, the method constructs structured candidate complex types, effectively compensating for the model's insufficient domain knowledge and enhancing contextual awareness. This is the first study to synergistically combine interprocedural slicing with large language models for type inference. Evaluated on the ManyTypes4Py and ManyTypes4TypeScript datasets, the approach achieves Top-1 exact match accuracies of 88.9% and 86.6%, respectively, representing improvements of 7.1 and 10.3 percentage points over the current state-of-the-art methods.
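To make the reported metric concrete, here is a minimal sketch of Top-1 exact match and of what a gain stated in percentage points implies. The function names and the string-based comparison are assumptions made for illustration; the benchmarks' actual normalization rules are not described on this page.

```python
def top1_exact_match(predictions, references):
    """Fraction of items whose top-ranked predicted type equals the reference.

    Assumption: types are compared as whitespace-stripped strings. A real
    benchmark may normalize aliases (for example List vs list) differently.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        raise ValueError("at least one example is required")
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


def implied_baseline(score_pct, gain_pp):
    """Score the compared method would have, given a gain in percentage points."""
    return round(score_pct - gain_pp, 1)


# Toy data, not taken from either dataset.
preds = ["int", "List[str]", "dict"]
refs = ["int", "List[str]", "Dict[str, int]"]
print(f"Top-1 EM: {top1_exact_match(preds, refs):.1%}")

# Percentage points are absolute differences, not relative improvements.
print(implied_baseline(88.9, 7.1), implied_baseline(86.6, 10.3))
```

Under this reading, the third toy prediction counts as a miss because `dict` is not an exact match for `Dict[str, int]`, which is why exact match is a strict metric for complex types.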
Abstract
Dynamic languages such as Python and JavaScript offer flexibility and simplified type handling, but this can also increase type-related errors and add overhead for compile-time type inference. As a result, type inference for dynamic languages has become a popular research area. Existing approaches typically perform type inference through static analysis, machine learning, or large language models (LLMs). However, existing work uses only the direct dependencies of the target variable as context, so the contextual information is incomplete and inference accuracy suffers. To address this issue, this paper proposes TypePro, a method that leverages LLMs for type inference in dynamic languages. TypePro supplements the context through inter-procedural code slicing. It then proposes a set of candidate complex types based on the structural information about data types implied by the slices, compensating for the LLM's lack of domain knowledge. Experiments on the ManyTypes4Py and ManyTypes4TypeScript datasets yield Top-1 exact match (EM) rates of 88.9% and 86.6%, respectively. Notably, TypePro improves Top-1 EM by 7.1 and 10.3 percentage points over the second-best approach, demonstrating its effectiveness and robustness.
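The idea of supplementing context across function boundaries can be illustrated with a toy sketch. This is not TypePro's slicing algorithm, which the abstract does not detail. It is a one-hop approximation, written with Python's standard `ast` module (`ast.unparse` needs Python 3.9+), that collects for one parameter its uses in the target function, the assignments feeding it at call sites, and the return statements of the function that produced the value. The helper names `mentions` and `slice_for_parameter` and the sample module are hypothetical, and the candidate-type step is not sketched.

```python
import ast

# Hypothetical module: the type of `cfg` is only visible via main() and load_config().
SOURCE = '''
def load_config(path):
    with open(path) as fh:
        return {"name": fh.readline().strip(), "retries": 3}

def describe(cfg):
    return cfg["name"].upper()

def main():
    settings = load_config("app.ini")
    print(describe(settings))
'''


def mentions(node, name):
    """True if the subtree rooted at node references the identifier name."""
    return any(isinstance(n, ast.Name) and n.id == name for n in ast.walk(node))


def slice_for_parameter(source, func_name, param):
    """Collect statements relevant to the type of param (toy, one hop only)."""
    tree = ast.parse(source)
    funcs = {n.name: n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)}
    target = funcs[func_name]
    position = [a.arg for a in target.args.args].index(param)
    lines = []

    # 1. Intra-procedural context: top-level statements of the target using param.
    for stmt in target.body:
        if mentions(stmt, param):
            lines.append(f"{func_name}: {ast.unparse(stmt)}")

    # 2. Inter-procedural context: what callers pass in that position.
    for caller in funcs.values():
        for node in ast.walk(caller):
            is_call = (isinstance(node, ast.Call)
                       and isinstance(node.func, ast.Name)
                       and node.func.id == func_name)
            if not is_call or position >= len(node.args):
                continue
            arg = node.args[position]
            if not isinstance(arg, ast.Name):
                lines.append(f"{caller.name}: {ast.unparse(arg)}")
                continue
            for stmt in caller.body:
                assigns_arg = isinstance(stmt, ast.Assign) and any(
                    isinstance(t, ast.Name) and t.id == arg.id for t in stmt.targets)
                if not assigns_arg:
                    continue
                lines.append(f"{caller.name}: {ast.unparse(stmt)}")
                # 3. Follow the value into the function that produced it.
                value = stmt.value
                if (isinstance(value, ast.Call) and isinstance(value.func, ast.Name)
                        and value.func.id in funcs):
                    producer = funcs[value.func.id]
                    for ret in ast.walk(producer):
                        if isinstance(ret, ast.Return) and ret.value is not None:
                            lines.append(f"{producer.name}: {ast.unparse(ret)}")
    return lines


for line in slice_for_parameter(SOURCE, "describe", "cfg"):
    print(line)
```

On this sample, the slice for `cfg` contains the subscript use in `describe`, the assignment in `main`, and the dictionary literal returned by `load_config`. The last item is exactly the kind of structural hint that direct dependencies inside `describe` would miss, and it is the sort of context one could place in an LLM prompt.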
Problem

Research questions and friction points this paper is trying to address.

type inference
dynamic languages
contextual information
inter-procedural slicing
large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Type Inference
Large Language Models
Inter-Procedural Slicing
Dynamic Languages
Code Context Enhancement
Teyu Lin
Xiamen University, China
Minghao Fan
Xiamen University, China
Huaxun Huang
Xiamen University, China
Zhirong Shen
Xiamen University
storage systems, storage dependability, erasure coding
Rongxin Wu
Xiamen University
software security, program analysis, software engineering