AI Summary
This work addresses the challenge of limited type inference accuracy in dynamically typed languages due to the absence of type annotations. The authors propose a novel approach that integrates large language models with interprocedural program slicing: by extracting cross-function contextual information through slicing, the method constructs structured candidate complex types, effectively compensating for the model's insufficient domain knowledge and enhancing contextual awareness. This is the first study to synergistically combine interprocedural slicing with large language models for type inference. Evaluated on the ManyTypes4Py and ManyTypes4TypeScript datasets, the approach achieves Top-1 exact match accuracies of 88.9% and 86.6%, respectively, representing improvements of 7.1 and 10.3 percentage points over the current state-of-the-art methods.
Abstract
Dynamic languages (such as Python and JavaScript) offer flexibility and simplified type handling, but this flexibility can also increase type-related errors and add overhead for compile-time type inference. As a result, type inference for dynamic languages has become a popular research area. Existing approaches typically perform type inference through static analysis, machine learning, or large language models (LLMs). However, existing work uses only the direct dependencies of the target variables as context, which leaves the contextual information incomplete and reduces the accuracy of type inference. To address this issue, this paper proposes TypePro, a method that leverages LLMs for type inference in dynamic languages. TypePro supplements the contextual information by performing inter-procedural code slicing. It then proposes a set of candidate complex types based on the structural information about data types implied in the slices, thereby compensating for the LLMs' lack of domain knowledge. We conducted experiments on the ManyTypes4Py and ManyTypes4TypeScript datasets, achieving Top-1 exact match (EM) rates of 88.9% and 86.6%, respectively. Notably, TypePro improves Top-1 EM by 7.1 and 10.3 percentage points over the second-best approach on the two datasets, demonstrating its effectiveness and robustness.
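To make the idea of cross-function context concrete, the following is a minimal sketch of gathering context for one function parameter beyond its own body. It is not TypePro's implementation: the helper name `slice_for_parameter`, the sample program in `SOURCE`, and the one-hop backward search are all assumptions chosen for illustration, and a real inter-procedural slicer would follow data and control dependencies far more thoroughly.

```python
import ast

# Hypothetical sample program: the type of `raw` in parse_port is only
# evident from how main() produces the value it passes in.
SOURCE = '''
def load_config(path):
    with open(path) as fh:
        return fh.read()

def parse_port(raw):
    return int(raw.strip())

def main():
    text = load_config("app.conf")
    port = parse_port(text)
    return port
'''


def slice_for_parameter(source, target, param_index=0):
    """Collect the target function plus, for each call site in another
    function, the call itself and any assignment that defines the argument."""
    tree = ast.parse(source)
    functions = {
        node.name: node
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    }
    # Start with the direct context: the definition of the target function.
    lines = [ast.get_source_segment(source, functions[target])]
    for name, func in functions.items():
        if name == target:
            continue
        for call in ast.walk(func):
            is_target_call = (
                isinstance(call, ast.Call)
                and isinstance(call.func, ast.Name)
                and call.func.id == target
            )
            if not is_target_call or len(call.args) <= param_index:
                continue
            arg = call.args[param_index]
            # One hop backward: assignments in the caller that define the argument.
            if isinstance(arg, ast.Name):
                for stmt in ast.walk(func):
                    if isinstance(stmt, ast.Assign) and any(
                        isinstance(t, ast.Name) and t.id == arg.id
                        for t in stmt.targets
                    ):
                        segment = ast.get_source_segment(source, stmt)
                        lines.append(f"# in {name}: {segment}")
            call_text = ast.get_source_segment(source, call)
            lines.append(f"# in {name}: {call_text}")
    return lines


context_lines = slice_for_parameter(SOURCE, "parse_port")
print("\n".join(context_lines))
```

In this sketch the collected lines show that `raw` receives the result of `load_config`, which returns the contents of a file, so a prompt built from them carries evidence for `str` that the body of `parse_port` alone only hints at.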