🤖 AI Summary
C compilers process one file at a time and therefore see only incomplete programs, while state-of-the-art points-to analyses guarantee sound solutions only for complete programs. This paper proposes an Andersen-style points-to analysis that is both sound and efficient on incomplete C programs. It makes two main contributions: (1) implicit pointee tracking, which records in the constraint graph itself which memory locations and pointers are accessible from external modules, instead of representing them as explicit pointees; and (2) the Prefer Implicit Pointees (PIP) technique, which further reduces the use of explicit pointees. In the evaluation, implicit pointee tracking makes the constraint solver 15× faster than any combination of five state-of-the-art techniques that use explicit pointee tracking, and PIP adds a further 1.9× speedup over the fastest solver configuration without it. With an alias-analysis client, the analysis yields 40% fewer MayAlias responses than LLVM's BasicAA pass alone, and its memory use scales well enough for optimizing compilers in practice.
📝 Abstract
Compiling files individually lends itself well to parallelization, but forces the compiler to operate on incomplete programs. State-of-the-art points-to analyses guarantee sound solutions only for complete programs, requiring summary functions to describe any missing program parts. Summary functions are rarely available in production compilers, however, where soundness and efficiency are non-negotiable. This paper presents an Andersen-style points-to analysis that efficiently produces sound solutions for incomplete C programs. The analysis accomplishes soundness by tracking memory locations and pointers that are accessible from external modules, and efficiency by performing this tracking implicitly in the constraint graph. We show that implicit pointee tracking makes the constraint solver 15× faster than any combination of five different state-of-the-art techniques using explicit pointee tracking. We also present the Prefer Implicit Pointees (PIP) technique that further reduces the use of explicit pointees. PIP gives an additional speedup of 1.9×, compared to the fastest solver configuration not benefiting from PIP. The precision of the analysis is evaluated in terms of an alias-analysis client, where it reduces the number of MayAlias-responses by 40% compared to LLVM's BasicAA pass alone. Finally, we show that the analysis is scalable in terms of memory, making it suitable for optimizing compilers in practice.
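To make the idea of implicit pointee tracking concrete, the following is a minimal sketch of an Andersen-style solver in which externally accessible memory is represented by per-node flags rather than by explicit points-to set members. It is an illustration under stated assumptions, not the paper's algorithm: the class `Solver`, the flag sets `ext` and `escaped`, the marker `EXTERNAL`, and the naive fixpoint loop are all hypothetical simplifications, and the sketch omits PIP and the solver optimizations the paper evaluates.

```python
from collections import defaultdict

EXTERNAL = "<external>"  # hypothetical marker for memory owned by unseen modules


class Solver:
    """Toy Andersen-style solver; names and structure are illustrative only."""

    def __init__(self):
        self.pts = defaultdict(set)  # explicit pointees of each variable
        self.ext = set()             # variables that may point to any escaped/external memory
        self.escaped = set()         # locations accessible from external modules
        self.constraints = []        # (kind, lhs, rhs)

    def addr_of(self, p, x): self.pts[p].add(x)                   # p = &x
    def copy(self, p, q): self.constraints.append(("copy", p, q))    # p = q
    def load(self, p, q): self.constraints.append(("load", p, q))    # p = *q
    def store(self, p, q): self.constraints.append(("store", p, q))  # *p = q
    def escape(self, x): self.escaped.add(x)       # e.g. an exported global
    def from_external(self, p): self.ext.add(p)    # e.g. result of an external call

    def _flow(self, dst, src):
        # dst ⊇ src, for explicit pointees and for the implicit flag alike.
        self.pts[dst] |= self.pts[src]
        if src in self.ext:
            self.ext.add(dst)

    def _size(self):
        return (sum(len(s) for s in self.pts.values()), len(self.ext), len(self.escaped))

    def solve(self):
        # Naive fixpoint iteration; a real solver would use a worklist.
        while True:
            before = self._size()
            for v in list(self.escaped):
                self.ext.add(v)               # external code may write any known address into v
                self.escaped |= self.pts[v]   # external code may read v and follow its pointees
            for kind, a, b in self.constraints:
                if kind == "copy":
                    self._flow(a, b)
                elif kind == "load":
                    for x in list(self.pts[b]):
                        self._flow(a, x)
                    if b in self.ext:         # loading from unknown memory yields unknown pointers
                        self.ext.add(a)
                elif kind == "store":
                    for x in list(self.pts[a]):
                        self._flow(x, b)
                    if a in self.ext:         # storing into unknown memory leaks b's pointees
                        self.escaped |= self.pts[b]
            if self._size() == before:
                return

    def points_to(self, p):
        # Implicit pointees are materialized only when a client asks.
        result = set(self.pts[p])
        if p in self.ext:
            result |= self.escaped | {EXTERNAL}
        return result

    def may_alias(self, p, q):
        return bool(self.points_to(p) & self.points_to(q))


# Models: int *g = &a (exported); static int *local = &b; int **r = get(); int *q = *r;
s = Solver()
s.addr_of("g", "a")
s.escape("g")
s.addr_of("local", "b")
s.from_external("r")
s.load("q", "r")
s.solve()
```

In this sketch, `q` never stores the escaped locations `g` and `a` explicitly; they appear only when `points_to("q")` is queried. The non-escaping `b` stays out of that set, so `s.may_alias("q", "local")` is false even though `q` holds a value obtained from an unknown module.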