LAMeD: LLM-generated Annotations for Memory Leak Detection

πŸ“… 2025-05-05
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Existing static analyzers for memory leak detection suffer from low efficiency and poor scalability because they rely heavily on manual labeling of functions as "sources" or "sinks." To address this, the paper proposes the first LLM-based method for automated function-level semantic labeling. Leveraging prompt engineering and fine-grained understanding of function behavior, the approach has large language models generate source/sink annotations directly consumable by industrial-grade analyzers (e.g., Cooddy), eliminating manual effort and supporting zero-shot generalization to third-party libraries. By incorporating semantically grounded labels, the method mitigates path explosion and improves data-flow tracking precision. Evaluated on real-world C/C++ projects, it achieves a 37% increase in memory leak detection rate, a 29% reduction in false positives, and cuts annotation time from hours to seconds. This work establishes an efficient, scalable, semantics-driven paradigm for static analysis.
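To make the source/sink idea concrete, here is a minimal C sketch (the wrapper names are illustrative, not taken from the paper). An analyzer that does not know `buf_alloc` allocates and `buf_free` releases cannot see the leak on the early-return path; the annotations LAMeD generates are meant to supply exactly that knowledge.

```c
#include <stdlib.h>

/* Hypothetical project-specific wrappers. An analyzer only tracks the leak
 * below if it is told that buf_alloc is a "source" (memory enters the
 * program) and buf_free is a "sink" (memory leaves it). */
static void *buf_alloc(size_t n) {
    return malloc(n);           /* allocation hidden behind a wrapper */
}

static void buf_free(void *p) {
    free(p);
}

int process(int fail) {
    char *buf = buf_alloc(64);  /* source: memory acquired here */
    if (buf == NULL)
        return -1;
    if (fail)
        return -1;              /* leak: early return skips buf_free */
    buf_free(buf);              /* sink: memory released here */
    return 0;
}
```

With source/sink labels for the two wrappers, a path-sensitive analyzer can report the `fail` branch as a leak without inlining or re-analyzing the wrapper bodies, which is what keeps path explosion in check.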


πŸ“ Abstract
Static analysis tools are widely used to detect software bugs and vulnerabilities but often struggle with scalability and efficiency in complex codebases. Traditional approaches rely on manually crafted annotations -- labeling functions as sources or sinks -- to track data flows, e.g., to ensure that allocated memory is eventually freed. Code analysis tools such as CodeQL, Infer, or Cooddy can use such function specifications, but manual annotation is laborious and error-prone, especially for large or third-party libraries. We present LAMeD (LLM-generated Annotations for Memory leak Detection), a novel approach that leverages large language models (LLMs) to automatically generate function-specific annotations. When integrated with analyzers such as Cooddy, LAMeD significantly improves memory leak detection and reduces path explosion. We also suggest directions for extending LAMeD to broader code analysis.
Problem

Research questions and friction points this paper is trying to address.

- Automating memory leak detection annotations using LLMs
- Reducing manual effort in static analysis annotation
- Improving scalability of code analysis tools
Innovation

Methods, ideas, or system contributions that make the work stand out.

- Uses LLMs to generate function annotations
- Improves memory leak detection accuracy
- Reduces path explosion in static analysis
πŸ”Ž Similar Papers
No similar papers found.