Identifying and Mitigating API Misuse in Large Language Models

πŸ“… 2025-03-28
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
API misuse in code generated by large language models (LLMs) is prevalent and often leads to functional defects and security vulnerabilities. Method: This paper introduces the first taxonomy of LLM-specific API misuse, comprising four categories, and proposes Dr.Fix, an intent-aligned, automated repair framework. Dr.Fix departs from conventional rule-based and static-analysis approaches by combining large-scale human annotation, misuse-pattern mining, and LLM-powered program repair. Contribution/Results: Evaluated on Python and Java across two widely used LLMs (StarCoder-7B and GitHub Copilot), Dr.Fix improves repair quality by up to 40 percentage points in exact match rate and up to 38.4 BLEU points over state-of-the-art baselines. This work establishes the first systematic, LLM-oriented API misuse classification and delivers a scalable, intent-aware repair solution for improving the safety and reliability of LLM-generated code.

πŸ“ Abstract
API misuse in code generated by large language models (LLMs) represents a serious emerging challenge in software development. While LLMs have demonstrated impressive code generation capabilities, their interactions with complex library APIs remain highly prone to errors, potentially leading to software failures and security vulnerabilities. This paper presents the first comprehensive study of API misuse patterns in LLM-generated code, analyzing both method selection and parameter usage across Python and Java. Through extensive manual annotation of 3,892 method-level and 2,560 parameter-level misuses, we develop a novel taxonomy of four distinct API misuse types specific to LLMs, which significantly differ from traditional human-centric misuse patterns. Our evaluation of two widely used LLMs, StarCoder-7B (open-source) and Copilot (closed-source), reveals significant challenges in API usage, particularly in areas of hallucination and intent misalignment. We propose Dr.Fix, a novel LLM-based automatic program repair approach for API misuse based on the aforementioned taxonomy. Our method substantially improves repair accuracy for real-world API misuse, demonstrated by increases of up to 38.4 points in BLEU scores and 40 percentage points in exact match rates across different models and programming languages. This work provides crucial insights into the limitations of current LLMs in API usage and presents an effective solution for the automated repair of API misuse in LLM-generated code.
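To make the misuse categories concrete, here is a minimal illustrative sketch (not taken from the paper; the specific calls are my own examples) of two patterns the abstract highlights, hallucination and intent misalignment, in Python:

```python
import json

# Hallucination: the model invents a method the library does not have.
# buggy = json.parse('{"a": 1}')    # AttributeError: json has no attribute 'parse'
fixed = json.loads('{"a": 1}')      # correct stdlib call for deserialization

# Intent misalignment: a real API is called, but not the one the task needs,
# e.g. serializing when the intent was to deserialize.
# buggy = json.dumps('{"a": 1}')    # returns a re-quoted string, not a dict

assert fixed == {"a": 1}
```

A repair framework in the style of Dr.Fix would map a detected misuse to its taxonomy category and then propose the intent-aligned replacement, as in the `json.loads` fix above.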
Problem

Research questions and friction points this paper is trying to address.

Study API misuse patterns in LLM-generated code
Identify unique LLM-specific API misuse types
Develop automated repair for LLM API misuse
Innovation

Methods, ideas, or system contributions that make the work stand out.

Novel taxonomy for LLM-specific API misuse
LLM-based automatic repair tool Dr.Fix
Gains of up to 40 percentage points in exact match and 38.4 BLEU points
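The exact match metric reported above can be sketched as follows; this is a generic, whitespace-normalized implementation I am assuming for illustration, not the paper's exact scoring code:

```python
def exact_match_rate(predictions, references):
    """Fraction of repaired snippets identical to the reference fix,
    after collapsing whitespace so formatting differences don't count."""
    norm = lambda s: " ".join(s.split())
    hits = sum(norm(p) == norm(r) for p, r in zip(predictions, references))
    return hits / len(references)

# Hypothetical repaired snippets vs. reference fixes.
preds = ["json.loads(s)", "open(path, 'r')"]
refs  = ["json.loads(s)", "open(path)"]
rate = exact_match_rate(preds, refs)  # 0.5
```

BLEU, the other reported metric, instead credits partial n-gram overlap, which is why the paper reports both: exact match is strict, BLEU rewards near-misses.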