PackMonitor: Enabling Zero Package Hallucinations Through Decoding-Time Monitoring

📅 2026-02-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the critical issue of “package hallucination” in large language models (LLMs) used for dependency recommendation, where models erroneously suggest non-existent software packages, posing significant security risks. The authors propose PackMonitor, a training-free, plug-and-play decoding-time intervention mechanism that strictly constrains generated installation commands to packages listed in authoritative registries. By integrating a context-aware parser, a package-name intervenor, and a DFA-based cache, the method ensures that every recommended package actually exists. This approach achieves, for the first time, complete elimination of package hallucinations, reducing hallucination rates to zero across five mainstream LLMs, while maintaining low inference latency and preserving the models’ original capabilities, thereby offering a robust solution that balances security, accuracy, and efficiency.

📝 Abstract
As Large Language Models (LLMs) are increasingly integrated into software development workflows, their trustworthiness has become a critical concern. In dependency recommendation scenarios, however, the reliability of LLMs is undermined by widespread package hallucinations, in which models recommend packages that do not exist. Recent studies have proposed a range of approaches to mitigate this issue. Nevertheless, existing approaches typically merely reduce hallucination rates rather than eliminate them, leaving persistent software security risks. In this work, we argue that package hallucinations are theoretically preventable, based on the key insight that package validity is decidable through finite and enumerable authoritative package lists. Building on this, we propose PackMonitor, the first approach capable of fundamentally eliminating package hallucinations by continuously monitoring the model's decoding process and intervening when necessary. To implement this in practice, PackMonitor addresses three key challenges: (1) determining when to trigger intervention via a Context-Aware Parser that continuously monitors model outputs and selectively activates intervention only during installation command generation; (2) resolving how to intervene by employing a Package-Name Intervenor that strictly limits the decoding space to an authoritative package list; and (3) ensuring monitoring efficiency through a DFA-Caching Mechanism that enables scalability to millions of packages with negligible overhead. Extensive experiments on five widely used LLMs demonstrate that PackMonitor is a training-free, plug-and-play solution that consistently reduces package hallucination rates to zero while maintaining low-latency inference and preserving original model capabilities.
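The paper's implementation is not reproduced here, but the core idea of constraining decoding to an authoritative package list can be sketched as follows. This is a minimal illustration (all names are hypothetical): a character trie over the package list acts as a DFA whose states are trie nodes, whose transitions are characters, and whose accepting states mark complete package names. An intervenor could use `allowed_next` to mask the model's logits once the parser detects an installation command such as `pip install <prefix>`.

```python
class TrieNode:
    """One DFA state: outgoing character transitions plus an accept flag."""
    def __init__(self):
        self.children = {}
        self.is_end = False


class PackageDFA:
    """DFA over an authoritative package list, built as a character trie."""

    def __init__(self, package_names):
        self.root = TrieNode()
        for name in package_names:
            node = self.root
            for ch in name:
                node = node.children.setdefault(ch, TrieNode())
            node.is_end = True  # accepting state: a complete valid name

    def allowed_next(self, prefix):
        """Characters that can legally extend `prefix` toward a valid name."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return set()  # dead state: no valid name has this prefix
        return set(node.children)

    def is_valid(self, name):
        """True iff `name` is exactly a member of the package list."""
        node = self.root
        for ch in name:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_end


# Toy registry for illustration only.
registry = PackageDFA(["requests", "requests-oauthlib", "rich"])
print(sorted(registry.allowed_next("requests")))  # → ['-']
print(registry.is_valid("request"))               # → False (prefix only)
```

In a real system the transitions would be over the model's subword tokens rather than characters, and, as the abstract notes, caching the DFA traversal is what keeps this tractable for registries with millions of packages.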
Problem

Research questions and friction points this paper is trying to address.

package hallucination
Large Language Models
dependency recommendation
software security
trustworthiness
Innovation

Methods, ideas, or system contributions that make the work stand out.

package hallucination
decoding-time monitoring
authoritative package list
context-aware parsing
DFA caching