🤖 AI Summary
This work identifies a software supply chain threat in which LLM-driven coding agents are induced to introduce malicious dependencies through tampered persistent Skill files. The paper proposes “Dependency Steering,” an attack paradigm that applies semantic edits at the Skill level to steer agents toward an attacker-controlled package during routine coding tasks, without altering model weights, training data, or user prompts. A Skill-level optimization method searches for localized edits that preserve the Skill's apparent purpose while increasing the likelihood that the target package is generated. The attack achieves high targeted hallucination rates across multiple coding-oriented LLMs and programming benchmarks, transfers across models and task domains, and remains difficult for the evaluated Skill scanners and LLM-based auditors to detect.
📝 Abstract
LLM-powered coding agents increasingly make software supply chain decisions. They generate imports, recommend packages, and write installation commands. Prior work showed that these systems can hallucinate non-existent package names, which attackers may register as malicious packages. In this paper, we show that this risk is not only a passive model failure. It can be actively induced through the persistent Skill artifact. We introduce Dependency Steering, an attack paradigm in which a malicious Skill biases a coding agent toward an attacker-controlled package during benign coding tasks. The attack does not require modifying model weights, training data, or user prompts. To construct realistic attacks, we design a Skill-level optimization method that searches for localized semantic edits that preserve the apparent purpose of the original Skill while increasing targeted package generation. Across multiple coding-oriented LLMs and programming benchmarks, Dependency Steering achieves high targeted hallucination rates, transfers across models and task domains, and remains difficult for evaluated Skill scanners and LLM-based auditors to detect. Our results show that persistent agent instructions form an underexplored software supply chain attack surface.
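The abstract does not define how a targeted hallucination rate is computed. As a minimal sketch, assume a generation counts as a hit when its Python code imports the target package; the rate is then the fraction of generations that are hits. The function names, the sample generations, and the package name `fastjsonx` below are hypothetical and do not come from the paper.

```python
import ast


def imported_packages(source: str) -> set[str]:
    """Return the top-level package names imported by a Python snippet."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Assumption: generations that do not parse contribute no imports.
        return set()
    names: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) refer to local modules, so skip them.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def targeted_rate(generations: list[str], target: str) -> float:
    """Fraction of generations that import the target package (assumed metric)."""
    if not generations:
        return 0.0
    hits = sum(target in imported_packages(g) for g in generations)
    return hits / len(generations)


def unexpected_imports(source: str, allowlist: set[str]) -> set[str]:
    """Imports that are absent from a reviewed allowlist of packages."""
    return imported_packages(source) - allowlist


# Hypothetical agent outputs; "fastjsonx" stands in for an attacker-chosen name.
samples = [
    "import fastjsonx\nprint(1)",
    "from os import path",
    "import numpy as np\nimport fastjsonx.core",
    "def broken(:",
]
rate = targeted_rate(samples, "fastjsonx")
flagged = unexpected_imports("import os\nimport fastjsonx", {"os"})
```

The same import extraction can serve as a simple defensive check: comparing an agent's imports against a reviewed allowlist flags steered dependencies regardless of how the Skill was edited. The sketch only sees import names, which can differ from the distribution name passed to an installer, and it ignores installation commands that the agent may also write.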