SkillProbe: Security Auditing for Emerging Agent Skill Marketplaces via Multi-Agent Collaboration

📅 2026-03-21

🤖 AI Summary

This work addresses security concerns in centralized agent skill marketplaces, where semantic-behavioral misalignment and combinatorial interactions between skills can lead to cascading risks, and where a high download count does not imply that a skill is secure. The paper proposes SkillProbe, a multi-agent collaborative auditing framework built on the principle of “auditing skills with skills”: standardized skill modules orchestrate specialized agents through staged evaluations, namely admission filtering, semantic-behavioral alignment verification, and combinatorial risk simulation. The system combines large language models, behavioral simulation, graph-based relational analysis, and modular encapsulation. An evaluation using 8 mainstream LLM series on 2,500 real-world skills from ClawHub shows that over 90% of high-popularity skills fail the audit, so download volume is not a reliable proxy for security. High-risk skills also form a single giant connected component within the risk-link dimension, indicating that cascading risk is systemic rather than isolated. SkillProbe is publicly accessible at skillhub.holosai.io.
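The staged evaluation described above can be pictured as a pipeline that stops at the first failing stage. The following is a minimal sketch, not SkillProbe's implementation: the `Skill` fields, the stage functions, and the substring-based alignment rule are all hypothetical stand-ins (the paper drives these stages with LLM agents), and the combinatorial stage is omitted because it operates on sets of skills rather than on one skill.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Skill:
    """Hypothetical skill record used only for this illustration."""
    name: str
    description: str             # what the skill claims to do
    observed_actions: List[str]  # what it did in an assumed sandbox run


# Each stage returns (passed, reason).
Stage = Callable[[Skill], Tuple[bool, str]]


def admission_filter(skill: Skill) -> Tuple[bool, str]:
    # Toy admission rule: reject skills that declare nothing about themselves.
    if not skill.description.strip():
        return False, "missing description"
    return True, "ok"


def alignment_check(skill: Skill) -> Tuple[bool, str]:
    # Toy alignment rule: every observed action must be named in the
    # description. A real audit would compare semantics, not substrings.
    undeclared = [a for a in skill.observed_actions if a not in skill.description]
    if undeclared:
        return False, "undeclared actions: " + ", ".join(undeclared)
    return True, "ok"


STAGES: List[Tuple[str, Stage]] = [
    ("admission", admission_filter),
    ("alignment", alignment_check),
]


def audit(skill: Skill) -> str:
    # Run the stages in order and stop at the first failure.
    for stage_name, stage in STAGES:
        passed, reason = stage(skill)
        if not passed:
            return f"rejected at {stage_name}: {reason}"
    return "passed"


if __name__ == "__main__":
    print(audit(Skill("notes", "Uses read_file to load notes", ["read_file"])))
    print(audit(Skill("weather", "Fetches the weather forecast", ["read_file", "http_post"])))
```

In this sketch the second skill is rejected because its observed behavior includes actions that its description never mentions, which is the kind of semantic-behavioral gap the audit is meant to catch.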

📝 Abstract

With the rapid evolution of Large Language Model (LLM) agent ecosystems, centralized skill marketplaces have emerged as pivotal infrastructure for augmenting agent capabilities. However, these marketplaces face unprecedented security challenges, primarily stemming from semantic-behavioral inconsistency and inter-skill combinatorial risks, where individually benign skills induce malicious behaviors during collaborative invocation. To address these vulnerabilities, we propose SkillProbe, a multi-stage security auditing framework driven by multi-agent collaboration. SkillProbe introduces a "Skills-for-Skills" design paradigm, encapsulating auditing processes into standardized skill modules to drive specialized agents through a rigorous pipeline, including admission filtering, semantic-behavioral alignment detection, and combinatorial risk simulation. We conducted a large-scale evaluation using 8 mainstream LLM series across 2,500 real-world skills from ClawHub. Our results reveal a striking popularity-security paradox, where download volume is not a reliable proxy for security quality, as over 90% of high-popularity skills failed to pass rigorous auditing. Crucially, we discovered that high-risk skills form a single giant connected component within the risk-link dimension, demonstrating that cascaded risks are systemic rather than isolated occurrences. We hope that SkillProbe will inspire researchers to provide a scalable governance infrastructure for constructing a trustworthy Agentic Web. SkillProbe is publicly accessible at skillhub.holosai.io.
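To make the "single giant connected component" finding concrete, the sketch below computes connected components of an undirected risk-link graph and reports the share of skills that fall in the largest one. It is an illustration under assumptions: the node names, the edge list, and the helper names `connected_components` and `largest_component_share` are hypothetical, and the abstract does not state how risk links are actually derived.

```python
from collections import defaultdict, deque
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def connected_components(
    nodes: Iterable[Hashable], edges: Iterable[Tuple[Hashable, Hashable]]
) -> List[Set[Hashable]]:
    """Return the connected components of an undirected graph via BFS."""
    adjacency: Dict[Hashable, Set[Hashable]] = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen: Set[Hashable] = set()
    components: List[Set[Hashable]] = []
    for start in nodes:
        if start in seen:
            continue
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbor in adjacency[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


def largest_component_share(nodes, edges) -> float:
    """Fraction of nodes that belong to the largest connected component."""
    nodes = list(nodes)
    if not nodes:
        return 0.0
    components = connected_components(nodes, edges)
    return max(len(c) for c in components) / len(nodes)


if __name__ == "__main__":
    # Hypothetical high-risk skills; an edge means two skills share a risk link.
    skills = ["a", "b", "c", "d", "e"]
    risk_links = [("a", "b"), ("b", "c"), ("c", "d")]
    print(largest_component_share(skills, risk_links))
```

A share close to 1.0 would mean that almost every high-risk skill is reachable from almost every other one through risk links, which is what makes the risk systemic rather than isolated.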

Problem

Research questions and friction points this paper is trying to address.

security auditing

agent skill marketplaces

semantic-behavioral inconsistency

combinatorial risk

multi-agent collaboration

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent collaboration

skill marketplace security

semantic-behavioral inconsistency