Agent2Agent Threats in Safety-Critical LLM Assistants: A Human-Centric Taxonomy

📅 2026-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses security threats in LLM-powered intelligent cockpit assistants arising from natural language payloads in Agent-to-Agent (A2A) communication. Existing security frameworks conflate assets with attack paths and lack a clear separation of concerns for safety-critical systems. To overcome these limitations, the authors propose AgentHeLLM, a novel threat modeling framework that introduces a human-centric asset taxonomy grounded in the Universal Declaration of Human Rights. The framework formally distinguishes data poisoning paths from trigger paths using a graph-based model and employs a two-layer search strategy combined with natural language payload analysis to automatically uncover multi-stage composite threats. The accompanying open-source tool, AgentHeLLM Attack Path Generator, significantly enhances the security assessment capabilities for in-vehicle LLM assistants.

📝 Abstract
The integration of Large Language Model (LLM)-based conversational agents into vehicles creates novel security challenges at the intersection of agentic AI, automotive safety, and inter-agent communication. As these intelligent assistants coordinate with external services via protocols such as Google's Agent-to-Agent (A2A), they establish attack surfaces where manipulations can propagate through natural language payloads, potentially causing severe consequences ranging from driver distraction to unauthorized vehicle control. Existing AI security frameworks, while foundational, lack the rigorous "separation of concerns" standard in safety-critical systems engineering by co-mingling the concepts of what is being protected (assets) with how it is attacked (attack paths). This paper addresses this methodological gap by proposing a threat modeling framework called AgentHeLLM (Agent Hazard Exploration for LLM Assistants) that formally separates asset identification from attack path analysis. We introduce a human-centric asset taxonomy derived from harm-oriented "victim modeling" and inspired by the Universal Declaration of Human Rights, and a formal graph-based model that distinguishes poison paths (malicious data propagation) from trigger paths (activation actions). We demonstrate the framework's practical applicability through an open-source attack path suggestion tool, the AgentHeLLM Attack Path Generator, which automates multi-stage threat discovery using a bi-level search strategy.
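The abstract's core idea, separating poison paths (how malicious data spreads) from trigger paths (what activates it) and pairing them via a bi-level search, can be illustrated with a small sketch. The paper's actual formalism and tool API are not given here, so the node names, edge labels, and function names below are illustrative assumptions, not the authors' implementation:

```python
from collections import defaultdict

# Illustrative system graph. Each edge is (src, dst, kind), where kind
# is "poison" (malicious data propagation) or "trigger" (an activation
# action). All node names are hypothetical examples.
EDGES = [
    ("attacker", "external_agent", "poison"),
    ("external_agent", "a2a_payload", "poison"),
    ("a2a_payload", "cockpit_llm", "poison"),
    ("driver_query", "cockpit_llm", "trigger"),
    ("cockpit_llm", "vehicle_control", "trigger"),
]

def build_graph(edges):
    # Adjacency lists keyed by (node, edge kind).
    g = defaultdict(list)
    for src, dst, kind in edges:
        g[(src, kind)].append(dst)
    return g

def paths(g, start, kind, goal_pred):
    """Enumerate simple paths from `start` along edges of one kind,
    keeping those whose endpoint satisfies `goal_pred`."""
    out, stack = [], [(start, [start])]
    while stack:
        node, path = stack.pop()
        if goal_pred(node) and len(path) > 1:
            out.append(path)
        for nxt in g[(node, kind)]:
            if nxt not in path:  # simple paths only, no cycles
                stack.append((nxt, path + [nxt]))
    return out

def composite_threats(edges, attacker, asset):
    """Bi-level search: the outer level enumerates poison paths from
    the attacker; the inner level searches for a trigger path from
    each poisoned node to the asset. A multi-stage composite threat
    is one poison path paired with one trigger path."""
    g = build_graph(edges)
    threats = []
    for ppath in paths(g, attacker, "poison", lambda n: True):
        poisoned = ppath[-1]
        for tpath in paths(g, poisoned, "trigger", lambda n: n == asset):
            threats.append((ppath, tpath))
    return threats
```

On this toy graph, `composite_threats(EDGES, "attacker", "vehicle_control")` finds one composite threat: a poison path reaching the cockpit LLM through an A2A payload, paired with the trigger path from the LLM to vehicle control. The separation mirrors the paper's stated design goal: assets and attack paths are modeled independently, and composite threats emerge only from the search.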
Problem

Research questions and friction points this paper is trying to address.

Agent2Agent threats
safety-critical systems
LLM assistants
threat modeling
human-centric security
Innovation

Methods, ideas, or system contributions that make the work stand out.

AgentHeLLM
human-centric taxonomy
attack path modeling
LLM safety
A2A threats
Lukas Stappen
BMW Group Research, Munich, Germany
Ahmet Erkan Turan
BMW Group Research, Munich, Germany
Johann Hagerer
Technical University Munich, Munich, Germany
Georg Groh
Adjunct Professor
Social Computing, Natural Language Processing