Counterfactual-based Agent Influence Ranker for Agentic AI Workflows

📅 2025-10-29
🤖 AI Summary
Existing methods perform only static structural analysis of Agentic AI Workflows (AAWs) and cannot quantify how much each agent influences the final output at inference time. To address this, we propose CAIR, a task-agnostic, interpretable framework for ranking agents by influence, which introduces counterfactual analysis to AAW attribution for the first time. CAIR assesses influence through counterfactual interventions and multi-agent collaborative trajectory modeling, and it supports both offline and inference-time analysis. Evaluated on a self-constructed AAW dataset spanning 30 use cases and 230 functionalities, CAIR produces consistent influence rankings, outperforms baseline methods, and improves the effectiveness and relevance of downstream tasks.

📝 Abstract
An Agentic AI Workflow (AAW), also known as an LLM-based multi-agent system, is an autonomous system that assembles several LLM-based agents to work collaboratively towards a shared goal. The high autonomy, widespread adoption, and growing interest in such AAWs highlight the need for a deeper understanding of their operations, from both quality and security aspects. To date, no method exists for assessing the influence of each agent on the AAW's final output. Adopting techniques from related fields is not feasible, since existing methods perform only static structural analysis, which is unsuitable for inference-time execution. We present Counterfactual-based Agent Influence Ranker (CAIR), the first method for assessing the influence level of each agent on the AAW's output and determining which agents are the most influential. By performing counterfactual analysis, CAIR provides a task-agnostic analysis that can be used both offline and at inference time. We evaluate CAIR using a dataset of AAWs that we created, containing 30 different use cases with 230 different functionalities. Our evaluation showed that CAIR produces consistent rankings, outperforms baseline methods, and can easily enhance the effectiveness and relevance of downstream tasks.
Problem

Research questions and friction points this paper is trying to address.

Assessing individual agent influence in AI workflows
Overcoming limitations of static structural analysis methods
Providing task-agnostic influence ranking for collaborative agents
Innovation

Methods, ideas, or system contributions that make the work stand out.

Counterfactual analysis for agent influence assessment
Task-agnostic evaluation usable at inference time
Identification of the most influential agents in AI workflows
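
The summary above does not describe CAIR's actual procedure, so the following is only a minimal sketch of the general idea behind counterfactual influence ranking, not the authors' method. It assumes a toy sequential workflow in which each agent maps text to text, perturbs one agent's output at a time, and scores that agent by how much the final output changes. All names here (run_workflow, output_distance, influence_scores, rank_agents, mask_last_word, toy_agents) are hypothetical, and the word-set Jaccard distance is a stand-in for whatever output comparison a real system would use.

```python
from typing import Callable, Dict, List, Optional

Agent = Callable[[str], str]  # hypothetical agent: text in, text out


def run_workflow(
    agents: Dict[str, Agent],
    query: str,
    intervene_on: Optional[str] = None,
    perturb: Optional[Callable[[str], str]] = None,
) -> str:
    """Run agents in insertion order; optionally perturb one agent's output."""
    state = query
    for name, agent in agents.items():
        state = agent(state)
        if name == intervene_on and perturb is not None:
            state = perturb(state)  # the counterfactual intervention
    return state


def output_distance(a: str, b: str) -> float:
    """Jaccard distance over word sets (assumed stand-in for a semantic distance)."""
    words_a, words_b = set(a.split()), set(b.split())
    union = words_a | words_b
    if not union:
        return 0.0
    return 1.0 - len(words_a & words_b) / len(union)


def influence_scores(
    agents: Dict[str, Agent], query: str, perturb: Callable[[str], str]
) -> Dict[str, float]:
    """Score each agent by the change its perturbed output causes in the final output."""
    baseline = run_workflow(agents, query)
    return {
        name: output_distance(baseline, run_workflow(agents, query, name, perturb))
        for name in agents
    }


def rank_agents(scores: Dict[str, float]) -> List[str]:
    """Return agent names ordered from most to least influential."""
    return sorted(scores, key=scores.get, reverse=True)


def mask_last_word(text: str) -> str:
    """Toy perturbation: replace the last word of an agent's output."""
    words = text.split()
    return " ".join(words[:-1] + ["MASKED"])


# Toy three-agent pipeline (illustrative only).
toy_agents: Dict[str, Agent] = {
    "tagger": lambda s: "note: " + s,
    "answerer": lambda s: s + " answer is 42",
    "finalizer": lambda s: "final: " + " ".join(s.split()[-3:]),
}

scores = influence_scores(toy_agents, "what is x", mask_last_word)
print(rank_agents(scores))
```

In this toy pipeline the tagger receives a score of zero and ranks last, because the finalizer keeps only the last three words and so discards the part of the text the tagger's perturbation touched. A static view of the pipeline would treat all three agents as equally connected; the ranking differs only because the workflow is actually executed under intervention.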