InconLens: Interactive Visual Diagnosis of Behavioral Inconsistencies in LLM-based Agentic Systems

📅 2026-03-30
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of behavioral inconsistency in large language model (LLM) agents, which often exhibit divergent behaviors under identical inputs, hindering reliable deployment. To enable cross-run diagnosis of such inconsistencies, we propose InconLens—a novel visual analytics system that introduces an "information node" abstraction to semantically align and compare multi-turn execution trajectories across runs at a fine granularity. By integrating structured logging, semantic alignment, and interactive exploration techniques, InconLens offers the first interactive diagnostic framework for investigating behavioral discrepancies in LLM agents. Case studies and expert interviews demonstrate that InconLens effectively pinpoints divergence points, uncovers underlying failure patterns, and helps developers improve the reliability and stability of LLM-based agent systems.
📝 Abstract
Large Language Model (LLM)-based agentic systems have shown growing promise in tackling complex, multi-step tasks through autonomous planning, reasoning, and interaction with external environments. However, the stochastic nature of LLM generation introduces intrinsic behavioral inconsistency: the same agent may succeed in one execution but fail in another under identical inputs. Diagnosing such inconsistencies remains a major challenge for developers, as agent execution logs are often lengthy, unstructured, and difficult to compare across runs. Existing debugging and evaluation tools primarily focus on inspecting single executions, offering limited support for understanding how and why agent behaviors diverge across repeated runs. To address this challenge, we introduce InconLens, a visual analytics system designed to support interactive diagnosis of LLM-based agentic systems with a particular focus on cross-run behavioral analysis. InconLens introduces information nodes as an intermediate abstraction that captures canonical informational milestones shared across executions, enabling semantic alignment and inspection of agent reasoning trajectories across multiple runs. We demonstrate the effectiveness of InconLens through a detailed case study and further validate its usability and analytical value via expert interviews. Our results show that InconLens enables developers to more efficiently identify divergence points, uncover latent failure modes, and gain actionable insights into improving the reliability and stability of agentic systems.
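The abstract's core idea—aligning repeated agent runs on shared "information nodes" (canonical informational milestones) and then locating where they diverge—can be illustrated with a minimal sketch. This is not the paper's implementation; the `Step` type, node labels, and `first_divergence` helper are hypothetical, standing in for whatever structured-logging schema InconLens actually uses.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Step:
    node: str      # canonical "information node" label (hypothetical schema)
    content: str   # the run-specific content observed at that milestone

def first_divergence(run_a: list[Step], run_b: list[Step]) -> Optional[str]:
    """Align two runs on shared information nodes and return the first
    node (in run_a's order) whose contents differ, or None if consistent."""
    nodes_b = {s.node: s.content for s in run_b}
    for s in run_a:
        if s.node in nodes_b and nodes_b[s.node] != s.content:
            return s.node
    return None

# Two executions of the same task that diverge at the tool-call milestone:
run1 = [Step("goal", "book flight"), Step("tool_call", "search(NYC->SFO)")]
run2 = [Step("goal", "book flight"), Step("tool_call", "search(NYC->LAX)")]
print(first_divergence(run1, run2))  # → tool_call
```

The point of the abstraction is that alignment happens on semantic milestones rather than raw log lines, so runs of different lengths or phrasings can still be compared position-by-position.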
Problem

Research questions and friction points this paper is trying to address.

behavioral inconsistency
LLM-based agentic systems
cross-run diagnosis
execution logs
debugging
Innovation

Methods, ideas, or system contributions that make the work stand out.

visual analytics
behavioral inconsistency
information nodes
LLM-based agentic systems
cross-run analysis
Shuo Yan — University of Texas at Dallas
Xiaolin Wen — Nanyang Technological University
Shaolun Ruan — Singapore Management University
Yanjie Zhang — Hong Kong University of Science and Technology
Jiaming Mi — East China Normal University
Yushi Sun — Hong Kong University of Science and Technology
Huamin Qu — Chair Professor, Hong Kong University of Science and Technology (Data Visualization, Human-Computer Interaction, Explainable AI, E-Learning)
Rui Sheng — Hong Kong University of Science and Technology