View-oriented Conversation Compiler for Agent Trace Analysis

📅 2026-03-31

📈 Citations: 0

✨ Influential: 0

career value

180K/year

🤖 AI Summary

This work addresses the limitations of unstructured agent interaction logs (e.g., in JSONL format), which significantly hinder in-context learning and trajectory analysis. To overcome this, the authors introduce the View-oriented Conversation Compiler (VCC)—the first application of compiler technology to agent trajectory analysis—which transforms raw logs into three structured views: complete, user-interface, and adaptive. VCC leverages a classical compilation pipeline comprising lexical analysis, syntactic parsing, intermediate representation, and code generation. Experiments on AppWorld demonstrate that simply replacing the raw input to the reflexion module with VCC-generated views consistently improves task success rates across three model configurations, while reducing reflexion token consumption by 50%–67% and yielding more concise and effective memory representations. This study establishes message formatting as a critical infrastructure for in-context learning, rather than a mere engineering detail.

Technology Category

Application Category

📝 Abstract

Agent traces carry increasing analytical value in the era of context learning and harness-driven agentic cognition, yet most prior work treats conversation format as a trivial engineering detail. Modern agent conversations contain deeply structured content, including nested tool calls and results, chain-of-thought reasoning blocks, sub-agent invocations, context-window compaction boundaries, and harness-injected system directives, whose complexity far exceeds that of simple user-assistant exchanges. Feeding such traces to a reflector or other analytical mechanism in plain text, JSON, YAML, or via grep can materially degrade analysis quality. This paper presents VCC (View-oriented Conversation Compiler), a compiler (lex, parse, IR, lower, emit) that transforms raw agent JSONL logs into a family of structured views: a full view (lossless transcript serving as the canonical line-number coordinate system), a user-interface view (reconstructing the interaction as the user actually perceived it), and an adaptive view (a structure-preserving projection governed by a relevance predicate). In a context-learning experiment on AppWorld, replacing only the reflector's input format, from raw JSONL to VCC-compiled views, leads to higher pass rates across all three model configurations tested, while cutting reflector token consumption by half to two-thirds and producing more concise learned memory. These results suggest that message format functions as infrastructure for context learning, not as an incidental implementation choice.

Problem

Research questions and friction points this paper is trying to address.

agent trace analysis

conversation format

structured content

context learning

message representation

Innovation

Methods, ideas, or system contributions that make the work stand out.

View-oriented Compilation

Agent Trace Analysis

Structured Conversation Views