InTraVisTo: Inside Transformer Visualisation Tool

📅 2025-07-18

🤖 AI Summary

Problem: Large language models (LLMs) remain hard to interpret in production deployment, in particular because their internal reasoning cannot be traced at the level of token-by-token generation. Method: We propose a fine-grained computational tracing method tailored to Transformer architectures. It decodes the embeddings at each layer to map hidden states into semantically interpretable representations, traces attention paths across layers to identify critical information flows, and uses Sankey diagrams to visualize cross-layer, cross-head, and cross-token information propagation, without modifying the model architecture or relying on gradients. The approach supports dynamic attribution analysis for arbitrary feedforward or autoregressive generation steps. Contribution/Results: Experiments demonstrate that our framework substantially enhances understanding of internal decision pathways in LLMs. It provides a reproducible, interactive interpretability tool for behavioral diagnostics, bias root-cause analysis, and controllable generation, advancing transparency and trustworthiness in LLM deployment.

📝 Abstract

The reasoning capabilities of Large Language Models (LLMs) have increased greatly over the last few years, as have their size and complexity. Nonetheless, the use of LLMs in production remains challenging due to their unpredictable nature and discrepancies that can exist between their desired behavior and their actual model output. In this paper, we introduce a new tool, InTraVisTo (Inside Transformer Visualisation Tool), designed to enable researchers to investigate and trace the computational process that generates each token in a Transformer-based LLM. InTraVisTo provides a visualization of both the internal state of the Transformer model (by decoding token embeddings at each layer of the model) and the information flow between the various components across the different layers of the model (using a Sankey diagram). With InTraVisTo, we aim to help researchers and practitioners better understand the computations being performed within the Transformer model and thus to shed some light on internal patterns and reasoning processes employed by LLMs.
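As a rough illustration of what "decoding token embeddings at each layer" can look like, the sketch below projects the hidden state of one token position onto an output-embedding matrix and reports the highest-scoring vocabulary tokens per layer. This projection is an assumption made for illustration, not necessarily the decoding InTraVisTo uses; the vocabulary, the `UNEMBED` matrix, the `decode_hidden_state` helper, and all numbers are hypothetical toy values.

```python
# Toy sketch: decode per-layer hidden states into tokens by projecting them
# onto an output-embedding (unembedding) matrix. Everything here is made up
# for illustration and is not taken from the InTraVisTo implementation.

VOCAB = ["cat", "dog", "paris", "france"]

# Hypothetical unembedding matrix: one row of weights per vocabulary token.
UNEMBED = [
    [1.0, 0.0, 0.0],  # cat
    [0.9, 0.1, 0.0],  # dog
    [0.0, 1.0, 0.2],  # paris
    [0.0, 0.3, 1.0],  # france
]


def decode_hidden_state(hidden, top_k=2):
    """Return the top_k tokens whose unembedding rows score highest."""
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in UNEMBED]
    ranked = sorted(range(len(VOCAB)), key=lambda i: scores[i], reverse=True)
    return [VOCAB[i] for i in ranked[:top_k]]


# Hypothetical hidden states of a single token position at three layers.
layer_states = [
    [0.8, 0.1, 0.0],
    [0.2, 0.3, 0.9],
    [0.0, 1.0, 0.1],
]

decoded_per_layer = [decode_hidden_state(h) for h in layer_states]
for layer, tokens in enumerate(decoded_per_layer):
    print(f"layer {layer}: {tokens}")
```

Reading the decoded tokens layer by layer shows how the model's intermediate "guess" for a position shifts as it moves through the network, which is the kind of internal-state view the abstract describes.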

Problem

Research questions and friction points this paper is trying to address.

Visualize internal state of Transformer-based LLMs

Trace the computational process that generates each token

Understand information flow across model layers

Innovation

Methods, ideas, or system contributions that make the work stand out.

Visualizes internal Transformer model states

Traces the computation that generates each token

Uses Sankey diagrams for information flow
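The Sankey-style information flow listed above can be pictured as a set of weighted links between nodes in consecutive layers. The sketch below is a deliberately simplified, hypothetical construction: it treats each (layer, token) pair as a node and splits each node's incoming flow between a residual path and attention-weighted paths. The `build_sankey_links` helper, the `residual_share` and `min_value` parameters, and the attention values are assumptions for illustration; the source does not specify how the tool weights its flows or which components it draws as nodes.

```python
# Toy sketch: turn per-layer attention weights into Sankey-style links.
# Node = (layer index, token index). Link = (source node, target node, value).
# The flow-splitting rule is an assumption, not InTraVisTo's actual method.


def build_sankey_links(attention, residual_share=0.5, min_value=0.05):
    """Build weighted links between token nodes of consecutive layers.

    attention[layer][tgt][src] is the attention weight that target token
    `tgt` places on source token `src` in that layer (rows sum to 1).
    """
    links = []
    for layer, weights in enumerate(attention):
        for tgt, row in enumerate(weights):
            for src, attn in enumerate(row):
                # Attention path carries the non-residual share of the flow.
                value = (1.0 - residual_share) * attn
                # Residual path connects a token to itself in the next layer.
                if src == tgt:
                    value += residual_share
                # Drop negligible links to keep the diagram readable.
                if value >= min_value:
                    links.append(((layer, src), (layer + 1, tgt), value))
    return links


# Hypothetical causal attention for two tokens over two layers.
attention = [
    [[1.0, 0.0], [0.75, 0.25]],
    [[1.0, 0.0], [0.5, 0.5]],
]

links = build_sankey_links(attention)
for source, target, value in links:
    print(f"{source} -> {target}: {value:.3f}")
```

A list of `(source, target, value)` triples like this is the usual input shape for Sankey plotting libraries, so the same structure could be extended with separate nodes for attention and feedforward components.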
