Enhancing Interpretability in Generative AI Through Search-Based Data Influence Analysis

📅 2025-04-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Generative AI models suffer from limited output interpretability due to their black-box nature, posing trust and compliance risks—particularly in art and copyright-sensitive domains. To address this, we propose a search-driven data influence attribution method that reverse-traces the dependency of generated outputs on training data—including both raw samples and latent-space embeddings—enabling output-oriented interpretability analysis. Unlike conventional gradient- or perturbation-based approaches, our method anchors attribution at the generation outcome and unifies influence assessment across original data and latent representations. It couples efficient search optimization with local retraining for rigorous validation, enabling precise identification of critical training subsets. Experiments demonstrate strong cross-model generalization, significantly improving the feasibility and reliability of expert-guided interpretability evaluation.
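The summary describes a two-stage pipeline: a search over training items (in raw or latent space) for candidates that influenced a given output, followed by local retraining to validate the selected subset. A minimal, self-contained sketch of that idea is shown below. The cosine-similarity search criterion, the function names, and the retraining interface are illustrative assumptions for this sketch, not the authors' actual implementation.

```python
import numpy as np

def influence_search(train_embs, output_emb, k=3):
    """Rank training items by latent-space similarity to a generated
    output and return the indices of the top-k candidate influencers.
    Cosine similarity stands in for the paper's search objective
    (an assumption, not the authors' exact criterion)."""
    train = np.asarray(train_embs, dtype=float)
    out = np.asarray(output_emb, dtype=float)
    sims = train @ out / (
        np.linalg.norm(train, axis=1) * np.linalg.norm(out) + 1e-12
    )
    return np.argsort(sims)[::-1][:k]  # indices, most similar first

def validate_by_retraining(train_data, subset_idx, retrain_fn, score_fn):
    """Leave-subset-out validation: retrain locally without the candidate
    subset and measure how much the output score drops. A large drop
    supports the subset's claimed influence."""
    mask = np.ones(len(train_data), dtype=bool)
    mask[list(subset_idx)] = False
    full_model = retrain_fn(train_data)
    ablated_model = retrain_fn(
        [d for d, keep in zip(train_data, mask) if keep]
    )
    return score_fn(full_model) - score_fn(ablated_model)
```

For example, with toy 2-D embeddings where one training item matches the output exactly, `influence_search` ranks that item first; a sharp score drop from `validate_by_retraining` then confirms the subset's influence.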

📝 Abstract
Generative AI models offer powerful capabilities but often lack transparency, making it difficult to interpret their output. This is critical in cases involving artistic or copyrighted content. This work introduces a search-inspired approach to improve the interpretability of these models by analysing the influence of training data on their outputs. Our method provides observational interpretability by focusing on a model's output rather than on its internal state. We consider both raw data and latent-space embeddings when searching for the influence of data items in generated content. We evaluate our method by retraining models locally and by demonstrating the method's ability to uncover influential subsets in the training data. This work lays the groundwork for future extensions, including user-based evaluations with domain experts, which are expected to improve observational interpretability further.
Problem

Research questions and friction points this paper is trying to address.

Improving interpretability of Generative AI models
Analyzing training data influence on outputs
Enhancing transparency for artistic and copyrighted content
Innovation

Methods, ideas, or system contributions that make the work stand out.

Search-based data influence analysis
Observational interpretability via output focus
Combines raw data and latent-space embeddings
Theodoros Aivalis
University of Glasgow, UK; National Centre for Scientific Research “Demokritos”, Greece
Iraklis A. Klampanos
National Centre for Scientific Research "Demokritos"
data science · artificial intelligence · deep learning · big data · data-intensive computing
Antonis Troumpoukis
NCSR "Demokritos"
Joemon M. Jose
University of Glasgow, UK