Permissive Information-Flow Analysis for Large Language Models

📅 2024-10-04
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
To address the low precision and poor practicality of information-flow analysis in LLM-based systems, caused by overly conservative propagation of input data labels, this paper proposes an influence-driven, permissive information-flow labeling mechanism: it propagates only the labels of inputs that actually influenced the model output, pruning redundant propagation paths. The work is the first to introduce influence analysis into LLM information-flow control, departing from the traditional fully conservative label-propagation paradigm. The authors design two permissive propagators, one based on prompt-based retrieval augmentation and one on a k-nearest-neighbors language model, and compare them against a baseline that uses introspection to predict the output label. Experiments in an LLM agent setting show that the permissive propagators improve over the baseline in more than 85% of cases, achieving a favorable trade-off among security guarantees, labeling precision, and deployment practicality.

📝 Abstract
Large Language Models (LLMs) are rapidly becoming commodity components of larger software systems. This poses natural security and privacy problems: poisoned data retrieved from one component can change the model's behavior and compromise the entire system, including coercing the model to spread confidential data to untrusted components. One promising approach is to tackle this problem at the system level via dynamic information flow (aka taint) tracking. Unfortunately, this approach of propagating the most restrictive input label to the output is too conservative for applications where LLMs operate on inputs retrieved from diverse sources. In this paper, we propose a novel, more permissive approach to propagate information flow labels through LLM queries. The key idea behind our approach is to propagate only the labels of the samples that were influential in generating the model output and to eliminate the labels of unnecessary inputs. We implement and investigate the effectiveness of two variations of this approach, based on (i) prompt-based retrieval augmentation, and (ii) a $k$-nearest-neighbors language model. We compare these with a baseline that uses introspection to predict the output label. Our experimental results in an LLM agent setting show that the permissive label propagator improves over the baseline in more than 85% of the cases, which underscores the practicality of our approach.
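The core contrast in the abstract, conservative versus permissive label propagation, can be sketched over a simple taint lattice. This is an illustrative model only: the labels, the `influential` oracle, and the document names are assumptions, not the paper's implementation.

```python
# Toy label lattice: a label is a frozenset of source tags ("taints");
# join is set union, and "more restrictive" means a larger set.
Label = frozenset

def join(labels):
    """Least upper bound (union) of a collection of labels."""
    return frozenset().union(*labels)

def conservative_propagate(inputs):
    """Baseline taint tracking: the output inherits the join of ALL
    input labels, however little each input mattered."""
    return join(label for _, label in inputs)

def permissive_propagate(inputs, influential):
    """Permissive propagation (sketch): keep only the labels of inputs
    judged influential for this output. `influential` is a hypothetical
    oracle, e.g. backed by retrieval attribution or a k-NN LM."""
    return join(label for sample, label in inputs if influential(sample))

# Hypothetical scenario: three retrieved documents feed one LLM query,
# but only doc_c actually shaped the answer.
inputs = [
    ("doc_a", Label({"untrusted"})),
    ("doc_b", Label({"confidential"})),
    ("doc_c", Label({"public"})),
]

print(conservative_propagate(inputs))  # union of all three tags
print(permissive_propagate(inputs, lambda s: s == "doc_c"))  # only "public"
```

Under the conservative rule the output carries every input's taint, so one untrusted or confidential source restricts the whole result; the permissive rule drops labels of inputs that had no causal effect on the output.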
Problem

Research questions and friction points this paper is trying to address.

Prevent LLM data poisoning and privacy leaks
Dynamic info flow tracking is overly restrictive
Need for permissive label propagation over diverse LLM inputs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic permissive label propagation for LLMs
Influential sample-based label filtering
Prompt-based and k-NN language model implementations
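The k-NN language-model variant listed above can be sketched as an influence test: an input sample counts as influential if its datastore entries rank among the nearest neighbors of the query context. The embedding vectors, datastore layout, and neighbor threshold below are illustrative assumptions, not the paper's actual system.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def knn_influential(query_vec, datastore, k=2):
    """Sketch of k-NN-based influence filtering: return the samples
    whose datastore entries are among the k nearest neighbors of the
    query context; only their labels would then be propagated."""
    ranked = sorted(datastore, key=lambda e: -cosine(query_vec, e["vec"]))
    return {entry["sample"] for entry in ranked[:k]}

# Hypothetical 2-D "embeddings" for three input documents.
datastore = [
    {"sample": "doc_a", "vec": [1.0, 0.0]},
    {"sample": "doc_b", "vec": [0.0, 1.0]},
    {"sample": "doc_c", "vec": [0.9, 0.1]},
]

# doc_a and doc_c are nearest to the query; doc_b's label is dropped.
print(knn_influential([1.0, 0.0], datastore, k=2))
```

The prompt-based retrieval variant would play the same role with a different oracle: the samples returned by the retrieval step, rather than nearest neighbors in a datastore, determine which labels survive.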