Permissive Information-Flow Analysis for Large Language Models

📅 2024-10-04
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
To address the low precision and poor practicality of information-flow analysis in LLM-based systems, caused by overly conservative propagation of input data labels, this paper proposes an influence-driven, permissive information-flow labeling mechanism: it propagates only the labels of inputs that actually influenced the model output, pruning redundant propagation paths. The work is the first to introduce influence analysis into LLM information-flow control, departing from the traditional fully conservative label-propagation paradigm. The authors design two permissive propagators, one based on prompt-based retrieval augmentation and one on a k-nearest-neighbors language model, and compare them against a baseline that uses introspection to predict the output label. Experiments in an LLM agent setting show that the permissive propagators improve over the baseline in more than 85% of cases, achieving a favorable trade-off among security guarantees, labeling precision, and deployment practicality.

📝 Abstract
Large Language Models (LLMs) are rapidly becoming commodity components of larger software systems. This poses natural security and privacy problems: poisoned data retrieved from one component can change the model's behavior and compromise the entire system, including coercing the model to spread confidential data to untrusted components. One promising approach is to tackle this problem at the system level via dynamic information flow (aka taint) tracking. Unfortunately, this approach of propagating the most restrictive input label to the output is too conservative for applications where LLMs operate on inputs retrieved from diverse sources. In this paper, we propose a novel, more permissive approach to propagate information flow labels through LLM queries. The key idea behind our approach is to propagate only the labels of the samples that were influential in generating the model output and to eliminate the labels of unnecessary inputs. We implement and investigate the effectiveness of two variations of this approach, based on (i) prompt-based retrieval augmentation, and (ii) a $k$-nearest-neighbors language model. We compare these with a baseline that uses introspection to predict the output label. Our experimental results in an LLM agent setting show that the permissive label propagator improves over the baseline in more than 85% of the cases, which underscores the practicality of our approach.
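The core contrast in the abstract, conservative versus permissive label propagation, can be sketched over a simple taint lattice. This is an illustrative model only: the labels, the `influential` oracle, and the document names are assumptions, not the paper's implementation.

```python
# Toy label lattice: a label is a frozenset of source tags ("taints");
# join is set union, and "more restrictive" means a larger set.
Label = frozenset

def join(labels):
    """Least upper bound (union) of a collection of labels."""
    return frozenset().union(*labels)

def conservative_propagate(inputs):
    """Baseline taint tracking: the output inherits the join of ALL
    input labels, however little each input mattered."""
    return join(label for _, label in inputs)

def permissive_propagate(inputs, influential):
    """Permissive propagation (sketch): keep only the labels of inputs
    judged influential for this output. `influential` is a hypothetical
    oracle, e.g. backed by retrieval attribution or a k-NN LM."""
    return join(label for sample, label in inputs if influential(sample))

# Hypothetical scenario: three retrieved documents feed one LLM query,
# but only doc_c actually shaped the answer.
inputs = [
    ("doc_a", Label({"untrusted"})),
    ("doc_b", Label({"confidential"})),
    ("doc_c", Label({"public"})),
]

print(conservative_propagate(inputs))  # union of all three tags
print(permissive_propagate(inputs, lambda s: s == "doc_c"))  # only "public"
```

Under the conservative rule the output carries every input's taint, so one untrusted or confidential source restricts the whole result; the permissive rule drops labels of inputs that had no causal effect on the output.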
Problem

Research questions and friction points this paper is trying to address.

Prevent LLM data poisoning and privacy leaks
Dynamic info flow tracking is overly restrictive
Need for permissive label propagation over diverse LLM inputs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic permissive label propagation for LLMs
Influential sample-based label filtering
Prompt-based and k-NN language model implementations
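The k-NN language-model variant listed above can be sketched as an influence test: an input sample counts as influential if its datastore entries rank among the nearest neighbors of the query context. The embedding vectors, datastore layout, and neighbor threshold below are illustrative assumptions, not the paper's actual system.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def knn_influential(query_vec, datastore, k=2):
    """Sketch of k-NN-based influence filtering: return the samples
    whose datastore entries are among the k nearest neighbors of the
    query context; only their labels would then be propagated."""
    ranked = sorted(datastore, key=lambda e: -cosine(query_vec, e["vec"]))
    return {entry["sample"] for entry in ranked[:k]}

# Hypothetical 2-D "embeddings" for three input documents.
datastore = [
    {"sample": "doc_a", "vec": [1.0, 0.0]},
    {"sample": "doc_b", "vec": [0.0, 1.0]},
    {"sample": "doc_c", "vec": [0.9, 0.1]},
]

# doc_a and doc_c are nearest to the query; doc_b's label is dropped.
print(knn_influential([1.0, 0.0], datastore, k=2))
```

The prompt-based retrieval variant would play the same role with a different oracle: the samples returned by the retrieval step, rather than nearest neighbors in a datastore, determine which labels survive.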