Membership Inference Attacks Against Video Large Language Models

📅 2026-04-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates whether an external auditor with only black-box access can determine whether a given video was part of the training data of a video large language model (VideoLLM), as a way to assess privacy risk. The authors propose a membership inference attack that combines temperature-perturbed generation, semantic drift measurement, and video perceptual difficulty features in a single black-box framework tailored to VideoLLMs. Evaluated on LLaVA-Video-7B-Qwen2-Video-Only, the attack achieves an AUC of 0.68 and an accuracy of 0.63. These results indicate that current VideoLLMs can leak information about training-data membership even to auditors who see only generated text.
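For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how AUC and accuracy are typically computed from membership scores. The function names, the toy scores, and the 0.5 threshold are illustrative assumptions, not taken from the paper.

```python
from typing import List


def auc(scores: List[float], labels: List[int]) -> float:
    """Rank-based AUC: the probability that a random member (label 1)
    receives a higher score than a random non-member (label 0).
    Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one member and one non-member")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(scores: List[float], labels: List[int], threshold: float) -> float:
    """Fraction of samples classified correctly when a score at or above
    the threshold is predicted as 'member'."""
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for p, y in zip(preds, labels) if p == y)
    return correct / len(labels)


# Toy data (hypothetical): higher score means "more likely a member".
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
toy_auc = auc(toy_scores, toy_labels)
toy_acc = accuracy(toy_scores, toy_labels, threshold=0.5)
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, so 0.68 sits between the two: above chance, but far from perfect.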

📝 Abstract

Video large language models (VideoLLMs) are increasingly trained or instruction-tuned on large-scale video-text corpora collected from heterogeneous sources, raising an immediate privacy question: can an external auditor determine whether a particular video was used during training? While membership inference attacks (MIAs) have been studied extensively for classifiers and, more recently, for text and image generation models, the VideoLLM setting remains unexplored. This setting is challenging because black-box auditors observe only generated text, whereas the membership signal is entangled with video-specific factors such as motion complexity and temporal span. In this paper, we present a black-box MIA targeting VideoLLMs that couples temperature-perturbed generation with video-aware difficulty features. Our key intuition is that member samples tend to induce sharper, more brittle generation behavior across decoding temperatures, and that this signal should be interpreted jointly with the intrinsic difficulty of the queried video. Concretely, we query the target model at low and high temperatures and measure the semantic drift between the resulting texts. We evaluate the attack against LLaVA-Video-7B-Qwen2-Video-Only and achieve a membership inference AUC of 0.68 and accuracy of 0.63. These results demonstrate that VideoLLMs are vulnerable to black-box membership inference attacks, highlighting an urgent need for the community to systematically evaluate and mitigate privacy risks in VideoLLMs.
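The query-and-compare step described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: `generate` is a hypothetical callable wrapping black-box access to the target model, the temperatures 0.1 and 1.5 are placeholders, the difficulty feature names are invented for the example, and token-set Jaccard distance stands in for whatever semantic drift measure the paper actually uses.

```python
from typing import Callable, Dict, List

# Hypothetical black-box interface: (video_id, prompt, temperature) -> text.
Generator = Callable[[str, str, float], str]


def drift(text_a: str, text_b: str) -> float:
    """Stand-in drift measure (an assumption): Jaccard distance between
    lowercase token sets. 0.0 means identical vocabulary, 1.0 disjoint."""
    a = set(text_a.lower().split())
    b = set(text_b.lower().split())
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def membership_features(
    generate: Generator,
    video_id: str,
    prompt: str,
    difficulty: Dict[str, float],
    low_t: float = 0.1,   # placeholder low temperature
    high_t: float = 1.5,  # placeholder high temperature
) -> List[float]:
    """Query the model at two temperatures and return the drift between
    the two outputs followed by the difficulty features in key order."""
    low_text = generate(video_id, prompt, low_t)
    high_text = generate(video_id, prompt, high_t)
    return [drift(low_text, high_text)] + [difficulty[k] for k in sorted(difficulty)]


def fake_generate(video_id: str, prompt: str, temperature: float) -> str:
    """Deterministic stub so the sketch runs without a real model."""
    if temperature < 1.0:
        return "a man rides a bike"
    return "a man rides a horse"


features = membership_features(
    fake_generate,
    "video_001",
    "Describe the video.",
    {"motion_complexity": 0.4, "temporal_span": 12.0},  # hypothetical features
)
```

The resulting feature vector would then be passed to an attack classifier or thresholding rule. The abstract does not say which one is used, so that step is left out here.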

Problem

Research questions and friction points this paper is trying to address.

Membership Inference Attacks

Video Large Language Models

Privacy

Black-box Auditing

Training Data Membership

Innovation

Methods, ideas, or system contributions that make the work stand out.

Membership Inference Attack

Video Large Language Models

Black-box Attack