MultiPriv: Benchmarking Individual-Level Privacy Reasoning in Vision-Language Models

📅 2025-11-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing privacy benchmarks focus solely on attribute recognition and fail to capture the emerging risk of individual-level privacy inference by vision-language models (VLMs) via cross-modal information association. Method: We introduce MultiPriv—the first systematic benchmark for evaluating VLMs’ privacy inference capabilities—comprising nine tasks spanning attribute identification, cross-image re-identification, and multi-step inference. We propose the Privacy Perception and Reasoning (PPR) framework and curate a bilingual, multimodal dataset featuring synthetic personal profiles to enable fine-grained, end-to-end evaluation. Contribution/Results: Empirical evaluation of 50+ models reveals that conventional perception-based metrics cannot predict inference-level privacy risks, and mainstream safety alignment techniques offer limited protection against such attacks. This work establishes a new standard and empirical foundation for quantifying privacy risks and enhancing the security of VLMs.

📝 Abstract
Modern Vision-Language Models (VLMs) demonstrate sophisticated reasoning, escalating privacy risks beyond simple attribute perception to individual-level linkage. Current privacy benchmarks are structurally insufficient for this new threat, as they primarily evaluate privacy perception while failing to address the more critical risk of privacy reasoning: a VLM's ability to infer and link distributed information to construct individual profiles. To address this critical gap, we propose MultiPriv, the first benchmark designed to systematically evaluate individual-level privacy reasoning in VLMs. We introduce the Privacy Perception and Reasoning (PPR) framework and construct a novel, bilingual multimodal dataset to support it. The dataset uniquely features a core component of synthetic individual profiles where identifiers (e.g., faces, names) are meticulously linked to sensitive attributes. This design enables nine challenging tasks evaluating the full PPR spectrum, from attribute detection to cross-image re-identification and chained inference. We conduct a large-scale evaluation of over 50 foundational and commercial VLMs. Our analysis reveals: (1) Many VLMs possess significant, unmeasured reasoning-based privacy risks. (2) Perception-level metrics are poor predictors of these reasoning risks, revealing a critical evaluation gap. (3) Existing safety alignments are inconsistent and ineffective against such reasoning-based attacks. MultiPriv exposes systemic vulnerabilities and provides the necessary framework for developing robust, privacy-preserving VLMs.
Problem

Research questions and friction points this paper is trying to address.

Evaluating individual-level privacy reasoning risks in Vision-Language Models
Addressing limitations of current benchmarks in measuring privacy inference capabilities
Assessing VLMs' ability to link distributed information for profile reconstruction
Innovation

Methods, ideas, or system contributions that make the work stand out.

MultiPriv, the first benchmark to systematically evaluate individual-level privacy reasoning in VLMs
PPR framework, supported by a bilingual dataset of synthetic individual profiles with identifiers linked to sensitive attributes
Nine tasks spanning attribute detection, cross-image re-identification, and chained inference
🔎 Similar Papers
2024-05-27 · arXiv.org · Citations: 1
Authors: Xiongtao Sun (Xidian University; Nanyang Technological University), Hui Li (Xidian University), Jiaming Zhang (Nanyang Technological University), Yujie Yang (Xidian University), Kaili Liu (Xidian University), Ruxin Feng (Xidian University), Wen Jun Tan (Nanyang Technological University), Wei Yang Bryan Lim (Nanyang Technological University)