🤖 AI Summary
Vision-language models (VLMs) exhibit poor recognition accuracy for visual privacy content, such as passports and fingerprints, and existing evaluation datasets suffer from inconsistent labeling, hindering rigorous privacy-safety assessment and optimization.
Method: We introduce PrivBench, the first benchmark dedicated to visual privacy understanding, and construct PrivTune, a lightweight instruction-tuning dataset. Leveraging models like TinyLLaVa and MiniGPT-v2, we propose a privacy-aware few-shot instruction-tuning paradigm that preserves general vision-language capabilities (e.g., VQA) while enhancing privacy-sensitive image recognition.
Contribution/Results: Our approach achieves state-of-the-art performance on PrivBench, surpassing GPT-4V, without degrading general-purpose capabilities. This work establishes the first systematic evaluation standard and an efficient adaptation framework for visual privacy in VLMs, providing foundational support for privacy-aware VLM research.
📝 Abstract
This paper aims to advance our understanding of how Visual Language Models (VLMs) handle privacy-sensitive information, a crucial concern as these technologies become integral to everyday life. To this end, we introduce a new benchmark, PrivBench, which contains images from 8 sensitive categories such as passports or fingerprints. We evaluate 10 state-of-the-art VLMs on this benchmark and observe a generally limited understanding of privacy, highlighting a significant area for model improvement. Based on this, we introduce PrivTune, a new instruction-tuning dataset aimed at equipping VLMs with knowledge about visual privacy. By tuning two pretrained VLMs, TinyLLaVa and MiniGPT-v2, on this small dataset, we achieve strong gains in their ability to recognize sensitive content, outperforming even GPT-4V. At the same time, we show that privacy-tuning only minimally affects the VLMs' performance on standard benchmarks such as VQA. Overall, this paper lays out a crucial challenge for making VLMs effective in handling real-world data safely and provides a simple recipe that takes the first step towards building privacy-aware VLMs.
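Benchmarking recognition accuracy across sensitive categories, as described above, reduces to a per-category scoring loop. A minimal sketch of such scoring; the record format and category names below are illustrative assumptions, not PrivBench's actual schema:

```python
from collections import defaultdict

def per_category_accuracy(records):
    """Aggregate recognition accuracy for each sensitive category.

    records: iterable of (category, is_correct) pairs, where is_correct
    indicates whether the VLM correctly recognized the sensitive content.
    (Hypothetical format -- the real benchmark schema may differ.)
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for category, is_correct in records:
        total[category] += 1
        correct[category] += int(is_correct)
    return {c: correct[c] / total[c] for c in total}

# Toy example with two of the eight sensitive categories.
records = [
    ("passport", True), ("passport", False),
    ("fingerprint", True), ("fingerprint", True),
]
print(per_category_accuracy(records))
# → {'passport': 0.5, 'fingerprint': 1.0}
```

Reporting accuracy per category rather than a single aggregate makes it visible which kinds of sensitive content a model handles worst, which is what motivates targeted tuning data such as PrivTune.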