🤖 AI Summary
Existing benchmarks for contextual integrity (CI) assessment focus primarily on textual modalities and cannot adequately evaluate how multimodal language agents balance privacy–utility trade-offs while adhering to social norms. To address this gap, this work proposes MPCI-Bench, the first evaluation benchmark for multimodal pairwise contextual integrity. It leverages positive–negative sample pairs derived from the same visual source to systematically assess agent privacy behaviors across three levels: normative judgment (Seed), contextual reasoning (Story), and executable actions (Traces). A Tri-Principle Iterative Refinement pipeline ensures high data quality. Evaluations on MPCI-Bench reveal that prevailing multimodal models commonly exhibit privacy–utility imbalances, with visual modalities posing significantly higher leakage risks than textual ones. The benchmark will be open-sourced to advance research on agent-based CI.
📝 Abstract
As language-model agents evolve from passive chatbots into proactive assistants that handle personal data, evaluating their adherence to social norms, commonly framed through the lens of Contextual Integrity (CI), becomes increasingly critical. However, existing CI benchmarks are largely text-centric and primarily emphasize negative refusal scenarios, overlooking multimodal privacy risks and the fundamental trade-off between privacy and utility. In this paper, we introduce MPCI-Bench, the first Multimodal Pairwise Contextual Integrity benchmark for evaluating privacy behavior in agentic settings. MPCI-Bench consists of paired positive and negative instances derived from the same visual source, instantiated across three tiers: normative Seed judgments, context-rich Story reasoning, and executable agent action Traces. Data quality is ensured through a Tri-Principle Iterative Refinement pipeline. Evaluations of state-of-the-art multimodal models reveal systematic failures to balance privacy and utility, as well as a pronounced modality leakage gap in which sensitive visual information is leaked more frequently than textual information. We will open-source MPCI-Bench to facilitate future research on agentic CI.
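To make the pairing idea concrete, here is a minimal sketch of what a paired positive–negative instance might look like; note that the abstract does not specify MPCI-Bench's actual schema, so every field and function name below is a hypothetical illustration, not the benchmark's real format:

```python
from dataclasses import dataclass

# Hypothetical sketch of a paired contextual-integrity instance.
# The real MPCI-Bench schema is not described in the abstract; field
# names here are illustrative assumptions.
@dataclass
class CIInstance:
    tier: str      # "seed", "story", or "trace" (the three tiers above)
    image_id: str  # identifier of the shared visual source
    context: str   # textual context or instruction given to the agent
    label: str     # "positive" (sharing fits the norm) or "negative" (agent should refuse)

def is_valid_pair(pos: CIInstance, neg: CIInstance) -> bool:
    """A valid pair shares the same visual source and tier but flips the norm label."""
    return (
        pos.image_id == neg.image_id
        and pos.tier == neg.tier
        and pos.label == "positive"
        and neg.label == "negative"
    )

pos = CIInstance("story", "img_001",
                 "Send the prescription photo to the patient's pharmacist.", "positive")
neg = CIInstance("story", "img_001",
                 "Post the prescription photo to a public forum.", "negative")
print(is_valid_pair(pos, neg))  # True
```

Holding the visual source fixed within each pair is what lets the benchmark separate over-refusal (failing the positive case, hurting utility) from leakage (failing the negative case, hurting privacy).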