UniSAFE: A Comprehensive Benchmark for Safety Evaluation of Unified Multimodal Models

📅 2026-03-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing safety benchmarks are fragmented across tasks and modalities, which makes it difficult to assess the safety risks of unified multimodal models (UMMs) systematically. To address this gap, this work proposes UniSAFE, the first system-level safety benchmark for UMMs. It uses a shared-target design to project common risk scenarios across seven input–output modality combinations. The benchmark comprises 6,802 human-reviewed test instances and explicitly covers high-risk settings such as multi-image composition and multi-turn interaction. Evaluations of 15 prominent UMMs show that violation rates are consistently higher in image-output tasks than in text-output tasks, and that safety violations are elevated in multi-image and multi-turn settings.

📝 Abstract

Unified Multimodal Models (UMMs) offer powerful cross-modality capabilities but introduce new safety risks not observed in single-task models. Despite the emergence of these models, existing safety benchmarks remain fragmented across tasks and modalities, limiting the comprehensive evaluation of complex system-level vulnerabilities. To address this gap, we introduce UniSAFE, the first comprehensive benchmark for system-level safety evaluation of UMMs across 7 I/O modality combinations, spanning conventional tasks and novel multimodal-context image generation settings. UniSAFE is built with a shared-target design that projects common risk scenarios across task-specific I/O configurations, enabling controlled cross-task comparisons of safety failures. UniSAFE comprises 6,802 curated instances, and we use it to evaluate 15 state-of-the-art UMMs, both proprietary and open-source. Our results reveal critical vulnerabilities across current UMMs, including elevated safety violations in multi-image composition and multi-turn settings, with image-output tasks consistently more vulnerable than text-output tasks. These findings highlight the need for stronger system-level safety alignment for UMMs. Our code and data are publicly available at https://github.com/segyulee/UniSAFE
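The cross-task comparison described in the abstract can be illustrated with a minimal sketch of per-task violation rates. Everything here is an assumption made for illustration: the record layout, the task labels, the target ids, and the `violation_rates` function are hypothetical and do not reflect UniSAFE's actual data schema or metric definition, which are documented in the repository linked above.

```python
from collections import defaultdict


def violation_rates(records):
    """Return {task: fraction of that task's records flagged as violations}.

    Each record is a hypothetical (task, target_id, violated) tuple.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for task, _target_id, violated in records:
        totals[task] += 1
        flagged[task] += int(violated)
    return {task: flagged[task] / totals[task] for task in totals}


# Hypothetical judgments: the same target ids ("t1", "t2") appear under
# each task label, so the per-task rates are compared on matched scenarios.
records = [
    ("text-to-text", "t1", False),
    ("text-to-text", "t2", False),
    ("text-to-image", "t1", True),
    ("text-to-image", "t2", False),
]

rates = violation_rates(records)
for task, rate in sorted(rates.items()):
    print(f"{task}: {rate:.2f}")
```

Reusing the same target ids across task labels mirrors the idea behind a shared-target design: differences in the resulting rates are then attributable to the I/O configuration rather than to a different mix of risk scenarios.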

Problem

Research questions and friction points this paper is trying to address.

Unified Multimodal Models

Safety Evaluation

Multimodal Safety

System-level Vulnerabilities

Cross-modality Risks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified Multimodal Models

Safety Benchmark

System-level Evaluation