OmniGuard: Unified Omni-Modal Guardrails with Deliberate Reasoning

📅 2025-12-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Existing safety mitigation methods are largely unimodal and built on binary classification, which leaves them insufficiently robust for Omni-modal Large Language Models (OLLMs) that face safety alignment challenges across text, image, video, and audio. Method: We propose OmniGuard, the first omni-modal safety framework with deliberate reasoning. It introduces a unified cross-modal safety representation that combines structured safety labels with safety annotations distilled from expert models, enabling fine-grained, interpretable risk identification. The framework integrates multimodal classification, cross-modal alignment, and targeted knowledge distillation, and is trained on a large-scale omni-modal safety dataset. Contribution/Results: Extensive evaluation across 15 benchmarks demonstrates improvements in the unification, generalizability, and reliability of safety mitigation, making this the first framework to address omni-modal safety alignment holistically with interpretable, reasoning-aware mechanisms.

📝 Abstract

Omni-modal Large Language Models (OLLMs) that process text, images, videos, and audio introduce new challenges for safety and value guardrails in human-AI interaction. Prior guardrail research largely targets unimodal settings and typically frames safeguarding as binary classification, which limits robustness across diverse modalities and tasks. To address this gap, we propose OmniGuard, the first family of omni-modal guardrails that performs safeguarding across all modalities with deliberate reasoning ability. To support the training of OmniGuard, we curate a large, comprehensive omni-modal safety dataset comprising over 210K diverse samples, with inputs that cover all modalities through both unimodal and cross-modal samples. Each sample is annotated with structured safety labels and carefully curated safety critiques from expert models through targeted distillation. Extensive experiments on 15 benchmarks show that OmniGuard achieves strong effectiveness and generalization across a wide range of multimodal safety scenarios. Importantly, OmniGuard provides a unified framework that enforces policies and mitigates risks across all modalities, paving the way toward more robust and capable omni-modal safeguarding systems.

Problem

Research questions and friction points this paper is trying to address.

Develops omni-modal safety guardrails for multimodal AI

Addresses limitations of unimodal binary classification approaches

Enables robust safeguarding across diverse modalities and tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified omni-modal guardrails with deliberate reasoning

Large curated dataset with structured safety labels

Generalization across diverse multimodal safety scenarios
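
To make the idea of a structured annotation concrete, the following is a minimal sketch of what one dataset sample could look like, compared with the binary target used by classifier-style guardrails. It is not the paper's actual schema: the class names (SafetyLabel, GuardrailSample), the field names, the category string, and the file name are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Modality(Enum):
    TEXT = "text"
    IMAGE = "image"
    VIDEO = "video"
    AUDIO = "audio"


@dataclass
class SafetyLabel:
    # Hypothetical structured label; field names are illustrative assumptions,
    # not the schema used by the paper.
    is_safe: bool
    categories: List[str] = field(default_factory=list)


@dataclass
class GuardrailSample:
    # Maps each modality present in the input to a payload or file path.
    inputs: Dict[Modality, str]
    label: SafetyLabel
    # Free-text critique, standing in for one distilled from an expert model.
    critique: str

    @property
    def is_cross_modal(self) -> bool:
        # A sample is cross-modal when it carries more than one modality.
        return len(self.inputs) > 1


def binary_view(sample: GuardrailSample) -> int:
    """Collapse a structured annotation to a binary target (1 = unsafe)."""
    return 0 if sample.label.is_safe else 1


# Hypothetical cross-modal sample: a text prompt paired with an image.
sample = GuardrailSample(
    inputs={
        Modality.TEXT: "How do I do what is shown here?",
        Modality.IMAGE: "example.png",
    },
    label=SafetyLabel(is_safe=False, categories=["hypothetical_category"]),
    critique="Placeholder critique explaining why the request is unsafe.",
)
```

The binary_view helper shows what is lost when safeguarding is framed as binary classification: the categories and the critique are discarded, leaving only a single safe/unsafe bit.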

🔎 Similar Papers

Swiss Cheese Model for AI Safety: A Taxonomy and Reference Architecture for Multi-Layered Guardrails of Foundation Model Based Agents