🤖 AI Summary
Generative text-to-image (T2I) models pose systemic risks—including bias, offensiveness, and misinformation—yet existing auditing methods suffer from poor scalability, insufficient structural exploration, and inadequate coverage of auditing criteria. To address this, we propose the first scalable, systematic auditing framework for T2I models, integrating scene graph parsing, multimodal visual prompt design, LLM-driven intelligent recommendation of unexplored risk directions, and interactive visual analytics. Our framework supports hierarchical modeling and dynamic extension of auditing criteria, enabling structured, reproducible risk exploration. A user study with professional auditors demonstrates that our approach significantly improves organizational efficiency (+42%) and exploration breadth (+58%) across diverse risk criteria, while enhancing the systematicity, interpretability, and reproducibility of the auditing process.
📝 Abstract
Generative text-to-image (T2I) models are known for risks such as bias, offensiveness, and misinformation. Current AI auditing methods face challenges in scalability and thoroughness, and it is even more challenging to enable auditors to explore the auditing space in a structured and effective way. Vipera employs multiple visual cues, including a scene graph, to facilitate sensemaking over image collections and to inspire auditors to explore and hierarchically organize auditing criteria. Additionally, it leverages LLM-powered suggestions to surface unexplored auditing directions. An observational user study demonstrates Vipera's effectiveness in helping auditors organize their analyses while engaging with diverse criteria.
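To make the scene-graph idea concrete, here is a minimal sketch of how a T2I prompt might be decomposed into objects, attributes, and relations that auditors can organize criteria around. All names and the structure below are illustrative assumptions, not Vipera's actual implementation or API.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: a scene graph represents a prompt as labeled
# objects (nodes) connected by relations (edges), giving auditors a
# structured space in which to file and extend auditing criteria.

@dataclass
class Node:
    label: str                                            # e.g. "doctor"
    attributes: list[str] = field(default_factory=list)   # e.g. ["smiling"]

@dataclass
class Edge:
    subject: str
    relation: str                                         # e.g. "talking to"
    obj: str

@dataclass
class SceneGraph:
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: list[Edge] = field(default_factory=list)

    def add_object(self, label: str, attributes: tuple[str, ...] = ()) -> None:
        self.nodes[label] = Node(label, list(attributes))

    def relate(self, subject: str, relation: str, obj: str) -> None:
        self.edges.append(Edge(subject, relation, obj))

# Hand-built graph for the prompt "a smiling doctor talking to a nurse"
g = SceneGraph()
g.add_object("doctor", ("smiling",))
g.add_object("nurse")
g.relate("doctor", "talking to", "nurse")
```

Under this kind of representation, a criterion such as gender bias can be attached to a specific node (e.g. "doctor") rather than to the prompt as a whole, which is what enables hierarchical organization of the auditing space.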