🤖 AI Summary
Harmful outputs from generative AI undermine its trustworthiness and hinder its societal deployment. To address this, we propose a public-facing, multi-model collaborative auditing framework, implemented as a web platform, that enables non-expert users to concurrently evaluate outputs from multiple text-to-image models, drawing on their personal identities and lived experiences to surface implicit biases and harmful content. Our contribution is the design of a comparative multi-model auditing interface and a structured feedback mechanism, integrating heterogeneous model APIs, interactive visual comparison modules, and context-aware auditing forms. A preliminary user study (n=5) suggests that reviewing multiple models side by side helps users surface harmful biases, particularly along gender, racial, and situational stereotyping dimensions, that go unnoticed when reviewing a single model. This work points toward a democratized, interpretable, and participatory paradigm for evaluating generative AI systems.
📝 Abstract
While generative AI systems have gained popularity across diverse applications, their potential to produce harmful outputs limits their trustworthiness and usability. Recent years have seen growing interest in engaging diverse AI users in auditing generative AI systems that might impact their lives. To this end, we propose MIRAGE, a web-based tool that lets AI users compare outputs from multiple text-to-image (T2I) models by auditing the AI-generated images and reporting their findings in a structured way. We used MIRAGE to conduct a preliminary user study with five participants and found that, when reviewing multiple T2I models' outputs rather than only one, users could leverage their own lived experiences and identities to surface previously unnoticed details around harmful biases.
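For concreteness, below is a minimal sketch of the comparison flow that MIRAGE-style auditing implies: the same prompt is sent to several T2I backends, and each generated image can be paired with a structured audit report. The endpoint URLs, JSON fields, and `AuditReport` schema here are illustrative assumptions, not the authors' actual implementation.

```python
# Sketch of a multi-model T2I comparison-and-audit flow (hypothetical APIs).
# Endpoint URLs, the {"prompt": ...} payload, the "image_url" response field,
# and the AuditReport fields are assumptions for illustration only.
import dataclasses
import requests

# Hypothetical T2I backends audited side by side.
MODEL_ENDPOINTS = {
    "model_a": "https://example.com/t2i/model-a/generate",
    "model_b": "https://example.com/t2i/model-b/generate",
    "model_c": "https://example.com/t2i/model-c/generate",
}


@dataclasses.dataclass
class AuditReport:
    """Structured feedback a user files after comparing generated images."""
    prompt: str
    model_id: str
    image_url: str
    harm_category: str  # e.g. "gender stereotype", "racial stereotype"
    description: str    # the auditor's account, grounded in lived experience


def generate_all(prompt: str) -> dict[str, str]:
    """Send the same prompt to every backend; return model_id -> image URL."""
    results = {}
    for model_id, endpoint in MODEL_ENDPOINTS.items():
        resp = requests.post(endpoint, json={"prompt": prompt}, timeout=60)
        resp.raise_for_status()
        results[model_id] = resp.json()["image_url"]
    return results


if __name__ == "__main__":
    # Generate once per model, then let the user attach a structured report
    # to whichever output they flag during side-by-side review.
    images = generate_all("a photo of a CEO")
    report = AuditReport(
        prompt="a photo of a CEO",
        model_id="model_a",
        image_url=images["model_a"],
        harm_category="gender stereotype",
        description="All generated CEOs are men in suits.",
    )
    print(report)
```

The side-by-side structure is the point of the design: holding the prompt fixed across models makes biases that appear in one model's output but not another's easier for non-expert auditors to notice and describe.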