FailureAtlas: Mapping the Failure Landscape of T2I Models via Active Exploration

📅 2025-09-26
🤖 AI Summary
Static benchmarks give limited support for systematically diagnosing the root causes of failures in text-to-image (T2I) models. To address this, we propose an active exploration paradigm and introduce the first scalable, automated framework for discovering and mapping T2I failure modes. The core methodological idea is to model failure discovery as a structured search for minimal failure-inducing concepts, which puts diagnosis first. We combine novel acceleration techniques with active exploration strategies to identify error slices efficiently, uncovering over 247,000 previously unknown error slices in Stable Diffusion 1.5 alone. The framework also provides the first large-scale empirical evidence linking model failures to training data scarcity, revealing a systematic correlation between data insufficiency and erroneous generations across diverse semantic concepts. This work lays a foundation for data-aware, interpretable diagnosis and improvement of T2I models.

πŸ“ Abstract
Static benchmarks have provided a valuable foundation for comparing Text-to-Image (T2I) models. However, their passive design offers limited diagnostic power, struggling to uncover the full landscape of systematic failures or isolate their root causes. We argue for a complementary paradigm: active exploration. We introduce FailureAtlas, the first framework designed to autonomously explore and map the vast failure landscape of T2I models at scale. FailureAtlas frames error discovery as a structured search for minimal, failure-inducing concepts. While it is a computationally explosive problem, we make it tractable with novel acceleration techniques. When applied to Stable Diffusion models, our method uncovers hundreds of thousands of previously unknown error slices (over 247,000 in SD1.5 alone) and provides the first large-scale evidence linking these failures to data scarcity in the training set. By providing a principled and scalable engine for deep model auditing, FailureAtlas establishes a new, diagnostic-first methodology to guide the development of more robust generative AI. The code is available at https://github.com/cure-lab/FailureAtlas
Problem

Research questions and friction points this paper is trying to address.

Mapping systematic failures in Text-to-Image models
Identifying root causes through active exploration
Linking failures to training data scarcity
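
The last point, linking failures to data scarcity, amounts to checking whether rarer concepts fail more often. The page does not say which statistic the authors use, so the following is only a minimal sketch that assumes a Spearman rank correlation between a concept's training-set frequency and its observed failure rate. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
from typing import List


def ranks(values: List[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs: List[float], ys: List[float]) -> float:
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Toy, made-up numbers for illustration only (not results from the paper).
    training_counts = [120.0, 4500.0, 98000.0]
    failure_rates = [0.90, 0.40, 0.05]
    print(spearman(training_counts, failure_rates))
```

A strongly negative value would be consistent with the scarcity hypothesis (fewer training examples, more failures), although a correlation by itself does not establish a cause.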
Innovation

Methods, ideas, or system contributions that make the work stand out.

Active exploration framework for T2I failure analysis
Structured search for minimal failure-inducing concepts
Novel acceleration techniques for tractable error discovery
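
To make "structured search for minimal failure-inducing concepts" concrete, here is a small illustrative sketch. It is not the FailureAtlas algorithm, and it includes none of the paper's acceleration techniques, which this page does not describe. It assumes a hypothetical oracle `fails(concepts)` that reports whether prompts combining the given concepts produce erroneous images, and it enumerates concept combinations from small to large, skipping any combination that already contains a known failing set.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, List


def minimal_failing_sets(
    concepts: Iterable[str],
    fails: Callable[[FrozenSet[str]], bool],
    max_size: int = 3,
) -> List[FrozenSet[str]]:
    """Return failing concept sets that contain no smaller failing subset.

    `fails` is a hypothetical oracle; in a real audit it would generate
    images for the concept combination and score them for errors.
    """
    found: List[FrozenSet[str]] = []
    pool = sorted(set(concepts))
    # Visit smaller combinations first, so every failing set discovered
    # later is guaranteed to have no failing proper subset.
    for size in range(1, max_size + 1):
        for combo in combinations(pool, size):
            candidate = frozenset(combo)
            # A superset of a known failing set cannot be minimal,
            # so skip it without calling the (expensive) oracle.
            if any(known <= candidate for known in found):
                continue
            if fails(candidate):
                found.append(candidate)
    return found


def toy_fails(concepts: FrozenSet[str]) -> bool:
    """Made-up failure rule used only to exercise the search."""
    return "clock" in concepts or {"red", "cube"} <= concepts


if __name__ == "__main__":
    for failing in minimal_failing_sets(["red", "cube", "clock", "dog"], toy_fails):
        print(sorted(failing))
```

The number of candidate combinations grows combinatorially with the vocabulary size and with `max_size`, which is the "computationally explosive" aspect the abstract mentions. The subset check above is only the most basic form of pruning.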
Authors
Muxi Chen
The Chinese University of Hong Kong
Zhaohua Zhang
Dalian University of Technology
Chenchen Zhao
The Chinese University of Hong Kong
Mingyang Chen
Baichuan Inc., Zhejiang University, The University of Edinburgh
Large Language Model, Reinforcement Learning, Knowledge Graph
Wenyu Jiang
Nanjing University
Tianwen Jiang
Harbin Institute of Technology
Knowledge Graph, Information Extraction, Natural Language Processing
Jianhuan Zhuo
Institute of Information Engineering, Chinese Academy of Sciences
Representation Learning, Recommendation System
Yu Tang
Tencent
Qiuyong Xiao
Tencent
Jihong Zhang
Tencent
Qiang Xu
The Chinese University of Hong Kong