🤖 AI Summary
Current AI auditing tools predominantly focus on model performance evaluation, failing to support end-to-end accountability practices—including harm identification, evidence construction, stakeholder engagement, and intervention advocacy.
Method: We conducted in-depth interviews with 35 practitioners and systematically crawled, cataloged, and coded 435 auditing tools to develop a need–capability mapping framework and an ecosystem gap diagnostic model.
Contribution/Results: Our analysis reveals systematic deficiencies across four critical capabilities: traceability, multi-stakeholder participation, evidentiary chain generation, and intervention support—constituting the first empirical diagnosis of such gaps. Building on these findings, we propose the "AI Accountability Infrastructure" paradigm, which moves beyond narrow, assessment-centric design toward cross-stage coordination and multi-role adaptability. The study delivers an empirically grounded, prioritized roadmap for designing the next generation of accountability-oriented AI auditing tools.
📝 Abstract
Audits are critical mechanisms for identifying the risks and limitations of deployed artificial intelligence (AI) systems. However, the effective execution of AI audits remains incredibly difficult, and practitioners often need to make use of various tools to support their efforts. Drawing on interviews with 35 AI audit practitioners and a landscape analysis of 435 tools, we compare the current ecosystem of AI audit tooling to practitioner needs. While many tools are designed to help set standards and evaluate AI systems, they often fall short in supporting accountability. We outline challenges practitioners faced in their efforts to use AI audit tools and highlight areas for future tool development beyond evaluation -- from harms discovery to advocacy. We conclude that the available resources do not currently support the full scope of AI audit practitioners' needs and recommend that the field move beyond tools for just evaluation and towards more comprehensive infrastructure for AI accountability.