SlideAudit: A Dataset and Taxonomy for Automated Evaluation of Presentation Slides

📅 2025-08-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the automated assessment of slide design flaws. Working with design experts, we develop a systematic taxonomy of slide design flaws and build a dataset of 2,400 slides, fully annotated against that taxonomy by trained crowdworkers recruited through Prolific. Using the taxonomy, we design prompting strategies and compare multimodal large language models (MLLMs) against each other and against an existing design critique pipeline. Results show that current MLLMs have limited flaw detection capability (F1 = 0.331–0.655), with taxonomy-based prompting achieving the highest performance. In a follow-up remediation study, 82.0% of slides showed significant improvement, and 87.8% of those improved more when the taxonomy was used. The main contributions are the taxonomy, the annotated dataset, and evidence that taxonomy-guided prompting helps models both detect and remediate slide design flaws.

📝 Abstract

Automated evaluation of specific graphic designs, such as presentation slides, is an open problem. We present SlideAudit, a dataset for automated slide evaluation. We collaborated with design experts to develop a thorough taxonomy of slide design flaws. Our dataset comprises 2,400 slides collected and synthesized from multiple sources, including a subset intentionally modified to contain specific design problems. We then fully annotated the slides against our taxonomy using rigorously trained crowdworkers recruited from Prolific. To evaluate whether AI can identify design flaws, we compared multiple large language models under different prompting strategies, as well as an existing design critique pipeline. We show that AI models struggle to identify slide design flaws accurately, with F1 scores ranging from 0.331 to 0.655. Notably, prompting techniques that leverage our taxonomy achieved the highest performance. We further conducted a remediation study to assess AI's potential for improving slides. Of the 82.0% of slides that showed significant improvement, 87.8% improved more when our taxonomy was used, further demonstrating its utility.
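To make the reported F1 range concrete, the following is a minimal sketch of how flaw detection could be scored when each slide carries a set of flaw labels. It assumes micro-averaging over (slide, label) pairs; the abstract does not state which averaging scheme the paper uses. The function name `micro_f1`, the slide IDs, and the label names (`low_contrast`, `text_overflow`, and so on) are all hypothetical and are not taken from the SlideAudit taxonomy.

```python
from typing import Dict, Set


def micro_f1(gold: Dict[str, Set[str]], pred: Dict[str, Set[str]]) -> float:
    """Micro-averaged F1 over (slide, flaw-label) pairs.

    Assumption: only slides present in `gold` are scored; a slide missing
    from `pred` is treated as having no predicted flaws.
    """
    tp = fp = fn = 0
    for slide_id, gold_labels in gold.items():
        pred_labels = pred.get(slide_id, set())
        tp += len(gold_labels & pred_labels)   # flaws correctly flagged
        fp += len(pred_labels - gold_labels)   # flaws flagged but not annotated
        fn += len(gold_labels - pred_labels)   # annotated flaws that were missed
    if tp == 0:
        return 0.0  # avoids division by zero when nothing was matched
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations and model predictions for three slides.
gold = {"s1": {"low_contrast", "text_overflow"}, "s2": {"misalignment"}, "s3": set()}
pred = {"s1": {"low_contrast"}, "s2": {"misalignment", "clutter"}, "s3": {"clutter"}}

score = micro_f1(gold, pred)
print(f"micro-F1 = {score:.3f}")
```

In this toy example the model matches two annotated flaws, raises two false alarms, and misses one flaw, so precision is 0.5 and recall is about 0.667. A score in the paper's reported range therefore corresponds to a model that still misses or over-reports a substantial share of flaws.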

Problem

Research questions and friction points this paper is trying to address.

Automated evaluation of presentation slide design flaws

Developing a taxonomy for slide design flaw identification

Assessing AI models' capability to improve slide designs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed the SlideAudit dataset for automated slide evaluation

Collaborated with design experts to create a taxonomy of slide design flaws

Evaluated AI models using taxonomy-based prompting strategies
