AI-Generated Lecture Slides for Improving Slide Element Detection and Retrieval

📅 2025-06-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Human-annotated data for slide understanding is scarce, which limits the performance of element detection and cross-modal retrieval models. To address this, the paper proposes an LLM-guided synthetic slide generation framework: large language models generate semantically coherent, well-structured slide text, which is then rendered into realistic slides using customizable templates. For evaluation, the authors also build RealSlide, a benchmark of manually annotated real lecture slides with fine-grained annotations. The paper then tests how useful the synthetic data is for few-shot transfer learning. Experiments show that pretraining on the synthetic dataset improves few-shot performance on real slides: mAP increases by 12.3% for object detection and Recall@10 rises by 9.7% for cross-modal retrieval. The code, synthetic dataset, and RealSlide benchmark are publicly available.
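For readers unfamiliar with the retrieval metric mentioned above, here is a minimal sketch of how Recall@K is commonly computed. It assumes exactly one relevant item per query; the function name `recall_at_k` and its arguments are illustrative and are not taken from the paper's codebase.

```python
from typing import Sequence


def recall_at_k(
    ranked_ids: Sequence[Sequence[str]],
    relevant_ids: Sequence[str],
    k: int = 10,
) -> float:
    """Fraction of queries whose single relevant item is in the top-k results.

    Assumption (not from the paper): each query has exactly one relevant item.
    """
    if len(ranked_ids) != len(relevant_ids):
        raise ValueError("ranked_ids and relevant_ids must have the same length")
    if not ranked_ids:
        return 0.0
    # A query counts as a hit when its target appears among the first k results.
    hits = sum(
        1
        for ranked, target in zip(ranked_ids, relevant_ids)
        if target in ranked[:k]
    )
    return hits / len(ranked_ids)
```

With two queries whose targets sit at ranks 2 and 3, this returns 0.5 for `k=2` and 1.0 for `k=3`.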

📝 Abstract
Lecture slide element detection and retrieval are key problems in slide understanding. Training effective models for these tasks often depends on extensive manual annotation. However, annotating large volumes of lecture slides for supervised training is labor-intensive and requires domain expertise. To address this, we propose a large language model (LLM)-guided synthetic lecture slide generation pipeline, SynLecSlideGen, which produces high-quality, coherent, and realistic slides. We also create an evaluation benchmark, RealSlide, by manually annotating 1,050 real lecture slides. To assess the utility of our synthetic slides, we perform few-shot transfer learning on real data using models pre-trained on the synthetic slides. Experimental results show that few-shot transfer learning with pretraining on synthetic slides significantly improves performance compared to training only on real data. This demonstrates that synthetic data can effectively compensate for limited labeled lecture slides. The code and resources of our work are publicly available on our project website: https://synslidegen.github.io/.
Problem

Research questions and friction points this paper is trying to address.

Detecting and retrieving lecture slide elements efficiently
Reducing manual annotation for slide understanding models
Generating synthetic slides to compensate for limited labeled data
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-guided synthetic slide generation pipeline
Few-shot transfer learning with synthetic data
Public benchmark RealSlide for evaluation
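One reason template-based synthetic generation reduces annotation effort is that the layout is known when the slide is created, so element labels and bounding boxes come for free. The sketch below illustrates that general idea only; it is not SynLecSlideGen's implementation. The `TEMPLATE` regions, the 1280x720 canvas, and the names `Annotation` and `layout_slide` are all hypothetical, and the title and bullet strings stand in for LLM-generated content.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


@dataclass
class Annotation:
    label: str  # element class, e.g. "title" or "bullet"
    text: str   # text placed in the element
    box: Box    # where the element sits on the slide


# Hypothetical template: named regions on an assumed 1280x720 canvas.
TEMPLATE: Dict[str, Box] = {
    "title": (64, 40, 1152, 96),
    "body": (64, 168, 1152, 480),
}


def layout_slide(
    title: str,
    bullets: List[str],
    template: Dict[str, Box] = TEMPLATE,
) -> List[Annotation]:
    """Place text into template regions and return the annotations.

    The boxes are ground truth by construction, so no manual labeling is needed.
    """
    annotations = [Annotation("title", title, template["title"])]
    x, y, w, h = template["body"]
    if bullets:
        # Split the body region into equal-height rows, one per bullet.
        row_h = h // len(bullets)
        for i, bullet in enumerate(bullets):
            annotations.append(
                Annotation("bullet", bullet, (x, y + i * row_h, w, row_h))
            )
    return annotations
```

A renderer would draw each annotation's text inside its box, and the same list could then be exported as detection labels for pretraining.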
Suyash Maniyar
Indian Institute of Technology, Jodhpur, India
Vishvesh Trivedi
Sardar Vallabhbhai National Institute of Technology, Surat, India
Ajoy Mondal
CVIT, International Institute of Information Technology, Hyderabad, India
Anand Mishra
IIT Jodhpur
Computer Vision, Machine Learning
C. V. Jawahar
CVIT, IIIT Hyderabad, India
Computer Vision