DeepPresenter: Environment-Grounded Reflection for Agentic Presentation Generation

📅 2026-02-26
🤖 AI Summary
Existing presentation generation systems are constrained by rigid templates and predefined workflows, which limits their ability to interpret user intent flexibly and to improve outputs iteratively. This work proposes DeepPresenter, an agentic framework that autonomously plans, renders, and revises slides. Its environment-grounded reflection mechanism conditions generation on the actual rendered slide states rather than on internal reasoning traces, so the agent can identify and correct presentation-specific issues during execution. On an evaluation set covering diverse presentation-generation scenarios, DeepPresenter achieves state-of-the-art performance, and a fine-tuned 9B-parameter model remains highly competitive at substantially lower cost.

📝 Abstract
Presentation generation requires deep content research, coherent visual design, and iterative refinement based on observation. However, existing presentation agents often rely on predefined workflows and fixed templates. To address this, we present DeepPresenter, an agentic framework that adapts to diverse user intents, enables effective feedback-driven refinement, and generalizes beyond a scripted pipeline. Specifically, DeepPresenter autonomously plans, renders, and revises intermediate slide artifacts to support long-horizon refinement with environmental observations. Furthermore, rather than relying on self-reflection over internal signals (e.g., reasoning traces), our environment-grounded reflection conditions the generation process on perceptual artifact states (e.g., rendered slides), enabling the system to identify and correct presentation-specific issues during execution. Results on the evaluation set covering diverse presentation-generation scenarios show that DeepPresenter achieves state-of-the-art performance, and the fine-tuned 9B model remains highly competitive at substantially lower cost. Our project is available at: https://github.com/icip-cas/PPTAgent
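
The render, observe, and revise cycle described in the abstract can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not the authors' implementation: `Slide`, `Observation`, `render`, `revise`, `refine`, and the slide geometry constants are all hypothetical names, the "renderer" is a simple text wrapper, and the revision policy is a stand-in for what would be a language model call in a real agent. The point it shows is that the revision step is driven by what the environment reports about the rendered artifact (here, text overflow) rather than by the draft alone.

```python
from __future__ import annotations

import textwrap
from dataclasses import dataclass

# Hypothetical slide geometry: a stand-in for a real renderer's canvas.
MAX_CHARS_PER_LINE = 40
MAX_LINES = 6


@dataclass
class Slide:
    title: str
    bullets: list[str]


@dataclass
class Observation:
    """What the environment reports after rendering (not the model's own reasoning)."""
    rendered_lines: list[str]
    overflow: bool


def render(slide: Slide) -> Observation:
    """Toy renderer: wrap each bullet and report whether the text fits."""
    lines: list[str] = []
    for bullet in slide.bullets:
        lines.extend(textwrap.wrap("- " + bullet, MAX_CHARS_PER_LINE))
    return Observation(rendered_lines=lines, overflow=len(lines) > MAX_LINES)


def revise(slide: Slide, obs: Observation) -> Slide:
    """Toy revision policy (assumption): drop the last bullet on overflow.
    A real agent would ask a language model to rewrite the slide instead."""
    if obs.overflow and len(slide.bullets) > 1:
        return Slide(slide.title, slide.bullets[:-1])
    return slide


def refine(slide: Slide, max_rounds: int = 5) -> tuple[Slide, int]:
    """Render, observe, and revise until no issue is detected or rounds run out.
    Returns the final slide and the number of revision rounds used."""
    for round_idx in range(max_rounds):
        obs = render(slide)
        if not obs.overflow:
            return slide, round_idx
        slide = revise(slide, obs)
    return slide, max_rounds
```

Calling `refine` on a draft whose bullets wrap to more lines than `MAX_LINES` returns a shortened slide together with the number of rounds it took. Reflection over internal signals alone would have no access to the wrapped line count, which only exists once the slide has been rendered.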

Problem

Research questions and friction points this paper is trying to address.

presentation generation
agentic framework
environment-grounded reflection
feedback-driven refinement
visual design

Innovation

Methods, ideas, or system contributions that make the work stand out.

environment-grounded reflection
agentic presentation generation
perceptual artifact states
autonomous slide refinement
feedback-driven adaptation

Authors

Hao Zheng
Master Student at University of Chinese Academy of Sciences
Large Language Models, Natural Language Processing

Guozhao Mo
Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China

Xinru Yan
University of Chinese Academy of Sciences, Beijing, China

Qianhao Yuan
Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China

Wenkai Zhang
Henan University
Nanophotonics, Polymer

Xuanang Chen
Institute of Software, Chinese Academy of Sciences
Information Retrieval, Natural Language Processing

Yaojie Lu
Institute of Software, Chinese Academy of Sciences
Information Extraction, Large Language Models

Hongyu Lin
Institute of Software, Chinese Academy of Sciences
Natural Language Processing, Information Extraction and Machine Learning

Xianpei Han
Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing, China

Le Sun
Institute of Software, CAS
Information Retrieval, Natural Language Processing