🤖 AI Summary
Existing presentation-generation systems depend on fixed templates and predefined workflows, which limits how flexibly they can interpret user intent and iteratively improve their output. This work proposes DeepPresenter, an agentic framework that autonomously plans, renders, and revises slides. Its environment-grounded reflection mechanism conditions generation on the actual rendered slide states rather than on internal reasoning traces, so the agent can identify and correct presentation-specific issues during execution. On an evaluation set covering diverse presentation-generation scenarios, DeepPresenter achieves state-of-the-art performance, and a fine-tuned 9-billion-parameter model remains highly competitive at substantially lower cost.
📝 Abstract
Presentation generation requires deep content research, coherent visual design, and iterative refinement based on observation. However, existing presentation agents often rely on predefined workflows and fixed templates. To address this, we present DeepPresenter, an agentic framework that adapts to diverse user intents, enables effective feedback-driven refinement, and generalizes beyond a scripted pipeline. Specifically, DeepPresenter autonomously plans, renders, and revises intermediate slide artifacts to support long-horizon refinement with environmental observations. Furthermore, rather than relying on self-reflection over internal signals (e.g., reasoning traces), our environment-grounded reflection conditions the generation process on perceptual artifact states (e.g., rendered slides), enabling the system to identify and correct presentation-specific issues during execution. Results on the evaluation set covering diverse presentation-generation scenarios show that DeepPresenter achieves state-of-the-art performance, and the fine-tuned 9B model remains highly competitive at substantially lower cost. Our project is available at: https://github.com/icip-cas/PPTAgent
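The render, observe, revise cycle described in the abstract can be illustrated with a toy sketch. This is not DeepPresenter's implementation: every name here (`Slide`, `Rendered`, `render`, `reflect`, `revise`, `refine`) is hypothetical, and the "renderer" is a crude arithmetic estimate of text overflow rather than a real slide engine. The only point it tries to make is that the reflection step reads the rendered observation, not the draft or the agent's own reasoning.

```python
from dataclasses import dataclass


@dataclass
class Slide:
    """Hypothetical draft slide (an assumption, not the project's data model)."""
    title: str
    bullets: list[str]
    font_size: int = 28  # points


@dataclass
class Rendered:
    """Hypothetical observation produced by rendering a slide."""
    lines_used: int
    lines_available: int

    @property
    def overflows(self) -> bool:
        return self.lines_used > self.lines_available


def render(slide: Slide) -> Rendered:
    """Toy stand-in for a renderer: estimate how many text lines the body needs."""
    chars_per_line = max(1, 1400 // slide.font_size)   # smaller font fits more characters
    lines_available = max(1, 400 // slide.font_size)   # smaller font fits more lines
    # Ceiling division: each bullet wraps onto as many lines as it needs.
    lines_used = sum(-(-len(b) // chars_per_line) for b in slide.bullets)
    return Rendered(lines_used, lines_available)


def reflect(observation: Rendered) -> list[str]:
    """Derive issues from the rendered state only, never from the draft itself."""
    return ["overflow"] if observation.overflows else []


def revise(slide: Slide, issues: list[str], min_font: int = 18) -> Slide:
    """Apply one corrective step for the reported issues."""
    if "overflow" not in issues:
        return slide
    if slide.font_size > min_font:
        return Slide(slide.title, slide.bullets, slide.font_size - 2)
    # Font is already at the minimum, so drop the last bullet instead.
    return Slide(slide.title, slide.bullets[:-1], slide.font_size)


def refine(slide: Slide, max_rounds: int = 10) -> Slide:
    """Render, observe, and revise until no issues remain or the budget runs out."""
    for _ in range(max_rounds):
        issues = reflect(render(slide))
        if not issues:
            break
        slide = revise(slide, issues)
    return slide


draft = Slide("Results", ["x" * 120 for _ in range(8)])
final = refine(draft)
```

In this toy setup the draft overflows at 28 pt and the loop shrinks the font until the estimated layout fits. A real system would replace `render` with an actual rendering step and `reflect` with a model that inspects the rendered slide.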