🤖 AI Summary
Existing approaches to scientific poster generation largely neglect the hierarchical structure of documents and the semantic alignment between text and figures. This paper proposes PosterForest, a training-free framework that addresses both. First, it constructs a hierarchical “Poster Tree” as an intermediate representation that explicitly models document structure and cross-modal semantic relationships. Second, it introduces a multi-agent collaboration mechanism in which content summarization and layout planning are optimized jointly. The key contribution lies in being the first to unify content selection, organization, and visual layout under zero-shot conditions, via rule-driven graph-structured modeling and a multi-agent feedback loop. Experiments on a multi-disciplinary dataset show that the method outperforms existing baselines and comes closest to expert-designed posters on three metrics: information completeness, structural clarity, and user preference.
📝 Abstract
We present a novel training-free framework, PosterForest, for automated scientific poster generation. Unlike prior approaches, which largely neglect the hierarchical structure of scientific documents and the semantic integration of textual and visual elements, our method addresses both challenges directly. We introduce the Poster Tree, a hierarchical intermediate representation that jointly encodes document structure and visual-textual relationships at multiple levels. Our framework employs a multi-agent collaboration strategy, where agents specializing in content summarization and layout planning iteratively coordinate and provide mutual feedback. This approach enables the joint optimization of logical consistency, content fidelity, and visual coherence. Extensive experiments on multiple academic domains show that our method outperforms existing baselines in both qualitative and quantitative evaluations. The resulting posters achieve quality closest to expert-designed ground truth and deliver superior information preservation, structural clarity, and user preference.
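The abstract does not give implementation details, so the following is only a minimal sketch of how a hierarchical poster tree and a content/layout feedback loop might fit together. Every name here (`PosterNode`, `content_agent`, `layout_agent`, `refine`, `FIGURE_COST`) is hypothetical, and the two "agents" are simple rule-based stand-ins rather than the agents the paper actually uses.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

# Assumption: a figure occupies as much space as this many words.
FIGURE_COST = 20


@dataclass
class PosterNode:
    """One section of the poster: its text, its figures, and its subsections."""
    title: str
    text: str = ""
    figures: List[str] = field(default_factory=list)
    children: List["PosterNode"] = field(default_factory=list)

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "PosterNode"]]:
        """Yield (depth, node) pairs in pre-order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


def content_agent(node: PosterNode, max_words: int) -> None:
    """Stand-in summarizer: keep only the first max_words words."""
    words = node.text.split()
    node.text = " ".join(words[:max(0, max_words)])


def layout_agent(node: PosterNode, budget: int) -> int:
    """Stand-in layout planner: report how far the node exceeds its space budget."""
    used = len(node.text.split()) + FIGURE_COST * len(node.figures)
    return max(0, used - budget)


def refine(root: PosterNode, budget: int, max_rounds: int = 5) -> PosterNode:
    """Alternate layout feedback and content trimming until each node fits."""
    for _, node in root.walk():
        limit = len(node.text.split())
        for _ in range(max_rounds):
            overflow = layout_agent(node, budget)
            if overflow == 0:
                break
            # Feedback from layout shrinks the word limit for the next summary.
            limit = max(0, limit - overflow)
            content_agent(node, limit)
    return root


if __name__ == "__main__":
    poster = PosterNode("Paper", children=[
        PosterNode("Method", text="word " * 50, figures=["fig1.png"]),
        PosterNode("Results", text="short text"),
    ])
    refine(poster, budget=40)
    for depth, node in poster.walk():
        print("  " * depth + f"{node.title}: {len(node.text.split())} words")
```

Keeping text and figures on the same node is what lets the layout step reason about both at once; a real system would replace the truncation and word-count rules with model-driven summarization and layout planning.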