PaperVoyager: Building Interactive Web with Visual Language Models

📅 2026-03-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing document agents convert papers into static artifacts and struggle to capture the dynamic mechanisms and state transitions found in technical papers. This work proposes a Paper-to-Interactive-System agent that converts a PDF research paper into an executable interactive web system end to end, without human intervention, covering paper understanding, system modeling, and interactive webpage synthesis. The resulting systems let users manipulate inputs and observe dynamic behaviors. For evaluation, the authors introduce a benchmark of 19 research papers paired with expert-built interactive systems as ground truth. They also propose PaperVoyager, a structured generation framework that explicitly models mechanisms and interaction logic during synthesis. Experiments show that PaperVoyager significantly improves the quality of the generated interactive systems.

📝 Abstract
Recent advances in visual language models have enabled autonomous agents for complex reasoning, tool use, and document understanding. However, existing document agents mainly transform papers into static artifacts such as summaries, webpages, or slides, which are insufficient for technical papers involving dynamic mechanisms and state transitions. In this work, we propose a Paper-to-Interactive-System Agent that converts research papers into executable interactive web systems. Given a PDF paper, the agent performs end-to-end processing without human intervention, including paper understanding, system modeling, and interactive webpage synthesis, enabling users to manipulate inputs and observe dynamic behaviors. To evaluate this task, we introduce a benchmark of 19 research papers paired with expert-built interactive systems as ground truth. We further propose PaperVoyager, a structured generation framework that explicitly models mechanisms and interaction logic during synthesis. Experiments show that PaperVoyager significantly improves the quality of generated interactive systems, offering a new paradigm for interactive scientific paper understanding.
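
The three stages named in the abstract (paper understanding, system modeling, interactive webpage synthesis) can be pictured with a minimal sketch. This is not the authors' implementation: every name below (`Control`, `SystemModel`, `understand_paper`, `model_system`, `synthesize_page`, `paper_to_page`) is hypothetical, the first two stages are hard-coded placeholders where a real agent would presumably call a visual language model, and the toy decay formula is invented purely for illustration.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Control:
    """One user-adjustable input of the interactive system (hypothetical)."""
    name: str
    minimum: float
    maximum: float
    default: float


@dataclass
class SystemModel:
    """Hypothetical intermediate representation: mechanism plus interaction logic."""
    title: str
    controls: list[Control] = field(default_factory=list)
    # JavaScript expression over the control names; stands in for the mechanism.
    update_expr: str = "0"


def understand_paper(pdf_path: str) -> dict:
    """Stage 1 placeholder: a real agent would read the PDF with a model here."""
    return {"title": "Toy learning-rate decay", "source": pdf_path}


def model_system(facts: dict) -> SystemModel:
    """Stage 2 placeholder: turn extracted facts into an explicit system model."""
    return SystemModel(
        title=facts["title"],
        controls=[Control("lr", 0.0, 1.0, 0.1), Control("step", 0, 100, 10)],
        update_expr="lr * Math.pow(0.9, step)",
    )


def synthesize_page(model: SystemModel) -> str:
    """Stage 3: render the model as a self-contained interactive HTML page."""
    sliders = "\n".join(
        f'<label>{escape(c.name)} <input type="range" id="{escape(c.name)}" '
        f'min="{c.minimum}" max="{c.maximum}" value="{c.default}" step="any"></label>'
        for c in model.controls
    )
    reads = "\n".join(
        f'  const {c.name} = parseFloat(document.getElementById("{c.name}").value);'
        for c in model.controls
    )
    return f"""<!DOCTYPE html>
<html><head><meta charset="utf-8"><title>{escape(model.title)}</title></head>
<body>
<h1>{escape(model.title)}</h1>
{sliders}
<p>Output: <span id="out"></span></p>
<script>
function update() {{
{reads}
  document.getElementById("out").textContent = {model.update_expr};
}}
document.querySelectorAll("input").forEach(el => el.addEventListener("input", update));
update();
</script>
</body></html>"""


def paper_to_page(pdf_path: str) -> str:
    """Chain the three stages end to end, with no human step in between."""
    return synthesize_page(model_system(understand_paper(pdf_path)))
```

Calling `paper_to_page("paper.pdf")` returns an HTML string that can be written to a file and opened in a browser; moving either slider recomputes the output. Keeping an explicit `SystemModel` between understanding and synthesis is one way to read the abstract's "explicitly models mechanisms and interaction logic", though the paper's actual representation may differ.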

Problem

Research questions and friction points this paper is trying to address.

interactive systems
scientific papers
dynamic mechanisms
visual language models
document understanding

Innovation

Methods, ideas, or system contributions that make the work stand out.

Visual Language Models
Interactive Web Systems
Paper-to-Code
Autonomous Agents
Structured Generation