Paper2Web: Let's Make Your Paper Alive!

📅 2025-10-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Academic project websites often disseminate research poorly: many offer little interactivity and rigid layouts, and the field has lacked a systematic evaluation framework for academic webpage generation. Paper2Web addresses this with a benchmark dataset and multi-dimensional evaluation framework that combines rule-based metrics (Connectivity, Completeness), a human-verified LLM-as-a-Judge (interactivity, aesthetics, informativeness), and PaperQuiz, which measures paper-level knowledge retention. The paper also introduces PWAgent, an autonomous pipeline that converts scientific papers into interactive, multimedia-rich academic homepages, iteratively refining content and layout through MCP tools. In the reported experiments, PWAgent outperforms template-based webpages and arXiv/alphaXiv versions by a large margin at low cost, placing it on the Pareto front for academic webpage generation.

📝 Abstract
Academic project websites can more effectively disseminate research when they clearly present core content and enable intuitive navigation and interaction. However, current approaches such as direct Large Language Model (LLM) generation, templates, or direct HTML conversion struggle to produce layout-aware, interactive sites, and a comprehensive evaluation suite for this task has been lacking. In this paper, we introduce Paper2Web, a benchmark dataset and multi-dimensional evaluation framework for assessing academic webpage generation. It incorporates rule-based metrics such as Connectivity and Completeness, a human-verified LLM-as-a-Judge (covering interactivity, aesthetics, and informativeness), and PaperQuiz, which measures paper-level knowledge retention. We further present PWAgent, an autonomous pipeline that converts scientific papers into interactive and multimedia-rich academic homepages. The agent iteratively refines both content and layout through MCP tools that enhance emphasis, balance, and presentation quality. Our experiments show that PWAgent consistently outperforms end-to-end baselines such as template-based webpages and arXiv/alphaXiv versions by a large margin while maintaining low cost, placing it on the Pareto front in academic webpage generation.
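To make the Pareto claim concrete: a system sits on the Pareto front when no other system is at least as cheap and at least as good, and strictly better on one of the two. The sketch below computes that front for a cost/quality trade-off. The `System` type, the system names, and every number are hypothetical and chosen only for illustration; none of them are results from the paper.

```python
from typing import List, NamedTuple


class System(NamedTuple):
    name: str
    cost: float     # lower is better (hypothetical units)
    quality: float  # higher is better (hypothetical score)


def pareto_front(systems: List[System]) -> List[System]:
    """Return the systems that no other system dominates."""
    front = []
    for s in systems:
        dominated = any(
            o.cost <= s.cost
            and o.quality >= s.quality
            and (o.cost < s.cost or o.quality > s.quality)
            for o in systems
        )
        if not dominated:
            front.append(s)
    return front


# Made-up candidates for illustration only; not data from the paper.
candidates = [
    System("A", cost=0.10, quality=3.0),
    System("B", cost=0.50, quality=3.5),
    System("C", cost=0.60, quality=3.2),  # dominated by B: costlier and worse
    System("D", cost=0.05, quality=2.0),
]
front_names = [s.name for s in pareto_front(candidates)]
```

Here `C` drops out because `B` is both cheaper and higher quality, while `A`, `B`, and `D` each represent a trade-off that nothing else beats on both axes.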
Problem

Research questions and friction points this paper is trying to address.

Generating interactive academic websites from scientific papers effectively
Evaluating webpage quality through multi-dimensional metrics and human verification
Automating content and layout refinement for enhanced presentation quality
Innovation

Methods, ideas, or system contributions that make the work stand out.

Autonomous pipeline converts papers to interactive homepages
Iterative refinement of content and layout via MCP tools
Multidimensional evaluation framework with rule-based and LLM-judged metrics
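As an illustration of what a rule-based metric can look like, the sketch below scores a page by the share of its in-page links (`href="#..."`) whose target `id` actually exists. This is an assumption made for illustration: the summary above does not define Connectivity or Completeness, and `internal_link_ratio` is a hypothetical helper, not the paper's metric.

```python
from html.parser import HTMLParser


class _AnchorCollector(HTMLParser):
    """Collect element ids and in-page link targets from an HTML string."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.fragment_links = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if attr.get("id"):
            self.ids.add(attr["id"])
        href = attr.get("href")
        # Only count links of the form "#target" that name a target.
        if tag == "a" and href and href.startswith("#") and len(href) > 1:
            self.fragment_links.append(href[1:])


def internal_link_ratio(html: str) -> float:
    """Fraction of in-page links whose target id exists (1.0 if there are none)."""
    collector = _AnchorCollector()
    collector.feed(html)
    if not collector.fragment_links:
        return 1.0
    resolved = sum(1 for t in collector.fragment_links if t in collector.ids)
    return resolved / len(collector.fragment_links)


# Hypothetical page: one nav link resolves, the other points nowhere.
sample_page = """
<nav><a href="#method">Method</a> <a href="#results">Results</a></nav>
<section id="method"><h2>Method</h2></section>
"""
sample_score = internal_link_ratio(sample_page)
```

A check like this is cheap and deterministic, which is the usual reason to pair rule-based metrics with a more subjective judge.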
Yuhang Chen
ONE Lab, Huazhong University of Science and Technology
Tianpeng Lv
ONE Lab, Huazhong University of Science and Technology
Siyi Zhang
ONE Lab, Huazhong University of Science and Technology
Yixiang Yin
ONE Lab, Huazhong University of Science and Technology
Yao Wan
Huazhong University of Science and Technology
NLP, Programming Languages, Software Engineering, Large Language Models
Philip S. Yu
Professor of Computer Science, University of Illinois at Chicago
Data mining, Database, Privacy
Dongping Chen
University of Maryland