🤖 AI Summary
This work addresses the underexplored challenge of generating long-form, technically rigorous patent descriptions with large language models (LLMs). We introduce PAP2PAT, the first open, real-world benchmark for patent generation, comprising 1.8k patent-paper pairs that describe the same inventions, with research papers serving as invention disclosures. To handle this long-document task, we propose outline-guided, chunk-based generation and evaluate both zero-shot and fine-tuned models with multi-dimensional automatic metrics complemented by human analysis. Experiments reveal a trade-off: fine-tuning yields more patent-style language but increases hallucination, and models still struggle to provide the necessary level of detail. We open-source the benchmark data, code, and evaluation framework, establishing a reproducible foundation for research on LLM-supported patent generation.
📝 Abstract
Dealing with long and highly complex technical text is a challenge for Large Language Models (LLMs), which still have to unfold their potential in supporting expensive and time-intensive processes like patent drafting. Within patents, the description constitutes more than 90% of the document on average. Yet, its automatic generation remains understudied. When drafting patent applications, patent attorneys typically receive invention reports (IRs), which are usually confidential, hindering research on LLM-supported patent drafting. Often, pre-publication research papers serve as IRs. We leverage this duality to build PAP2PAT, an open and realistic benchmark for patent drafting consisting of 1.8k patent-paper pairs describing the same inventions. To address the complex long-document patent generation task, we propose chunk-based outline-guided generation using the research paper as invention specification. Our extensive evaluation using PAP2PAT and a human case study show that LLMs can effectively leverage information from the paper, but still struggle to provide the necessary level of detail. Fine-tuning leads to more patent-style language, but also to more hallucination. We release our data and code at https://github.com/boschresearch/Pap2Pat.
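The chunk-based outline-guided generation described above can be sketched roughly as follows. This is an illustrative assumption of the overall control flow, not the authors' implementation: the outline-splitting heuristic, the prompt template, and the `llm` callable are all hypothetical stand-ins.

```python
# Illustrative sketch (NOT the authors' code) of chunk-based
# outline-guided generation: the patent description is produced
# one outline section at a time, each prompt pairing the paper
# text (as invention specification) with a single outline heading.

def chunk_outline(outline: str) -> list[str]:
    """Split a simple one-heading-per-line outline into chunks (assumed format)."""
    return [line.strip() for line in outline.splitlines() if line.strip()]

def generate_description(paper_text: str, outline: str, llm) -> str:
    """Generate the description chunk by chunk, guided by the outline.

    `llm` is any callable prompt -> text; a real model call would
    go here, with the chunking keeping each prompt within context limits.
    """
    chunks = []
    for heading in chunk_outline(outline):
        prompt = (
            f"Invention disclosure (paper):\n{paper_text}\n\n"
            f"Write the patent description section for:\n{heading}\n"
        )
        chunks.append(llm(prompt))
    return "\n\n".join(chunks)

# Stub model for demonstration: echoes the requested heading.
def stub_llm(prompt: str) -> str:
    heading = prompt.rsplit("Write the patent description section for:\n", 1)[1]
    return f"[Section: {heading.strip()}]"

draft = generate_description(
    "We propose a widget ...",
    "Background\nSummary\nDetailed Description",
    stub_llm,
)
print(draft)
```

In this sketch, chunking serves two purposes suggested by the abstract: each generation call stays within the model's context budget, and the outline heading steers every chunk toward the expected patent structure.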