🤖 AI Summary
Spreadsheets are widely used in business, accounting, and finance, yet their operations are rarely documented systematically, which hinders automation, collaboration, and knowledge transfer. To address this, the paper introduces Spreadsheet Operations Documentation (SOD), a task in which large language models (LLMs) generate natural-language descriptions of spreadsheet operations. The authors construct a benchmark, SOD-Bench, of 111 spreadsheet manipulation code snippets, each paired with a natural-language summary. They evaluate five LLMs (GPT-4o, GPT-4o-mini, LLaMA-3.3-70B, Mixtral-8x7B, and Gemma2-9B) using BLEU, GLEU, ROUGE-L, and METEOR. The results suggest that LLMs can generate accurate descriptions of spreadsheet operations, supporting the feasibility of SOD, although open challenges remain. While most prior work in this area uses LLMs to generate spreadsheet manipulation code, this work addresses the less-explored reverse direction of documenting that code, and provides a benchmark and evaluation setup for future studies.
📝 Abstract
Many knowledge workers use spreadsheets in business, accounting, and finance. However, the lack of systematic documentation methods for spreadsheets hinders automation, collaboration, and knowledge transfer, and risks the loss of crucial institutional knowledge. This paper introduces Spreadsheet Operations Documentation (SOD), an AI task that involves generating human-readable explanations from spreadsheet operations. Many previous studies have used Large Language Models (LLMs) to generate spreadsheet manipulation code; however, translating that code into natural language for SOD is a less-explored area. To address this, we present a benchmark of 111 spreadsheet manipulation code snippets, each paired with a corresponding natural language summary. We evaluate five LLMs (GPT-4o, GPT-4o-mini, LLaMA-3.3-70B, Mixtral-8x7B, and Gemma2-9B) using the BLEU, GLEU, ROUGE-L, and METEOR metrics. Our findings suggest that LLMs can generate accurate spreadsheet documentation, making SOD a feasible prerequisite step toward enhancing reproducibility, maintainability, and collaborative workflows in spreadsheets, although challenges remain to be addressed.
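To make the evaluation setup concrete, the following is a minimal sketch of how a generated description might be scored against a reference summary with ROUGE-L. It is not the authors' evaluation code: it assumes whitespace tokenization, lowercasing, and a plain F1 combination of LCS-based precision and recall, and the snippet and both summaries are hypothetical examples rather than items from the benchmark.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Simplified ROUGE-L: F1 of LCS-based precision and recall.

    Assumption: whitespace tokenization and lowercasing; published
    implementations may tokenize, stem, or weight recall differently.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical benchmark-style pair (not taken from the paper's dataset).
snippet = 'ws["C2"] = "=A2*B2"'
reference_summary = "Set cell C2 to the product of cells A2 and B2."
generated_summary = "Write a formula in cell C2 that multiplies A2 by B2."

print(f"ROUGE-L F1: {rouge_l_f1(reference_summary, generated_summary):.3f}")
```

Because ROUGE-L rewards only in-order token overlap, a faithful paraphrase can still score low, which is one reason to report several metrics side by side as the abstract describes.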