S2Doc - Spatial-Semantic Document Format

📅 2025-11-02

🤖 AI Summary

Document and table modeling currently lacks a unified standard: existing approaches typically model spatial or semantic structure in isolation, which leads to incompatible formats and poor cross-task generalization. To address this, we propose S2Doc, a standardized document data model that jointly encodes spatial layout and semantic hierarchy. S2Doc uses a hierarchical structure that combines coordinate information with semantic labels, giving a unified representation of multi-page documents while natively supporting interoperable formats such as JSON. To the best of our knowledge, it is the first model to represent both the spatial and the semantic dimension within a single, standardized structure, filling a gap in document modeling standards. Empirical evaluation demonstrates that S2Doc significantly improves cross-task compatibility, model interoperability, and data exchange efficiency. It has been validated across core document understanding tasks, including OCR, information extraction, and table recognition.

📝 Abstract

Documents are a common way to store and share information, and tables are an important part of many documents. However, there is no common understanding of how to model documents, and tables in particular. Because of this lack of standardization, most scientific approaches model documents and tables in their own way, leading to a variety of data structures and formats that are not directly compatible. Furthermore, most data models focus on either the spatial or the semantic structure of a document and neglect the other. To address this, we developed S2Doc, a flexible data structure for modeling documents and tables that combines spatial and semantic information in a single format. It is designed to be easily extensible to new tasks and supports most modeling approaches for documents and tables, including multi-page documents. To the best of our knowledge, it is the first approach of its kind to combine all these aspects in a single format.

Problem

Research questions and friction points this paper is trying to address.

Lack of standardized document and table modeling approaches

Incompatibility between spatial and semantic document structures

Need for unified format combining spatial-semantic information

Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines spatial and semantic document information

Provides flexible data structure for tables

Supports multi-page documents and extensibility
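
To make the three points above concrete, here is a minimal Python sketch of what a combined spatial-semantic element tree could look like. It is not the S2Doc schema: the class names (BoundingBox, Element), every field name, and the to_json helper are hypothetical and chosen only for illustration. The sketch assumes each element carries a semantic label, zero or more page-anchored boxes (so one element can span pages), child elements for the hierarchy, and a free-form attribute dictionary as an extension point.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List, Set


@dataclass
class BoundingBox:
    """Axis-aligned box on a single page (hypothetical field names)."""
    page: int
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass
class Element:
    """One node that holds both spatial and semantic information."""
    id: str
    category: str  # semantic label, e.g. "document", "table", "cell"
    # Spatial part: one box per page region, so an element may span pages.
    boxes: List[BoundingBox] = field(default_factory=list)
    # Semantic hierarchy: nested child elements.
    children: List["Element"] = field(default_factory=list)
    # Extension point for task-specific data (assumed, not from the paper).
    attributes: Dict[str, Any] = field(default_factory=dict)

    def pages(self) -> Set[int]:
        """Collect every page index touched by this element or its children."""
        found = {box.page for box in self.boxes}
        for child in self.children:
            found |= child.pages()
        return found


def to_json(root: Element) -> str:
    """Serialize the whole tree to JSON via dataclasses.asdict."""
    return json.dumps(asdict(root), indent=2)


# A table that starts on page 0 and continues on page 1.
table = Element(
    id="table-1",
    category="table",
    boxes=[
        BoundingBox(page=0, x0=50, y0=600, x1=550, y1=780),
        BoundingBox(page=1, x0=50, y0=40, x1=550, y1=200),
    ],
    children=[
        Element(
            id="cell-0-0",
            category="cell",
            boxes=[BoundingBox(page=0, x0=50, y0=600, x1=300, y1=640)],
            attributes={"row": 0, "col": 0, "text": "Year"},
        ),
        Element(
            id="cell-5-0",
            category="cell",
            boxes=[BoundingBox(page=1, x0=50, y0=40, x1=300, y1=80)],
            attributes={"row": 5, "col": 0, "text": "2024"},
        ),
    ],
)
document = Element(id="doc", category="document", children=[table])
```

Calling to_json(document) yields a nested JSON object in which every node keeps its label and its coordinates side by side, and document.pages() reports which pages the tree touches. Keeping boxes as a list rather than a single box is one simple way to let a table or paragraph continue across a page break without splitting it into separate elements.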
