Few-Shot Connectivity-Aware Text Line Segmentation in Historical Documents

πŸ“… 2025-08-26
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
Historical document text-line segmentation suffers from a severe scarcity of annotated training data due to high expert annotation costs and limited availability of labeled manuscripts. To address this, we propose a lightweight UNet++ architecture augmented with a neuron morphology-inspired connectivity-aware loss function, enabling precise modeling of text-line topology under an extremely low-data regimeβ€”just three annotated pages per manuscript. Our method employs patch-based training and aggressive data augmentation to enhance generalization. Evaluated on the U-DIADS-TL dataset, it achieves a 200% improvement in recognition accuracy, a 75% increase in line-level Intersection-over-Union (IoU), and an F-measure competitive with top-performing systems in the DIVA-HisDB competition. The core contribution is the first integration of connectivity-aware loss into few-shot text-line segmentation, yielding an end-to-end solution that attains high accuracy while drastically reducing annotation dependency.

πŸ“ Abstract
A foundational task for the digital analysis of documents is text line segmentation. However, automating this process with deep learning models is challenging because it requires large, annotated datasets that are often unavailable for historical documents. Additionally, the annotation process is a labor- and cost-intensive task that requires expert knowledge, which makes few-shot learning a promising direction for reducing data requirements. In this work, we demonstrate that small and simple architectures, coupled with a topology-aware loss function, are more accurate and data-efficient than more complex alternatives. We pair a lightweight UNet++ with a connectivity-aware loss, initially developed for neuron morphology, which explicitly penalizes structural errors like line fragmentation and unintended line merges. To make the most of our limited data, we train on small patches extracted from a mere three annotated pages per manuscript. Our methodology significantly improves upon the current state-of-the-art on the U-DIADS-TL dataset, with a 200% increase in Recognition Accuracy and a 75% increase in Line Intersection over Union. Our method also achieves an F-Measure score on par with, or even exceeding, that of the winner of the DIVA-HisDB baseline detection competition, all while requiring only three annotated pages, exemplifying the efficacy of our approach. Our implementation is publicly available at: https://github.com/RafaelSterzinger/acpr_few_shot_hist.
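The patch-based training strategy described above can be sketched as follows. Note that the patch size, stride, and page dimensions here are illustrative assumptions, not the paper's actual hyperparameters: the point is simply that overlapping windows turn three annotated pages into a much larger pool of training samples.

```python
import numpy as np

def extract_patches(page: np.ndarray, patch: int = 64, stride: int = 32) -> np.ndarray:
    """Slide a window over a page image and collect overlapping patches.

    Overlapping patches multiply the number of training samples that a
    handful of annotated pages can provide.
    """
    h, w = page.shape[:2]
    patches = []
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            patches.append(page[y:y + patch, x:x + patch])
    return np.stack(patches)

# Three annotated 256x256 "pages" (dummy arrays) yield 147 64x64 patches.
pages = [np.zeros((256, 256), dtype=np.float32) for _ in range(3)]
all_patches = np.concatenate([extract_patches(p) for p in pages])
print(all_patches.shape)  # (147, 64, 64)
```

In practice each image patch would be paired with the matching crop of the text-line mask, and augmentation (rotation, scaling, photometric jitter) would be applied on top.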
Problem

Research questions and friction points this paper is trying to address.

Automating text line segmentation in historical documents
Reducing annotation requirements through few-shot learning
Preventing structural errors like line fragmentation and merges
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight UNet++ architecture
Connectivity-aware loss function
Training on small patches
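The structural errors the connectivity-aware loss targets are fragmentation (one text line split into several pieces) and merges (two lines fused into one). A toy, non-differentiable illustration of that idea is to compare connected-component counts between prediction and ground truth; the paper's actual loss is a differentiable, neuron-morphology-inspired formulation, which this sketch does not reproduce.

```python
from collections import deque

def count_components(grid: list[list[int]]) -> int:
    """Count 4-connected foreground components in a binary grid (BFS flood fill)."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if grid[sy][sx] and not seen[sy][sx]:
                count += 1
                seen[sy][sx] = True
                q = deque([(sy, sx)])
                while q:
                    y, x = q.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and grid[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
    return count

def connectivity_penalty(pred: list[list[int]], target: list[list[int]]) -> int:
    """Grows when the prediction fragments or merges text lines."""
    return abs(count_components(pred) - count_components(target))

# Ground truth: two separate text lines -> 2 components.
gt = [[1, 1, 1, 1],
      [0, 0, 0, 0],
      [1, 1, 1, 1]]
# Fragmented prediction: top line broken in two -> 3 components.
frag = [[1, 1, 0, 1],
        [0, 0, 0, 0],
        [1, 1, 1, 1]]
print(connectivity_penalty(frag, gt))  # prints 1
```

A pixel-wise loss would barely notice the single missing pixel in `frag`, while a topology-sensitive term flags the broken line, which is exactly the failure mode that matters for line-level IoU and recognition accuracy.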
Rafael Sterzinger
Computer Vision Lab, TU Wien, Vienna, AUT
Tingyu Lin
Computer Vision Lab, TU Wien, Vienna, AUT
Robert Sablatnig
Prof. for Computer Vision, TU Wien
Document Analysis · Deep Learning · Multispectral Image Analysis · 3D Vision · Computer Vision for Cultural Heritage