Task-driven single-image super-resolution reconstruction of document scans

📅 2024-07-12
🏛️ Conference on Computer Science and Information Systems
🤖 AI Summary
To address low OCR accuracy on low-resolution document scans, this paper proposes a task-driven single-image super-resolution (SR) method that incorporates text detection objectives into SR model training. The core contribution is a multi-task loss function that combines components related to text detection with components guided by image similarity. Because training for a single task alone is heavily ill-posed, the image-similarity components constrain the reconstruction while the text-detection components adapt it to the downstream OCR step. The authors report encouraging results and regard them as a step towards real-world super-resolution of document images.

📝 Abstract
Super-resolution reconstruction aims to generate images of high spatial resolution from low-resolution observations. State-of-the-art super-resolution techniques underpinned by deep learning produce results of outstanding visual quality, but it is seldom verified whether those results are a valuable input for specific computer vision applications. In this paper, we investigate the possibility of employing super-resolution as a preprocessing step to improve optical character recognition from document scans. To achieve that, we propose to train deep networks for single-image super-resolution in a task-driven way so that they are better adapted to text detection. As problems limited to a specific task are heavily ill-posed, we introduce a multi-task loss function that combines components related to text detection with components guided by image similarity. The results reported in this paper are encouraging and constitute an important step towards real-world super-resolution of document images.
Problem

Research questions and friction points this paper is trying to address.

Improve optical character recognition from document scans
Train deep networks for task-driven super-resolution
Introduce multi-task loss for text detection and image similarity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Task-driven deep learning for super-resolution
Multi-task loss function for text detection
Super-resolution preprocessing for OCR improvement
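
The multi-task loss idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of mean absolute error for both terms, the weights, and the notion of a per-pixel "text map" produced by some text detector are all assumptions made for illustration. The sketch only shows how an image-similarity term and a text-detection term might be combined into a single training objective.

```python
def mean_abs_diff(a, b):
    """Mean absolute difference between two equally shaped 2D grids (lists of rows)."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("inputs must have the same shape")
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    if not diffs:
        raise ValueError("inputs must not be empty")
    return sum(diffs) / len(diffs)


def multi_task_loss(sr_image, hr_image, sr_text_map, hr_text_map,
                    w_image=1.0, w_text=0.5):
    """Weighted sum of an image-similarity term and a text-detection term.

    sr_image / hr_image: super-resolved and reference high-resolution pixels.
    sr_text_map / hr_text_map: hypothetical per-pixel text scores that a text
    detector would output for each image (an assumption of this sketch).
    w_image / w_text: illustrative weights, not values from the paper.
    """
    image_term = mean_abs_diff(sr_image, hr_image)       # image similarity
    text_term = mean_abs_diff(sr_text_map, hr_text_map)  # text detection agreement
    return w_image * image_term + w_text * text_term


# Toy example with 2x2 "images" and 1x2 "text maps".
sr = [[0.0, 1.0], [1.0, 0.0]]
hr = [[0.0, 1.0], [1.0, 1.0]]
sr_map = [[0.5, 0.5]]
hr_map = [[1.0, 0.0]]
loss = multi_task_loss(sr, hr, sr_map, hr_map)
```

In this toy setup the image term penalises pixel differences, while the text term penalises disagreement between detector outputs on the reconstructed and reference images; the weights control the balance between the two objectives.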
Maciej Zyrek
Department of Algorithmics and Software, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
M. Kawulok
Department of Algorithmics and Software, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland