Single Document Image Highlight Removal via A Large-Scale Real-World Dataset and A Location-Aware Network

📅 2025-04-19
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the loss of text readability caused by specular highlights on reflective documents under ambient illumination, this paper introduces DocHR14K, a large-scale, high-resolution, real-world dataset of document images with specular highlights (14,902 image pairs), and proposes L2HRNet, a network built on a Highlight Location Prior (HLP). The authors describe DocHR14K as, to the best of their knowledge, the first high-resolution dataset for document highlight removal that covers a wide range of real-world lighting conditions; it spans six document categories. The HLP estimates highlight masks without human annotations, using the residual map between highlighted and clean images. L2HRNet is a Laplacian pyramid architecture that integrates location-guided feature modulation with a diffusion module for detail restoration. L2HRNet achieves state-of-the-art performance on three benchmark datasets, including a 5.01% increase in PSNR and a 13.17% reduction in RMSE on DocHR14K.
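PSNR and RMSE, the two metrics quoted above, are conventionally computed as in the minimal sketch below. It is an illustration only: the peak value of 255, the flat pixel lists, and the `relative_change` helper are assumptions made here, and the paper's actual evaluation protocol (colour space, averaging over images) is not stated in this summary.

```python
import math


def rmse(pred, target):
    """Root-mean-square error between two equally sized flat pixel sequences."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    return math.sqrt(mse)


def psnr(pred, target, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical."""
    err = rmse(pred, target)
    if err == 0:
        return math.inf
    return 20.0 * math.log10(peak / err)


def relative_change(new, baseline):
    """Signed change of `new` relative to `baseline`, in percent (hypothetical helper)."""
    return 100.0 * (new - baseline) / baseline


# Toy example: four pixels, each off by 2 intensity levels.
restored = [10, 10, 10, 10]
reference = [12, 8, 12, 8]
toy_rmse = rmse(restored, reference)
toy_psnr = psnr(restored, reference)
```

Higher PSNR and lower RMSE both indicate a restored image closer to the clean reference, so a positive relative change in PSNR and a negative relative change in RMSE both count as improvements.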

📝 Abstract
Reflective documents often suffer from specular highlights under ambient lighting, severely hindering text readability and degrading overall visual quality. Although recent deep learning methods show promise in highlight removal, they remain suboptimal for document images, primarily due to the lack of dedicated datasets and tailored architectural designs. To tackle these challenges, we present DocHR14K, a large-scale real-world dataset comprising 14,902 high-resolution image pairs across six document categories and various lighting conditions. To the best of our knowledge, this is the first high-resolution dataset for document highlight removal that captures a wide range of real-world lighting conditions. Additionally, motivated by the observation that the residual map between highlighted and clean images naturally reveals the spatial structure of highlight regions, we propose a simple yet effective Highlight Location Prior (HLP) to estimate highlight masks without human annotations. Building on this prior, we present the Location-Aware Laplacian Pyramid Highlight Removal Network (L2HRNet), which removes highlights by leveraging the estimated priors and incorporates a diffusion module to restore details. Extensive experiments demonstrate that DocHR14K improves highlight removal under diverse lighting conditions. Our L2HRNet achieves state-of-the-art performance across three benchmark datasets, including a 5.01% increase in PSNR and a 13.17% reduction in RMSE on DocHR14K.
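The observation behind the Highlight Location Prior, that the residual between a highlighted image and its clean counterpart reveals where the highlights are, can be illustrated with a minimal sketch. This is not the authors' procedure: the function name, the grayscale 2D-list representation, and the fixed threshold of 30 are all assumptions chosen for illustration.

```python
def highlight_location_prior(highlighted, clean, threshold=30.0):
    """Estimate a binary highlight mask from a highlighted/clean image pair.

    Both inputs are 2D lists of grayscale intensities in [0, 255] with the
    same shape. The threshold is a hypothetical value, not one from the paper.
    """
    if len(highlighted) != len(clean):
        raise ValueError("images must have the same height")
    mask = []
    for row_h, row_c in zip(highlighted, clean):
        if len(row_h) != len(row_c):
            raise ValueError("images must have the same width")
        # Specular highlights add brightness, so keep only the positive residual.
        residual = [max(h - c, 0.0) for h, c in zip(row_h, row_c)]
        # Pixels whose residual exceeds the threshold are marked as highlight.
        mask.append([1 if r > threshold else 0 for r in residual])
    return mask


# Toy 3x3 pair: a bright blob covers three pixels of the highlighted image.
clean = [[50, 52, 51], [49, 50, 53], [51, 50, 52]]
highlighted = [[52, 53, 51], [50, 200, 240], [51, 180, 55]]
mask = highlight_location_prior(highlighted, clean)
```

Because the mask comes directly from the paired images, no human annotation is needed, which is the property the abstract emphasises.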
Problem

Research questions and friction points this paper is trying to address.

Removing specular highlights from reflective document images
Lack of dedicated datasets for document highlight removal
Improving text readability and visual quality under varied lighting conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale real-world dataset DocHR14K
Highlight Location Prior without annotations
Location-Aware Laplacian Pyramid Network L2HRNet
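
For readers unfamiliar with the Laplacian pyramid that L2HRNet builds on, the sketch below shows the generic decomposition: each level stores the detail lost by downsampling, and the last level stores a low-resolution residual. It is not the L2HRNet architecture. As simplifying assumptions, a 2x2 box average stands in for the usual Gaussian blur, upsampling is nearest-neighbour, and image sides must be divisible by 2 at every level.

```python
def downsample(img):
    """Halve resolution by averaging each 2x2 block (stand-in for Gaussian blur)."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


def upsample(img):
    """Double resolution by nearest-neighbour replication."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def combine(a, b, sign):
    """Element-wise a + sign * b for two equally sized 2D lists."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def build_laplacian_pyramid(img, levels):
    """Return [detail_0, ..., detail_{levels-1}, low_frequency_residual]."""
    pyramid = []
    current = [[float(v) for v in row] for row in img]
    for _ in range(levels):
        smaller = downsample(current)
        # Detail band: what is lost when going down one level and back up.
        pyramid.append(combine(current, upsample(smaller), -1.0))
        current = smaller
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Invert the decomposition by upsampling and adding back each detail band."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = combine(upsample(current), detail, 1.0)
    return current


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
pyramid = build_laplacian_pyramid(image, levels=2)
restored = reconstruct(pyramid)
```

The decomposition is invertible, so a network can process the low-resolution residual and the detail bands separately and still recombine them into a full-resolution output.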