🤖 AI Summary
To address the challenge of multi-criteria automated invoice verification in large enterprises, particularly under non-standard conditions such as mixed handwritten and printed text or mobile-captured images, this paper proposes an end-to-end invoice validation method. First, we construct the first real-world, manually annotated invoice dataset. Second, we integrate document layout analysis (via LayoutParser) with object detection (YOLOv8) to precisely localize key fields and enhance OCR post-processing. Third, a rule-based engine performs multi-dimensional structural comparison of the extracted fields. Evaluated on real-world invoices, our method achieves 98.2% accuracy with an average processing time of 0.8 seconds per invoice, significantly outperforming conventional RPA and generic OCR solutions. Key contributions include: (1) the first high-quality, manually labeled invoice dataset; (2) a layout-aware, end-to-end verification framework; and (3) a robust, multi-criteria validation mechanism specifically designed for non-standard invoices.
📝 Abstract
In large organizations, the number of financial transactions can grow rapidly, driving the need for fast and accurate multi-criteria invoice validation. Manual processing remains error-prone and time-consuming, while current automated solutions are limited by their inability to handle varied input conditions, such as documents that are partially handwritten or photographed with a mobile phone. In this paper, we propose to automate the validation of machine-written invoices using document layout analysis and object detection techniques based on recent deep learning (DL) models. We introduce a novel dataset consisting of manually annotated real-world invoices and a multi-criteria validation process. We fine-tune and benchmark the most relevant DL models on our dataset. Experimental results show that the proposed pipeline and the selected DL models validate invoices quickly and accurately.
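As a rough illustration of what a multi-criteria validation step could look like once the key fields have been localized and read by OCR, the sketch below checks extracted fields against a reference record. It is not the paper's implementation: the field names, the `InvoiceFields` class, the three criteria, and the `TOLERANCE` value are all hypothetical, chosen only to show how independent rules can be combined into one validation result.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable, List, Optional


@dataclass
class InvoiceFields:
    """Hypothetical set of key fields; None means the field was not extracted."""
    invoice_number: Optional[str]
    net_amount: Optional[Decimal]
    tax_amount: Optional[Decimal]
    total_amount: Optional[Decimal]


# Assumed rounding tolerance for amount comparisons.
TOLERANCE = Decimal("0.01")

# A criterion returns an error message, or None when the check passes.
Criterion = Callable[[InvoiceFields, InvoiceFields], Optional[str]]


def number_matches(extracted: InvoiceFields, expected: InvoiceFields) -> Optional[str]:
    if extracted.invoice_number is None or expected.invoice_number is None:
        return "invoice number missing"
    # Compare case-insensitively and ignore surrounding whitespace from OCR.
    if extracted.invoice_number.strip().upper() != expected.invoice_number.strip().upper():
        return "invoice number mismatch"
    return None


def totals_consistent(extracted: InvoiceFields, expected: InvoiceFields) -> Optional[str]:
    # Internal consistency check; the reference record is not used here.
    if None in (extracted.net_amount, extracted.tax_amount, extracted.total_amount):
        return "amount field missing"
    difference = extracted.net_amount + extracted.tax_amount - extracted.total_amount
    if abs(difference) > TOLERANCE:
        return "net + tax does not equal total"
    return None


def total_matches(extracted: InvoiceFields, expected: InvoiceFields) -> Optional[str]:
    if extracted.total_amount is None or expected.total_amount is None:
        return "total amount missing"
    if abs(extracted.total_amount - expected.total_amount) > TOLERANCE:
        return "total differs from reference record"
    return None


CRITERIA: List[Criterion] = [number_matches, totals_consistent, total_matches]


def validate(extracted: InvoiceFields, expected: InvoiceFields) -> List[str]:
    """Run every criterion and collect the failures; an empty list means valid."""
    errors = []
    for criterion in CRITERIA:
        message = criterion(extracted, expected)
        if message is not None:
            errors.append(message)
    return errors


reference = InvoiceFields("INV-001", Decimal("100.00"), Decimal("20.00"), Decimal("120.00"))
good = InvoiceFields("inv-001 ", Decimal("100.00"), Decimal("20.00"), Decimal("120.00"))
bad = InvoiceFields("INV-001", Decimal("100.00"), Decimal("20.00"), Decimal("125.00"))

print(validate(good, reference))
print(validate(bad, reference))
```

Keeping each criterion as a separate function means a failed invoice reports every rule it violates rather than only the first, and new criteria can be added to the list without changing `validate`.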