🤖 AI Summary
Addressing core challenges in complex mathematical formula recognition—namely, difficulty in modeling semantic ambiguity and high error rates in fine-grained structural parsing—this paper proposes HDNet, a novel hierarchical deep network, alongside HDR, the first hierarchical fine-grained benchmark dataset. HDR comprises 100 million high-quality training samples and multiple interpretability-oriented test subsets, supporting multi-level semantic and structural annotations. HDNet introduces four key innovations: hierarchical sub-formula modeling, detail-focused attention, structure-aware sequential decoding, and multi-interpretability joint learning. Evaluated on HDR-Test as well as established benchmarks (CROHME, IM2LaTeX), HDNet achieves significant improvements in structural parsing accuracy and cross-domain robustness. Notably, it is the first method to enable interpretable, fine-grained recognition of ambiguous mathematical formulas, thereby advancing both recognition fidelity and explainability in mathematical OCR.
📝 Abstract
Hierarchical and complex Mathematical Expression Recognition (MER) is challenging due to multiple possible interpretations of a formula, complicating both parsing and evaluation. In this paper, we introduce the Hierarchical Detail-Focused Recognition dataset (HDR), the first dataset specifically designed to address these issues. It consists of a large-scale training set, HDR-100M, offering an unprecedented scale and diversity with one hundred million training instances. And the test set, HDR-Test, includes multiple interpretations of complex hierarchical formulas for comprehensive model performance evaluation. Additionally, the parsing of complex formulas often suffers from errors in fine-grained details. To address this, we propose the Hierarchical Detail-Focused Recognition Network (HDNet), an innovative framework that incorporates a hierarchical sub-formula module, focusing on the precise handling of formula details, thereby significantly enhancing MER performance. Experimental results demonstrate that HDNet outperforms existing MER models across various datasets.