🤖 AI Summary
This work addresses the non-stationary learning challenges in ancient Chinese character recognition: a continuously expanding class space, scarce incremental data, and significant intra-class stylistic variation. To tackle these issues, the authors propose a continual learning framework based on anchored modular retrieval. The method constructs a shared multimodal embedding space and supports dynamic addition of new classes through an extensible multi-prototype dictionary. A script-conditioned injection module (SIA+SAR) is introduced to calibrate embedding consistency across learning stages. This study pioneers a modular retrieval paradigm tailored for continual Chinese character recognition and establishes EvoCON, a benchmark comprising six ancient scripts. The proposed approach substantially outperforms existing methods in both incremental recognition and zero-shot recognition of unseen characters.
📝 Abstract
Ancient Chinese character recognition is a core capability for cultural heritage digitization, yet real-world workflows are inherently non-stationary: newly excavated materials are continuously onboarded, bringing new classes in different scripts and expanding the class space over time. We formalize this process as Continual Chinese Character Recognition (Continual CCR), a script-staged, class-incremental setting that couples two challenges: (i) scalable learning under continual class growth with subtle inter-class differences and scarce incremental data, and (ii) pronounced intra-class diversity caused by writing-style variation across writers and carrier conditions. To overcome the limitations of conventional closed-set classification, we propose AMR-CCR, an anchored modular retrieval framework that performs recognition via embedding-based dictionary matching in a shared multimodal space, allowing new classes to be added by simply extending the dictionary. AMR-CCR further introduces a lightweight script-conditioned injection module (SIA+SAR) to calibrate newly onboarded scripts while preserving cross-stage embedding compatibility, and an image-derived multi-prototype dictionary that clusters within-class embeddings to better cover diverse style modes. To support systematic evaluation, we build EvoCON, a six-stage benchmark for continual script onboarding, covering six scripts (OBC, BI, SS, SAC, WSC, CS), augmented with meaning/shape descriptions and an explicit zero-shot split for unseen characters without image exemplars.
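To make the retrieval-based recognition idea concrete, the following is a minimal sketch (not the authors' implementation) of a multi-prototype dictionary: each class's embeddings are clustered into a few normalized prototypes via a simple k-means loop, and a query is classified by the cosine-nearest prototype. The embedding dimension, clustering method, and number of prototypes per class are illustrative assumptions; the paper's actual choices may differ.

```python
import numpy as np

def build_prototypes(embeddings_by_class, k=3, seed=0):
    """Cluster each class's L2-normalized embeddings into up to k prototypes.
    A basic spherical k-means sketch; the paper's clustering may differ."""
    rng = np.random.default_rng(seed)
    dictionary = {}
    for label, embs in embeddings_by_class.items():
        X = embs / np.linalg.norm(embs, axis=1, keepdims=True)
        k_eff = min(k, len(X))
        centers = X[rng.choice(len(X), size=k_eff, replace=False)].copy()
        for _ in range(10):  # a few Lloyd-style refinement iterations
            assign = np.argmax(X @ centers.T, axis=1)  # nearest by cosine sim
            for j in range(k_eff):
                members = X[assign == j]
                if len(members):
                    c = members.mean(axis=0)
                    centers[j] = c / np.linalg.norm(c)
        dictionary[label] = centers
    return dictionary

def classify(query_embedding, dictionary):
    """Return the class whose nearest prototype is most cosine-similar
    to the query embedding."""
    q = query_embedding / np.linalg.norm(query_embedding)
    best_label, best_sim = None, -np.inf
    for label, protos in dictionary.items():
        sim = float(np.max(protos @ q))
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label
```

A new class can then be onboarded without retraining the classifier, by assigning its prototype matrix to a fresh dictionary key, which mirrors the abstract's claim that classes are added "by simply extending the dictionary".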