DeepScribe: Localization and Classification of Elamite Cuneiform Signs Via Deep Learning

📅 2023-06-02
🏛️ arXiv.org
📈 Citations: 2
Influential: 1
🤖 AI Summary
This study addresses the automatic transcription of Achaemenid-era Elamite cuneiform tablets and proposes the first end-to-end visual analysis framework designed specifically for this script. The method is a modular computer vision pipeline: a RetinaNet detector localizes individual signs, a ResNet classifier performs fine-grained multi-class sign identification, and unsupervised clustering based on stroke morphology uncovers structural regularities beyond those captured in conventional printed sign lists. Experiments show a mean Average Precision (mAP) of 0.78 for sign localization, a per-sign top-5 classification accuracy of 0.89, and an end-to-end top-5 transcription accuracy of 0.80. Evaluated on photographs of authentic clay tablets, the system delivers high-confidence transliteration suggestions, improving both the efficiency and the reliability of archaeological text processing.
📝 Abstract
Twenty-five hundred years ago, the paperwork of the Achaemenid Empire was recorded on clay tablets. In 1933, archaeologists from the University of Chicago's Oriental Institute (OI) found tens of thousands of these tablets and fragments during the excavation of Persepolis. Many of these tablets have been painstakingly photographed and annotated by expert cuneiformists, and now provide a rich dataset consisting of over 5,000 annotated tablet images and 100,000 cuneiform sign bounding boxes. We leverage this dataset to develop DeepScribe, a modular computer vision pipeline capable of localizing cuneiform signs and providing suggestions for the identity of each sign. We investigate the difficulty of learning subtasks relevant to cuneiform tablet transcription on ground-truth data, finding that a RetinaNet object detector can achieve a localization mAP of 0.78 and a ResNet classifier can achieve a top-5 sign classification accuracy of 0.89. The end-to-end pipeline achieves a top-5 classification accuracy of 0.80. As part of the classification module, DeepScribe groups cuneiform signs into morphological clusters. We consider how this automatic clustering approach differs from the organization of standard, printed sign lists and what we may learn from it. These components, trained individually, are sufficient to produce a system that can analyze photos of cuneiform tablets from the Achaemenid period and provide useful transliteration suggestions to researchers. We evaluate the model's end-to-end performance on locating and classifying signs, providing a roadmap to a linguistically-aware transliteration system, then consider the model's potential utility when applied to other periods of cuneiform writing.
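The abstract describes a two-stage pipeline: an object detector proposes sign bounding boxes, each box is cropped, and a classifier returns top-k identity suggestions for the researcher. The sketch below shows that control flow only; it is a minimal illustration, with stub functions standing in for the RetinaNet detector and ResNet classifier, and all sign labels, scores, and thresholds are illustrative, not taken from the paper.

```python
# Hedged sketch of a detect-then-classify transcription pipeline like
# DeepScribe's. detect_signs() and classify_sign() are stand-ins for the
# trained RetinaNet and ResNet models; their outputs here are hard-coded
# examples so the pipeline logic is runnable on its own.

def detect_signs(image):
    """Stand-in for RetinaNet: return (bbox, confidence) candidates."""
    return [((10, 10, 42, 42), 0.91),   # confident detection, kept
            ((50, 12, 80, 44), 0.34)]   # low-confidence box, filtered out

def classify_sign(crop, k=5):
    """Stand-in for ResNet: return top-k (sign_label, probability) pairs."""
    hypotheses = [("HAL", 0.41), ("MAR", 0.22), ("RIS", 0.13),
                  ("AN", 0.09), ("NA", 0.05)]
    return hypotheses[:k]

def transcribe(image, det_threshold=0.5, k=5):
    """Keep confident detections, crop each, attach top-k sign suggestions."""
    suggestions = []
    for bbox, score in detect_signs(image):
        if score < det_threshold:
            continue  # discard low-confidence boxes before classification
        x0, y0, x1, y1 = bbox
        crop = None  # with a real image array: image[y0:y1, x0:x1]
        suggestions.append({"bbox": bbox, "top_k": classify_sign(crop, k)})
    return suggestions

results = transcribe(image=None)
```

Decoupling the two stages this way matches the abstract's note that the detector and classifier are trained individually; either component can be retrained or swapped without touching the other.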
Problem

Research questions and friction points this paper is trying to address.

Automated Recognition
Cuneiform Script
Archaeological Analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

DeepScribe
RetinaNet
ResNet
Edward C. Williams (Independent Researcher, USA)
Grace Su (Carnegie Mellon University; Computer Vision)
Sandra R. Schloen (Digital Studies, University of Chicago, USA)
Miller C. Prosser (Digital Studies, University of Chicago, USA)
Susanne Paulus (Oriental Institute, University of Chicago, USA)
Sanjay Krishnan (University of Chicago; Databases, Machine Learning)