Signs of the Past, Patterns of the Present: On the Automatic Classification of Old Babylonian Cuneiform Signs

📅 2025-07-18

🤖 AI Summary

Problem: Cuneiform signs vary widely within a single class, depending on provenance, scribe, and digitization method, so a model trained on one dataset is unlikely to generalize to another. Method: The paper presents the first automatic classification study of handwritten Old Babylonian cuneiform signs. It trains and evaluates ResNet50 on documentary texts inscribed on clay tablets from Nippur, Dūr-Abiešuh, and Sippar, and studies how differences between datasets affect performance. Results: On signs with at least 20 instances, the model achieves 87.1% top-1 and 96.5% top-5 accuracy. Because these are the first automatic classification results on Old Babylonian texts, no comparable results exist yet. The authors aim to influence future data acquisition standards and to provide a foundation for future cuneiform sign classification tasks.

📝 Abstract

This paper describes the training and evaluation of machine learning (ML) techniques for the classification of cuneiform signs. Cuneiform signs vary considerably depending on where they come from, for what purpose and by whom they were written, and how they were digitized. This variability makes it unlikely that an ML model trained on one dataset will perform well on another. This contribution studies how such differences affect performance. Based on our results and insights, we aim to influence future data acquisition standards and provide a solid foundation for future cuneiform sign classification tasks. The ML model was trained and tested on handwritten Old Babylonian (c. 2000-1600 B.C.E.) documentary texts inscribed on clay tablets originating from three Mesopotamian cities (Nippur, Dūr-Abiešuh and Sippar). The model presented and analysed is ResNet50, which achieves a top-1 score of 87.1% and a top-5 score of 96.5% for signs with at least 20 instances. As these are the first automatic classification results on Old Babylonian texts, there are currently no comparable results.
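The top-1 and top-5 scores quoted above can be made concrete with a minimal sketch. It assumes the usual definition (a prediction counts as correct when the true sign is among the k highest-scoring classes) and assumes that the "at least 20 instances" filter is applied by counting labels in the evaluated set; the abstract does not say which split the count refers to. The function name `top_k_accuracy` and its parameters are hypothetical.

```python
from collections import Counter


def top_k_accuracy(scores, labels, k=1, min_instances=1):
    """Fraction of evaluated samples whose true class is among the k best scores.

    scores: list of per-sample score lists, one score per class index.
    labels: list of true class indices, aligned with scores.
    min_instances: classes with fewer labelled samples are skipped
                   (assumption: counted over `labels` itself).
    """
    counts = Counter(labels)
    hits = 0
    total = 0
    for row, label in zip(scores, labels):
        if counts[label] < min_instances:
            continue  # class too rare to be evaluated
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        total += 1
        if label in ranked[:k]:
            hits += 1
    if total == 0:
        raise ValueError("no samples left after filtering by min_instances")
    return hits / total


# Toy data: 4 samples, 3 sign classes (made-up scores, not from the paper).
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.5, 0.4, 0.1],
    [0.2, 0.3, 0.5],
]
labels = [0, 2, 0, 1]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
top1_frequent = top_k_accuracy(scores, labels, k=1, min_instances=2)
```

On real data the call would be `top_k_accuracy(scores, labels, k=5, min_instances=20)`; restricting to frequent classes generally raises the score, which is why the threshold should be reported alongside the accuracy.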

Problem

Research questions and friction points this paper is trying to address.

Classifying Old Babylonian cuneiform signs using ML techniques

Assessing how variability between datasets affects ML model performance

Informing future data acquisition standards and laying a foundation for future cuneiform sign classification
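One common way to probe how a model transfers between datasets is a leave-one-site-out split: train on all sites but one and test on the held-out site. The sketch below illustrates that idea only. The abstract does not state which evaluation protocol the authors used, and the function name `leave_one_site_out` and the `site` and `sign` keys are hypothetical.

```python
def leave_one_site_out(samples):
    """Yield (held_out_site, train, test) for every site in the data.

    samples: list of dicts, each with at least a "site" key
             (hypothetical record layout, for illustration only).
    """
    sites = sorted({s["site"] for s in samples})
    for held_out in sites:
        train = [s for s in samples if s["site"] != held_out]
        test = [s for s in samples if s["site"] == held_out]
        yield held_out, train, test


# Toy records; the sign names are placeholders, not data from the paper.
samples = [
    {"site": "Nippur", "sign": "A"},
    {"site": "Sippar", "sign": "A"},
    {"site": "Nippur", "sign": "B"},
    {"site": "Dūr-Abiešuh", "sign": "A"},
]

splits = list(leave_one_site_out(samples))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

Comparing accuracy on a held-out site with accuracy on a random split of the pooled data gives a rough measure of how much site-specific variability costs.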

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses ResNet50 for cuneiform sign classification

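Trains and tests the model on handwritten Old Babylonian documentary texts from Nippur, Dūr-Abiešuh and Sippar

Achieves 87.1% top-1 and 96.5% top-5 accuracy on signs with at least 20 instances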
