🤖 AI Summary
Archaeobotany has long relied on manual expert identification and has lacked both labeled data and dedicated methods for classifying ancient plant seeds. To address this, the paper introduces APS, the first benchmark image dataset of archaeological plant seeds, comprising 8,340 images of 17 seed taxa from 18 Chinese archaeological sites. It further proposes APSNet, a deep learning model designed for this task, with two key components: (1) a Size Perception and Embedding (SPE) module that explicitly encodes seed size, and (2) an Asynchronous Decoupled Decoding (ADD) architecture that decodes features from both the channel and spatial perspectives. By combining fine-grained visual representation learning with explicit size encoding, APSNet achieves 90.5% top-1 accuracy on APS, outperforming existing state-of-the-art image classification methods. The work provides a benchmark and a tool for large-scale, systematic identification of archaeological plant seeds.
📝 Abstract
Understanding the dietary preferences of ancient societies and their evolution across periods and regions is crucial for revealing human-environment interactions. Seeds, as important archaeological artifacts, are a fundamental subject of archaeobotanical research. However, traditional studies rely heavily on expert knowledge, which makes them time-consuming and inefficient. Intelligent analysis methods have made progress in various fields of archaeology, but archaeobotany still lacks both data and methods, especially for classifying ancient plant seeds. To address this, we construct the first Ancient Plant Seed Image Classification (APS) dataset. It contains 8,340 images from 17 genus- or species-level seed categories excavated from 18 archaeological sites across China. In addition, we design a framework specifically for ancient plant seed classification (APSNet), which supplements fine-grained visual learning with the scale (size) of seeds to guide the network toward the key "evidence" needed for classification. Specifically, we design a Size Perception and Embedding (SPE) module in the encoder to explicitly extract size information that complements the fine-grained information. We also propose an Asynchronous Decoupled Decoding (ADD) architecture, based on traditional progressive learning, that decodes features from both channel and spatial perspectives, enabling efficient learning of discriminative features. In both quantitative and qualitative analyses, our approach surpasses existing state-of-the-art image classification methods, achieving an accuracy of 90.5%. These results indicate that our work provides an effective tool for large-scale, systematic archaeological research.
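The abstract does not say how the SPE module computes or embeds size, so the following is only a minimal sketch of the general idea of pairing an explicit size descriptor with visual features. It is not the APSNet implementation. The function names `size_descriptor` and `embed_with_size`, the binary-mask input, the `mm_per_pixel` calibration constant, and the log scaling are all assumptions made for illustration.

```python
import math

def size_descriptor(mask, mm_per_pixel):
    """Return (area_mm2, length_mm, width_mm) for a binary seed mask.

    Hypothetical helper: `mask` is a list of rows of 0/1 values and
    `mm_per_pixel` is an assumed calibration constant (for example,
    derived from a scale bar in the photograph).
    """
    coords = [(r, c) for r, row in enumerate(mask)
              for c, value in enumerate(row) if value]
    if not coords:
        return (0.0, 0.0, 0.0)
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    # Axis-aligned bounding box of the seed, converted to millimetres.
    height = (max(rows) - min(rows) + 1) * mm_per_pixel
    width = (max(cols) - min(cols) + 1) * mm_per_pixel
    # Pixel count converted to a physical area.
    area = len(coords) * mm_per_pixel ** 2
    return (area, max(height, width), min(height, width))

def embed_with_size(visual_features, size):
    """Append log-scaled size values to a visual feature vector.

    Plain concatenation is an assumption; the paper's embedding may differ.
    """
    return list(visual_features) + [math.log1p(s) for s in size]

# Toy example: a 3x4 mask at an assumed 0.5 mm per pixel.
mask = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
size = size_descriptor(mask, mm_per_pixel=0.5)
features = embed_with_size([0.2, -0.1], size)
```

In this toy setup, two seeds that look alike after resizing to a fixed input resolution would still produce different feature vectors if their physical dimensions differ, which is the kind of complementary cue the abstract attributes to size information.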