🤖 AI Summary
Wildlife individual re-identification suffers from scarce image data and poor model interpretability. Method: This work pioneers the integration of forensic-inspired dermatoglyphic textual descriptions (e.g., topological semantic labels of stripe patterns) with visual features to build a cross-modal tiger individual retrieval system. Specifically, it (1) introduces interpretable dermatoglyphic semantic modeling into ecology for the first time; (2) proposes a text–image co-synthesis pipeline that generates synthetic tiger images with precise dermatoglyphic annotations; and (3) combines cross-modal contrastive learning with topological encoding of micro-features (84,264 minutiae). Results: Evaluated on 3,355 real tiger images of 185 individuals, the method significantly improves the accuracy of cross-modal retrieval from textual queries, alleviating the few-shot bottleneck while enabling interpretable, verifiable tracing of individual identity. A minimal sketch of the contrastive objective follows below.
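The third component above, cross-modal contrastive learning between dermatoglyphic text and tiger images, can be illustrated with a minimal CLIP-style sketch. This is a hypothetical illustration rather than the paper's released code: the symmetric InfoNCE loss, batch size, embedding dimension, and random stand-in embeddings are all assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def cross_modal_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings.

    image_emb, text_emb: (B, D) tensors where row i of each tensor comes from
    the same tiger individual (a visual crop and its dermatoglyphic description).
    """
    # L2-normalise so the dot product is a cosine similarity
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # (B, B) similarity matrix; diagonal entries are the positive pairs
    logits = image_emb @ text_emb.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)

    # Pull matched image/text pairs together and push mismatched pairs apart,
    # in both retrieval directions (image -> text and text -> image)
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return 0.5 * (loss_i2t + loss_t2i)

# Toy usage with stand-in embeddings; in practice these would come from the
# image encoder and the dermatoglyphic-text encoder described in the paper.
if __name__ == "__main__":
    batch, dim = 8, 256
    img = torch.randn(batch, dim, requires_grad=True)
    txt = torch.randn(batch, dim, requires_grad=True)
    loss = cross_modal_contrastive_loss(img, txt)
    loss.backward()
    print(f"contrastive loss: {loss.item():.4f}")
```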
📝 Abstract
Biologists have long combined visuals with textual field notes to re-identify (Re-ID) animals. Contemporary AI tools automate this for species with distinctive morphological features but remain largely image-based. Here, we extend Re-ID methodologies by incorporating precise dermatoglyphic textual descriptors, an approach used in forensics but new to ecology. We demonstrate that these specialist semantics abstract and encode animal coat topology using human-interpretable language tags. Drawing on 84,264 manually labelled minutiae across 3,355 images of 185 tigers (Panthera tigris), we evaluate this visual-textual methodology, revealing novel capabilities for cross-modal identity retrieval. To optimise performance, we developed a text-image co-synthesis pipeline to generate 'virtual individuals', each comprising dozens of lifelike visuals paired with dermatoglyphic text. Benchmarking against real-world scenarios shows that this augmentation significantly boosts AI accuracy in cross-modal retrieval while alleviating data scarcity. We conclude that dermatoglyphic language-guided biometrics can overcome vision-only limitations, enabling textual-to-visual identity recovery underpinned by human-verifiable matches. This represents a significant advance towards explainability in Re-ID and a language-driven unification of descriptive modalities in ecological monitoring.
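To make the textual-to-visual identity recovery concrete, the sketch below shows how a single dermatoglyphic text query could be ranked against a gallery of image embeddings by cosine similarity. It is an illustrative sketch only; the function name retrieve_by_text, the embedding dimension, and the random stand-in gallery are assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def retrieve_by_text(query_text_emb, gallery_image_embs, gallery_ids, top_k=5):
    """Rank gallery images of known individuals against one textual query.

    query_text_emb:     (D,)   embedding of a dermatoglyphic description
    gallery_image_embs: (N, D) embeddings of gallery images
    gallery_ids:        list of N individual identifiers
    Returns the top_k (individual_id, cosine_similarity) pairs.
    """
    q = F.normalize(query_text_emb, dim=-1)
    g = F.normalize(gallery_image_embs, dim=-1)
    sims = g @ q                                  # (N,) cosine similarities
    scores, idx = sims.topk(min(top_k, sims.numel()))
    return [(gallery_ids[i], s.item()) for i, s in zip(idx.tolist(), scores)]

# Toy usage with random stand-in embeddings; real embeddings would come from
# the trained text and image encoders.
if __name__ == "__main__":
    gallery = torch.randn(100, 256)
    ids = [f"tiger_{i:03d}" for i in range(100)]
    query = torch.randn(256)
    for tid, score in retrieve_by_text(query, gallery, ids, top_k=3):
        print(tid, round(score, 3))
```

Because every query is a human-readable description of stripe minutiae, each returned match can be checked against the gallery image by eye, which is the human-verifiable matching the abstract refers to.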