AI Summary
Existing global trait databases are strongly biased toward vertebrates and plants and largely overlook highly diverse invertebrate groups such as ground beetles (Carabidae). NEON's carabid collection, meanwhile, exists primarily as physical specimens, which constrains access and large-scale analysis. This study addresses both gaps by producing high-resolution images of over 13,200 NEON-collected carabid specimens from 30 sites across the continental United States and Hawaii. Digital morphometrics, validated against manual measurements, yields elytral length and width for each specimen at sub-millimeter precision. The resulting open-access, multimodal dataset helps fill the gap in invertebrate trait data and provides a foundation for AI-driven automated species identification and trait-based ecological research.
Abstract
Despite the ecological significance of invertebrates, global trait databases remain heavily biased toward vertebrates and plants, limiting comprehensive ecological analyses of high-diversity groups like ground beetles. Ground beetles (Coleoptera: Carabidae) serve as critical bioindicators of ecosystem health, providing valuable insights into biodiversity shifts driven by environmental change. While the National Ecological Observatory Network (NEON) maintains an extensive collection of carabid specimens from across the United States, these exist primarily as physical collections, restricting widespread research access and large-scale analysis. To address these gaps, we present a multimodal dataset that digitizes over 13,200 NEON carabids from 30 sites spanning the continental US and Hawaii through high-resolution imaging, enabling broader access and computational analysis. The dataset includes the digitally measured elytral length and width of each specimen, establishing a foundation for automated trait extraction using AI. Validated against manual measurements, our digital trait extraction achieves sub-millimeter precision, ensuring reliability for ecological and computational studies. By addressing the under-representation of invertebrates in trait databases, this work supports AI-driven tools for automated species identification and trait-based research, fostering advances in biodiversity monitoring and conservation.
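To make the validation step concrete, the following is a minimal sketch of how digital elytral measurements could be compared with manual ones. It is not the authors' pipeline. The scale factor `PX_PER_MM`, the helper names, the sample values, and the use of mean absolute error as the precision metric are all assumptions made for illustration; the abstract does not state which metric or calibration method was used.

```python
from statistics import mean


def pixels_to_mm(length_px: float, px_per_mm: float) -> float:
    """Convert a length measured in image pixels to millimeters."""
    if px_per_mm <= 0:
        raise ValueError("px_per_mm must be positive")
    return length_px / px_per_mm


def mean_absolute_error(digital_mm: list[float], manual_mm: list[float]) -> float:
    """Average absolute difference between paired measurements, in mm."""
    if not digital_mm or len(digital_mm) != len(manual_mm):
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(d - m) for d, m in zip(digital_mm, manual_mm))


# Hypothetical calibration: pixels per millimeter, e.g. from a scale bar.
PX_PER_MM = 100.0

# Hypothetical elytral lengths: digital (pixels) and manual (mm) per specimen.
elytra_length_px = [652.0, 710.0, 598.0]
manual_length_mm = [6.50, 7.15, 6.00]

digital_length_mm = [pixels_to_mm(px, PX_PER_MM) for px in elytra_length_px]
mae_mm = mean_absolute_error(digital_length_mm, manual_length_mm)

# "Sub-millimeter" is interpreted here as an average error below 1 mm.
is_sub_millimeter = mae_mm < 1.0
print(f"MAE: {mae_mm:.3f} mm, sub-millimeter: {is_sub_millimeter}")
```

The same comparison applies to elytral width. A real validation would likely also report per-specimen error and check for systematic bias rather than relying on a single average.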