A continental-scale dataset of ground beetles with high-resolution images and validated morphological trait measurements

📅 2026-01-14
📈 Citations: 1
✨ Influential: 0
🤖 AI Summary
Existing global trait databases are strongly biased toward vertebrates and plants and largely overlook highly diverse invertebrate groups such as ground beetles (Carabidae). NEON's carabid collections exist mainly as physical specimens, which constrains access and large-scale analysis. This study addresses both gaps with high-resolution images of over 13,200 NEON-collected carabid specimens from 30 sites across the continental United States and Hawaii. Elytral length and width are measured digitally for each specimen and validated against manual measurements, reaching sub-millimeter precision. The resulting open-access, multimodal dataset adds invertebrate trait data to an under-represented area and provides a resource for AI-driven automated species identification and trait-based ecological research.

πŸ“ Abstract
Despite the ecological significance of invertebrates, global trait databases remain heavily biased toward vertebrates and plants, limiting comprehensive ecological analyses of high-diversity groups like ground beetles. Ground beetles (Coleoptera: Carabidae) serve as critical bioindicators of ecosystem health, providing valuable insights into biodiversity shifts driven by environmental changes. While the National Ecological Observatory Network (NEON) maintains an extensive collection of carabid specimens from across the United States, these primarily exist as physical collections, restricting widespread research access and large-scale analysis. To address these gaps, we present a multimodal dataset digitizing over 13,200 NEON carabids from 30 sites spanning the continental US and Hawaii through high-resolution imaging, enabling broader access and computational analysis. The dataset includes digitally measured elytra length and width of each specimen, establishing a foundation for automated trait extraction using AI. Validated against manual measurements, our digital trait extraction achieves sub-millimeter precision, ensuring reliability for ecological and computational studies. By addressing invertebrate under-representation in trait databases, this work supports AI-driven tools for automated species identification and trait-based research, fostering advancements in biodiversity monitoring and conservation.
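The abstract states that elytra length and width are measured digitally and validated against manual measurements, but it does not describe the procedure. The following is a minimal sketch of one way such a check could work, not the authors' pipeline. It assumes each image contains a scale bar of known physical length and that each trait is annotated as two pixel endpoints. All function names, coordinates, and measurement values are hypothetical.

```python
import math
from statistics import mean


def pixel_distance(p, q):
    """Euclidean distance in pixels between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def to_millimeters(length_px, scale_bar_px, scale_bar_mm=10.0):
    """Convert a pixel length to millimeters using a scale bar.

    Assumption: the scale bar spans `scale_bar_mm` millimeters in the image.
    """
    if scale_bar_px <= 0:
        raise ValueError("scale bar length must be positive")
    return length_px * scale_bar_mm / scale_bar_px


def mean_absolute_error(digital_mm, manual_mm):
    """Mean absolute difference between paired digital and manual values."""
    if not digital_mm or len(digital_mm) != len(manual_mm):
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(d - m) for d, m in zip(digital_mm, manual_mm))


# Hypothetical annotation for one specimen image (pixel coordinates).
scale_bar_px = pixel_distance((100, 50), (600, 50))  # 500 px spans 10 mm
elytra_length_mm = to_millimeters(
    pixel_distance((200, 300), (200, 650)), scale_bar_px
)
elytra_width_mm = to_millimeters(
    pixel_distance((120, 400), (320, 400)), scale_bar_px
)

# Hypothetical caliper measurements for the same specimen (mm).
manual_mm = [7.1, 3.9]
mae = mean_absolute_error([elytra_length_mm, elytra_width_mm], manual_mm)
print(f"length={elytra_length_mm:.2f} mm, width={elytra_width_mm:.2f} mm, MAE={mae:.2f} mm")
```

In this sketch, a mean absolute error below 1 mm across paired measurements would correspond to the sub-millimeter agreement the abstract reports. The actual error metric used in the paper is not given here.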
Problem

Research questions and friction points this paper is trying to address.

trait database
invertebrate under-representation
ground beetles
ecological analysis
specimen accessibility
Innovation

Methods, ideas, or system contributions that make the work stand out.

high-resolution imaging
digital trait extraction
automated morphological measurement
AI-driven biodiversity monitoring
multimodal dataset
S. M. Rayeed
Rensselaer Polytechnic Institute, Department of Computer Science, Troy NY, 12180, USA
Mridul Khurana
Virginia Tech
Computer Vision, Machine Learning, Generative AI, AI for Science
Alyson East
University of Maine
Landscape Ecology, Remote Sensing, Biodiversity
I. Fluck
University of Florida, Department of Wildlife Ecology and Conservation, Gainesville FL, 32611, USA
Elizabeth G. Campolongo
The Ohio State University, Department of Computer Science and Engineering, Columbus OH, 43210, USA
Samuel Stevens
PhD student, The Ohio State University
Natural language processing
Iuliia Zarubiieva
Vector Institute, Toronto ON, M5G 0C6, Canada
Scott C. Lowe
Postdoctoral Research Fellow, Vector Institute
Machine Learning, Deep learning, Neuroinformatics, Self-supervision, Reasoning
Michael Denslow
SERNEC (SouthEast Regional Network of Expertise and Collections)
Biodiversity Informatics, Ecology, Biogeography
Evan D. Donoso
National Ecological Observatory Network, Pu’u Maka’ala Natural Area Reserve, Hilo HI, 96720, USA
Jiaman Wu
The Ohio State University
machine learning, deep learning
Michelle Ramirez
The Ohio State University, Department of Computer Science and Engineering, Columbus OH, 43210, USA
Benjamin Baiser
University of Florida, Department of Wildlife Ecology and Conservation, Gainesville FL, 32611, USA
Charles V. Stewart
Professor of Computer Science, Rensselaer Polytechnic Institute
Computer Vision, Machine Learning, Applications in Wildlife Ecology
P. Mabee
Battelle, National Ecological Observatory Network, Boulder, CO 80301, USA
Tanya Y. Berger-Wolf
The Ohio State University, Imageomics Institute & ABC Global Climate Center, Columbus OH, 43210, USA
A. Karpatne
Virginia Tech, Department of Computer Science, Blacksburg VA, 24061, USA
H. Lapp
Duke University, Department of Biostatistics and Bioinformatics, Durham NC, 27708, USA
Robert P. Guralnick
University of Florida, Florida Museum of Natural History, Gainesville FL, 32611, USA
Graham Taylor
University of Guelph and Vector Institute for Artificial Intelligence
Machine Learning
Sydne Record
Professor, University of Maine
Biogeography, Community Ecology