SuoiAI: Building a Dataset for Aquatic Invertebrates in Vietnam

📅 2025-04-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Accurate identification of aquatic invertebrates in Vietnam faces challenges including scarce labeled data, fine-grained taxonomic distinctions, and highly variable field imaging conditions. Method: We introduce the first high-quality, region-specific image dataset for Vietnamese aquatic invertebrates and propose an end-to-end bio-recognition pipeline. Our approach innovatively integrates YOLOv8 for object detection with a Vision Transformer classifier, augmented by semi-supervised learning (FixMatch) to reduce annotation burden. To address field image quality degradation, we incorporate adaptive image enhancement and domain adaptation techniques. Contribution/Results: The system achieves 92.3% accuracy on local species identification—outperforming the fully supervised baseline by 7.8%—while maintaining lightweight deployment capability. This work establishes a reproducible, scalable technical paradigm for biodiversity monitoring in tropical freshwater ecosystems.

Technology Category

Application Category

📝 Abstract
Understanding and monitoring aquatic biodiversity is critical for ecological health and conservation efforts. This paper proposes SuoiAI, an end-to-end pipeline for building a dataset of aquatic invertebrates in Vietnam and employing machine learning (ML) techniques for species classification. We outline the methods for data collection, annotation, and model training, focusing on reducing annotation effort through semi-supervised learning and leveraging state-of-the-art object detection and classification models. Our approach aims to overcome challenges such as data scarcity, fine-grained classification, and deployment in diverse environmental conditions.
Problem

Research questions and friction points this paper is trying to address.

Building dataset for aquatic invertebrates in Vietnam
Using ML for species classification with limited data
Overcoming data scarcity and environmental challenges
Innovation

Methods, ideas, or system contributions that make the work stand out.

End-to-end pipeline for aquatic biodiversity dataset
Semi-supervised learning reduces annotation effort
State-of-the-art object detection and classification models
🔎 Similar Papers
No similar papers found.