Floralens: a Deep Learning Model for the Portuguese Native Flora

📅 2024-02-13
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
High-quality benchmark datasets and lightweight models for native Portuguese plant image recognition are lacking. Method: We introduce Floralens—the first authoritative, multi-source (Botanical Society of Portugal + GBIF), publicly available image dataset for native Portuguese flora—alongside an efficient deep convolutional neural network model trained via Google AutoML Vision. Our methodology includes rigorous data cleaning, cross-source annotation alignment, and an end-to-end open, reproducible pipeline. Contribution/Results: The model achieves accuracy comparable to Pl@ntNet and is deployed online on Project Biolens. Both the dataset and model are openly released via Zenodo. This work fills a critical gap in Iberian Peninsula plant intelligence—providing the first dedicated dataset and deployable model for native flora—and establishes a scalable, citizen-science–enabled paradigm for biodiversity monitoring.

📝 Abstract
Machine-learning techniques, especially deep convolutional neural networks, are pivotal for image-based identification of biological species in many Citizen Science platforms. In this paper, we describe the construction of a dataset for the Portuguese native flora based on publicly available research-grade datasets, and the derivation of a high-accuracy model from it using off-the-shelf deep convolutional neural networks. We anchored the dataset in high-quality data provided by Sociedade Portuguesa de Botânica and added further sampled data from research-grade datasets available from GBIF. We find that with a careful dataset design, off-the-shelf machine-learning cloud services such as Google's AutoML Vision produce accurate models, with results comparable to those of Pl@ntNet, a state-of-the-art citizen science platform. The best model we derived, dubbed Floralens, has been integrated into the public website of Project Biolens, where we gather models for other taxa as well. The dataset used to train the model is also publicly available on Zenodo.
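The abstract describes anchoring the dataset in one trusted source and adding sampled data from a second one. A minimal sketch of that idea follows, under stated assumptions: the abstract does not give the sampling rule, so the per-species target, the record layout `(species, image_id)`, and the function name `build_dataset` are all hypothetical and are not the authors' procedure.

```python
import random
from collections import defaultdict


def build_dataset(anchor, extra, target_per_species=50, seed=0):
    """Combine an anchor source with sampled records from a second source.

    anchor, extra: lists of (species, image_id) tuples (hypothetical layout).
    All anchor records are kept. Each anchor species with fewer than
    target_per_species images is topped up with a random sample from extra.
    Species that appear only in extra are ignored.
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible

    by_species = defaultdict(list)
    for species, image_id in anchor:
        by_species[species].append(image_id)

    # Index the secondary source, restricted to species in the anchor set.
    pool = defaultdict(list)
    for species, image_id in extra:
        if species in by_species:
            pool[species].append(image_id)

    for species, images in by_species.items():
        missing = target_per_species - len(images)
        if missing > 0:
            # Drop images already present, then sample without replacement.
            candidates = sorted(set(pool[species]) - set(images))
            images.extend(rng.sample(candidates, min(missing, len(candidates))))

    return dict(by_species)


if __name__ == "__main__":
    anchor = [("Quercus suber", "a1"), ("Cistus ladanifer", "b1")]
    extra = [("Quercus suber", "g1"), ("Quercus suber", "g2"), ("Pinus pinea", "p1")]
    print(build_dataset(anchor, extra, target_per_species=2))
```

Keeping every anchor record and capping only the sampled additions is one way to let the higher-quality source define the species list while the second source fills gaps; other balancing rules would work equally well in this sketch.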
Problem

Research questions and friction points this paper is trying to address.

Develops deep learning model for Portuguese flora identification
Creates dataset from public research-grade botanical sources
Compares model accuracy with state-of-the-art platforms
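The page does not state which metric the accuracy comparison uses. As an illustration only, the sketch below computes top-k accuracy, a common choice for species identification, over ranked predictions. The function name `top_k_accuracy` and the toy labels are assumptions made for this example, not results from the paper.

```python
def top_k_accuracy(predictions, truths, k=1):
    """Fraction of samples whose true label is among the top-k predictions.

    predictions: list of label lists, each ranked best first.
    truths: list of true labels, aligned with predictions.
    """
    if len(predictions) != len(truths):
        raise ValueError("predictions and truths must have the same length")
    if not truths:
        return 0.0
    hits = sum(1 for ranked, truth in zip(predictions, truths) if truth in ranked[:k])
    return hits / len(truths)


if __name__ == "__main__":
    # Toy rankings for three images; the labels are placeholders, not real data.
    ranked = [["a", "b"], ["b", "a"], ["c", "a"]]
    truth = ["a", "a", "b"]
    print(top_k_accuracy(ranked, truth, k=1))
    print(top_k_accuracy(ranked, truth, k=2))
```

Running the same function over the ranked outputs of two systems on a shared test set gives directly comparable numbers, provided both systems use the same species labels.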
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep convolutional neural networks for species identification
Dataset construction from public research-grade sources
Model training with Google's AutoML Vision, an off-the-shelf machine-learning cloud service
António Filgueiras
Department of Computer Science, Faculty of Sciences, University of Porto
Eduardo R. B. Marques
Department of Computer Science, Faculty of Sciences, University of Porto, CRACS/INESC-TEC
Luís M. B. Lopes
Department of Computer Science, Faculty of Sciences, University of Porto, CRACS/INESC-TEC
Miguel Marques
Department of Computer Science, Faculty of Sciences, University of Porto
Hugo Silva
Department of Computer Science, Faculty of Sciences, University of Porto