Overview of GeoLifeCLEF 2023: Species Composition Prediction with High Spatial Resolution at Continental Scale Using Remote Sensing

📅 2025-09-30
🤖 AI Summary
This overview paper addresses continental-scale prediction of plant species composition through the GeoLifeCLEF 2023 challenge. Participants built deep learning models that combine high-resolution Sentinel-2 satellite imagery with multi-source environmental covariates (land cover, climate, soil properties, elevation, and human footprint) to predict multi-label species assemblages in 22,000 standardized European vegetation plots. The paper synthesizes the participating teams' approaches and analyzes the main results. It highlights the bias that arises when models fitted to single positive labels are evaluated on a multi-label task, and it reports a new and effective training strategy that combines single-label and multi-label data.

📝 Abstract
Understanding the spatio-temporal distribution of species is a cornerstone of ecology and conservation. By pairing species observations with geographic and environmental predictors, researchers can model the relationship between an environment and the species that may be found there. To advance the state of the art in this area with deep learning models and remote sensing data, we organized an open machine learning challenge called GeoLifeCLEF 2023. The training dataset comprised 5 million plant species observations (a single positive label per sample) distributed across Europe and covering most of its flora, together with high-resolution rasters (remote sensing imagery, land cover, and elevation) and coarse-resolution data (climate, soil, and human footprint variables). In this multi-label classification task, we evaluated the models' ability to predict the species composition of 22 thousand small plots based on standardized surveys. This paper presents an overview of the competition, synthesizes the approaches used by the participating teams, and analyzes the main results. In particular, we highlight the biases faced by methods fitted to single positive labels when they are evaluated on the multi-label task, and the new and effective learning strategy that combines single- and multi-label data in training.
Problem

Research questions and friction points this paper is trying to address.

Predicting species composition using remote sensing data
Modeling plant distribution across Europe with machine learning
Addressing single-label bias in multi-label classification tasks
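
One way to see the single-label bias is to look at what a softmax classifier outputs. The sketch below is an illustration only, not code from the challenge: the logits are hypothetical, and the helper names (`softmax`, `predict_by_threshold`, `predict_top_k`) are made up for this example. Because softmax probabilities sum to 1 across species, a plot that truly hosts several species spreads its probability mass among them, and a fixed probability threshold can return an empty set even though the ranking is sensible.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_by_threshold(probs, threshold):
    """Return the indices of species whose probability reaches the threshold."""
    return {i for i, p in enumerate(probs) if p >= threshold}

def predict_top_k(probs, k):
    """Return the indices of the k highest-scoring species."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return set(ranked[:k])

# Hypothetical scores for 5 species at one plot; three species score high.
logits = [2.0, 1.8, 1.7, -1.0, -2.0]
probs = softmax(logits)

# No species reaches 0.5 because the three likely ones share the mass.
empty_prediction = predict_by_threshold(probs, 0.5)

# Ranking-based selection recovers the three plausible species.
top3_prediction = predict_top_k(probs, 3)
```

Choosing how many species to return per plot (a fixed k, a tuned threshold, or something adaptive) is then a separate decision that a single-label training signal does not settle on its own.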
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using remote sensing data for species prediction
Multi-label classification with single positive labels
Combining single and multi-label data in training
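
The summary does not describe how single- and multi-label data were combined, so the following is only a minimal sketch of one plausible setup, under stated assumptions: softmax cross-entropy on single-positive observations, per-species binary cross-entropy on fully surveyed plots, and a hypothetical `multi_weight` that balances the two terms. All function names are illustrative and are not taken from any participant's code.

```python
import math

def cross_entropy_single(logits, positive):
    """Softmax cross-entropy for a sample with one known positive species."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[positive]

def binary_cross_entropy_multi(logits, present):
    """Mean per-species binary cross-entropy for a fully surveyed plot.

    `present` is the set of species indices recorded in the plot; every
    other species is treated as a true absence.
    """
    total = 0.0
    for i, x in enumerate(logits):
        # softplus(x) = log(1 + exp(x)), computed in a numerically stable way
        softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
        total += softplus - (x if i in present else 0.0)
    return total / len(logits)

def combined_loss(single_batch, multi_batch, multi_weight=1.0):
    """Weighted sum of the mean single-label and mean multi-label losses.

    single_batch: list of (logits, positive_index) pairs
    multi_batch:  list of (logits, set_of_present_indices) pairs
    multi_weight: assumed balancing factor, not a value from the paper
    """
    single = sum(cross_entropy_single(l, y) for l, y in single_batch) / len(single_batch)
    multi = sum(binary_cross_entropy_multi(l, s) for l, s in multi_batch) / len(multi_batch)
    return single + multi_weight * multi

# Hypothetical model outputs over 3 species.
single_batch = [([0.0, 0.0, 0.0], 1)]
multi_batch = [([0.0, 0.0, 0.0], {0, 2})]
loss = combined_loss(single_batch, multi_batch, multi_weight=0.5)
```

The point of the second term is that surveyed plots provide explicit absences, which single-positive observations never do; how strongly to weight it is a tuning choice left open here.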
Christophe Botella
INRIA, LIRMM, Univ Montpellier, CNRS, Montpellier, France
Benjamin Deneu
INRIA, LIRMM, Univ Montpellier, CNRS, Montpellier, France; AMAP, Univ Montpellier, CIRAD, CNRS, INRAE, IRD, Montpellier, France
Diego Marcos
Junior Professor at Inria, Montpellier
Machine Learning, Remote Sensing
Maximilien Servajean
LIRMM - UPVM
machine learning, ecology, data science
Théo Larcher
INRIA, LIRMM, Univ Montpellier, CNRS, Montpellier, France
César Leblanc
INRIA, LIRMM, Univ Montpellier, CNRS, Montpellier, France; AMAP, Univ Montpellier, CIRAD, CNRS, INRAE, IRD, Montpellier, France
Joaquim Estopinan
INRIA, LIRMM, Univ Montpellier, CNRS, Montpellier, France; AMAP, Univ Montpellier, CIRAD, CNRS, INRAE, IRD, Montpellier, France
Pierre Bonnet
Professor, Institut Pascal, Clermont Université
Electromagnetic compatibility, Maxwell, numerical methods, FVTD, bioelectromagnetism
Alexis Joly
Research Director, Inria, Montpellier University, LIRMM
machine learning, biodiversity, information retrieval, plant identification