Applying the maximum entropy principle to multi-species neural networks improves species distribution models

📅 2024-12-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
Citizen science presence-only (PO) data suffer from severe sampling bias and lack explicit information on absences or detection effort. To address these challenges, we propose DeepMaxent, which embeds the maximum entropy principle in a deep neural network shared across species, so that features of the environmental covariates are learned automatically and jointly for all species. The model is trained with a normalized Poisson loss that keeps the point-process interpretation of PO modeling while scaling to large numbers of sites. In benchmark evaluations across six regions with different biological groups, DeepMaxent outperforms Maxent and other state-of-the-art species distribution modeling (SDM) methods, particularly in regions with uneven sampling, and its memory demand does not grow with the number of sites.

📝 Abstract
The rapid expansion of citizen science initiatives has led to significant growth in biodiversity databases, particularly in presence-only (PO) observations. PO data are invaluable for understanding species distributions and their dynamics, but their use in Species Distribution Models (SDMs) is curtailed by sampling biases and the lack of information on absences. Poisson point processes are widely used for SDMs, with Maxent being one of the most popular methods. Maxent maximises the entropy of a probability distribution across sites as a function of predefined transformations of environmental variables, called features. In contrast, neural networks and deep learning have emerged as a promising technique for automatic feature extraction from complex input variables. In this paper, we propose DeepMaxent, which harnesses neural networks to automatically learn shared features among species, using the maximum entropy principle. To do so, it employs a normalised Poisson loss where, for each species, presence probabilities across sites are modelled by a neural network. We evaluate DeepMaxent on a benchmark dataset known for its spatial sampling biases, using PO data for calibration and presence-absence (PA) data for validation across six regions with different biological groups and environmental covariates. Our results indicate that DeepMaxent improves model performance over Maxent and other state-of-the-art SDMs across regions and taxonomic groups. The method performs particularly well in regions of uneven sampling, demonstrating substantial potential to improve species distribution modelling. The method opens the possibility of learning more robust environmental features that jointly predict many species, and it scales to arbitrarily large numbers of sites without increased memory demand.
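
The loss described in the abstract can be illustrated with a minimal sketch. It assumes that "normalised" means each species' network outputs are turned into a probability distribution over sites with a softmax, and that the loss is the negative log-likelihood of the presence-only records under that distribution. The function names, the `scores[j][i]` layout, and the averaging over species are hypothetical choices made for illustration; this is not the authors' implementation, and the network that would produce the scores is omitted.

```python
import math


def log_softmax_over_sites(scores):
    """Log of a softmax taken across sites for one species.

    log p_i = s_i - logsumexp(s), computed stably by subtracting the max.
    """
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]


def normalised_poisson_loss(scores, counts):
    """Illustrative maxent-style loss (an assumption, not the paper's exact form).

    scores[j][i]: hypothetical network output for species j at site i.
    counts[j][i]: number of presence-only records of species j at site i.
    Returns the mean, over species with at least one record, of the
    per-record negative log-likelihood under the site distribution.
    """
    total, n_species = 0.0, 0
    for s_j, y_j in zip(scores, counts):
        n_j = sum(y_j)
        if n_j == 0:
            continue  # a species without records contributes nothing
        log_p = log_softmax_over_sites(s_j)
        total += -sum(y * lp for y, lp in zip(y_j, log_p)) / n_j
        n_species += 1
    return total / n_species if n_species else 0.0


# Two species, three sites: scores that match the records give a lower loss.
counts = [[3, 0, 0], [0, 1, 1]]
flat = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
fitted = [[4.0, 0.0, 0.0], [0.0, 3.0, 3.0]]
print(normalised_poisson_loss(flat, counts))
print(normalised_poisson_loss(fitted, counts))
```

With flat scores every site is equally likely, so the loss equals the log of the number of sites; scores that concentrate probability on the sites holding the records reduce it. In a multi-species model of the kind the abstract describes, all rows of `scores` would come from one shared network with one output per species.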
Problem

Research questions and friction points this paper is trying to address.

Species Distribution Prediction
Imbalanced Citizen Science Data
Lack of Absence Data
Innovation

Methods, ideas, or system contributions that make the work stand out.

DeepMaxent
Neural Networks
Maximum Entropy Principle
Maxime Ryckewaert
Inria, Univ Montpellier, Montpellier, France
Diego Marcos
Junior Professor at Inria, Montpellier
Machine Learning, Remote Sensing
Christophe Botella
Inria, Univ Montpellier, Montpellier, France
Maximilien Servajean
LIRMM - UPVM
machine learning, ecology, data science
P. Bonnet
AMAP, Univ Montpellier, CIRAD, CNRS, INRAE, IRD, Montpellier, France
Alexis Joly
Research Director, Inria, Montpellier University, LIRMM
machine learning, biodiversity, information retrieval, plant identification