🤖 AI Summary
This study addresses a limitation of conventional heuristic thresholding in multi-species distribution modeling: converting probabilistic predictions into binary presence-absence maps often distorts estimates of species prevalence and community composition. To overcome this, the authors propose MaxExp, a decision-driven binarization framework that selects a species assemblage by directly maximizing a chosen evaluation metric, without requiring calibration data. They also develop the Set Size Expectation (SSE) method, a computationally efficient alternative that predicts community composition based on expected species richness. Evaluated on three case studies covering diverse taxa, species counts, and performance metrics, MaxExp consistently matches or outperforms widely used thresholding and calibration methods, particularly under strong class imbalance and high rarity, while SSE remains a simpler yet competitive option.
📝 Abstract
Species distribution models (SDMs) commonly produce probabilistic occurrence predictions that must be converted into binary presence-absence maps for ecological inference and conservation planning. However, this binarization step is typically heuristic and can substantially distort estimates of species prevalence and community composition. We present MaxExp, a decision-driven binarization framework that selects the most probable species assemblage by directly maximizing a chosen evaluation metric. MaxExp requires no calibration data and can be used with several evaluation metrics. We also introduce the Set Size Expectation (SSE) method, a computationally efficient alternative that predicts assemblages based on expected species richness. Using three case studies spanning diverse taxa, species counts, and performance metrics, we show that MaxExp consistently matches or surpasses widely used thresholding and calibration methods, especially under strong class imbalance and high rarity. SSE offers a simpler yet competitive option. Together, these methods provide robust, reproducible tools for multispecies SDM binarization.
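To make the two ideas concrete, here is a minimal sketch for a single site. It is not the authors' implementation. The abstract does not give the exact rules, so the sketch rests on several assumptions: the SSE-style function keeps the k most probable species, with k equal to the rounded sum of predicted probabilities (the expected richness); the MaxExp-style function uses F1 as the metric, treats species as independent, considers only "top-k" candidate sets, and estimates the expected score by Monte Carlo sampling. The function names `sse_assemblage` and `expected_f1_assemblage` are hypothetical.

```python
import random


def sse_assemblage(probs):
    """SSE-style sketch (assumed rule): keep the k most probable species,
    where k is the expected richness, i.e. the rounded sum of probabilities."""
    k = int(sum(probs) + 0.5)  # round expected richness to the nearest integer
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    chosen = set(order[:k])
    return [i in chosen for i in range(len(probs))]


def expected_f1_assemblage(probs, n_draws=2000, seed=0):
    """MaxExp-style sketch (assumptions: F1 metric, independent species,
    top-k candidate sets only, Monte Carlo estimate of the expected score)."""
    rng = random.Random(seed)
    n = len(probs)
    order = sorted(range(n), key=lambda i: -probs[i])
    totals = [0.0] * (n + 1)  # totals[k] accumulates the F1 of the top-k set

    for _ in range(n_draws):
        # Simulate one plausible "true" assemblage from the predicted probabilities.
        present = [rng.random() < p for p in probs]
        size = sum(present)
        tp = 0  # true positives among the top-k species, updated as k grows
        for k in range(n + 1):
            if k > 0:
                tp += present[order[k - 1]]
            denom = k + size
            # F1 = 2*TP / (predicted size + true size); an empty prediction
            # of an empty assemblage is scored as perfect.
            totals[k] += 2.0 * tp / denom if denom > 0 else 1.0

    best_k = max(range(n + 1), key=lambda k: totals[k])
    chosen = set(order[:best_k])
    return [i in chosen for i in range(n)]


# Hypothetical predicted occurrence probabilities for five species at one site.
site_probs = [0.9, 0.8, 0.1, 0.05, 0.05]
print(sse_assemblage(site_probs))
print(expected_f1_assemblage(site_probs))
```

Neither function needs a per-species threshold or calibration data: the size of the predicted assemblage follows from the probabilities themselves. Swapping F1 for another metric would only change the scoring line inside the inner loop.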