🤖 AI Summary
To address the high storage and computation costs and the technical barriers that hinder deep learning adoption in macroecology, this paper proposes a climate implicit encoding paradigm and introduces a lightweight climate implicit embedding model for global ecological tasks. The model employs an implicit neural representation (INR)-based spatiotemporal geocoder that directly generates climate representations for arbitrary spatiotemporal coordinates, eliminating the need to download raw raster data or train dedicated feature extractors. It follows a pretraining plus linear probing framework, which keeps it compatible with heterogeneous climate variables and diverse downstream ecological tasks. Experiments show that linear probing of the pretrained embeddings matches or exceeds models trained from scratch on biome classification, species distribution modeling, and plant trait regression, and overall outperforms alternative geolocation encoding approaches. The approach reduces storage requirements by approximately three orders of magnitude, lowering the entry barrier for ecologists without deep learning expertise.
📝 Abstract
Deep learning on climatic data holds potential for macroecological applications. However, its adoption remains limited among scientists outside the deep learning community due to storage, compute, and technical expertise barriers. To address this, we introduce Climplicit, a spatio-temporal geolocation encoder pretrained to generate implicit climatic representations anywhere on Earth. By bypassing the need to download raw climatic rasters and train feature extractors, our model uses 1000× less disk space and significantly reduces computational needs for downstream tasks. We evaluate our Climplicit embeddings on biome classification, species distribution modeling, and plant trait regression. We find that linear probing of our Climplicit embeddings consistently performs better than or on par with training a model from scratch on downstream tasks, and overall better than alternative geolocation encoding models.
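To make the linear-probing workflow concrete, the following is a minimal sketch under stated assumptions, not the Climplicit implementation. `FrozenEncoder` is a hypothetical stand-in with fixed random weights (a real pretrained encoder would take its place), the periodic coordinate encoding and the ridge-regression probe are illustrative choices, and the data are synthetic.

```python
import numpy as np


def encode_coords(lon_deg, lat_deg, month):
    """Map (longitude, latitude, month) to a periodic 6-dimensional input."""
    lon = np.deg2rad(np.asarray(lon_deg, dtype=float))
    lat = np.deg2rad(np.asarray(lat_deg, dtype=float))
    # Months 1..12 are placed on a circle so that December is next to January.
    t = 2.0 * np.pi * (np.asarray(month, dtype=float) - 1.0) / 12.0
    return np.stack(
        [np.sin(lon), np.cos(lon), np.sin(lat), np.cos(lat), np.sin(t), np.cos(t)],
        axis=-1,
    )


class FrozenEncoder:
    """Hypothetical stand-in for a pretrained location encoder.

    The weights are random and never updated; they only illustrate the
    'frozen embedding' interface, not any learned climate representation.
    """

    def __init__(self, dim=64, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(size=(6, dim))
        self.b = rng.uniform(0.0, 2.0 * np.pi, size=dim)

    def __call__(self, lon_deg, lat_deg, month):
        return np.sin(encode_coords(lon_deg, lat_deg, month) @ self.w + self.b)


def add_bias(emb):
    """Append a constant column so the probe can learn an intercept."""
    return np.hstack([emb, np.ones((emb.shape[0], 1))])


def fit_linear_probe(emb, y, l2=1e-3):
    """Closed-form ridge regression on frozen embeddings (the 'linear probe')."""
    x = add_bias(emb)
    gram = x.T @ x + l2 * np.eye(x.shape[1])
    return np.linalg.solve(gram, x.T @ y)


def predict(emb, weights):
    """Apply the fitted linear probe to embeddings."""
    return add_bias(emb) @ weights


# Synthetic example: 500 random locations and months, with a made-up target.
rng = np.random.default_rng(1)
lon = rng.uniform(-180.0, 180.0, 500)
lat = rng.uniform(-90.0, 90.0, 500)
month = rng.integers(1, 13, 500)
target = np.cos(np.deg2rad(lat))  # synthetic, not real ecological data

encoder = FrozenEncoder()
embeddings = encoder(lon, lat, month)          # no raster download involved
weights = fit_linear_probe(embeddings, target)  # only this step is "trained"
predictions = predict(embeddings, weights)
```

In this setup the downstream user only fits the small linear layer; the encoder is queried with coordinates and is never retrained, which is where the storage and compute savings described above would come from.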