Video Compression for Spatiotemporal Earth System Data

📅 2025-06-24
🤖 AI Summary
To address the high storage and transmission costs of large-scale Earth system data, such as high-resolution remote sensing imagery and spatiotemporal climate model outputs, this paper proposes an efficient lossless/lossy compression framework built on standardized video coding. It maps multidimensional spatiotemporal data to video format and, through the xarrayvideo library, systematically integrates mainstream video codecs (e.g., H.265/HEVC) into the geoscience data processing ecosystem for the first time. The framework accepts xarray-based multidimensional arrays as input and produces cloud-native TACO-format outputs. Evaluated on four real-world datasets, it achieves PSNRs of 40.60–55.86 dB at 0.1 bpppb and 54.28–65.91 dB at 1 bpppb. DeepExtremeCubes and DynamicEarthNet are compressed by about 8.5× and 61.8×, respectively, with no degradation in downstream deep learning task performance. Overall, the framework attains up to 250× compression while preserving high fidelity and machine-learning readiness.
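
The mapping from a multidimensional array to video frames can be illustrated with a minimal NumPy sketch. It is not xarrayvideo's implementation: the function names, the per-band min-max scaling, and the grouping of bands into 3-channel frames are all assumptions made for illustration. It only shows the general idea of quantizing a (time, y, x, band) cube so that a standard codec can consume it.

```python
import numpy as np

def to_video_frames(cube, bits=8):
    """Quantize a (time, y, x, band) float cube into 3-channel integer frames.

    Hypothetical helper: returns frames shaped (group, time, y, x, 3) plus the
    per-band minimum and maximum needed to undo the scaling.
    """
    t, h, w, b = cube.shape
    lo = cube.min(axis=(0, 1, 2))
    hi = cube.max(axis=(0, 1, 2))
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero on flat bands
    levels = 2**bits - 1
    dtype = np.uint8 if bits <= 8 else np.uint16
    q = np.rint((cube - lo) / span * levels).astype(dtype)
    pad = (-b) % 3  # zero-pad so the band count is a multiple of 3
    if pad:
        q = np.concatenate([q, np.zeros((t, h, w, pad), dtype=dtype)], axis=-1)
    groups = q.reshape(t, h, w, -1, 3)  # consecutive bands form one "RGB" triplet
    return np.moveaxis(groups, 3, 0), lo, hi

def from_video_frames(frames, lo, hi, bits=8):
    """Invert to_video_frames, dropping the padding bands."""
    g, t, h, w, _ = frames.shape
    q = np.moveaxis(frames, 0, 3).reshape(t, h, w, g * 3)[..., : lo.size]
    span = np.where(hi > lo, hi - lo, 1.0)
    return q.astype(np.float64) / (2**bits - 1) * span + lo
```

Each group of frames could then be handed to a video encoder as an ordinary RGB sequence. In this sketch the quantization step alone bounds the reconstruction error at half a quantization level per band, before any codec loss is added.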

📝 Abstract
Large-scale Earth system datasets, from high-resolution remote sensing imagery to spatiotemporal climate model outputs, exhibit characteristics analogous to those of standard videos. Their inherent spatial, temporal, and spectral redundancies can thus be readily exploited by established video compression techniques. Here, we present xarrayvideo, a Python library for compressing multichannel spatiotemporal datasets by encoding them as videos. Our approach achieves compression ratios of up to 250x while maintaining high fidelity by leveraging standard, well-optimized video codecs through ffmpeg. We demonstrate the library's effectiveness on four real-world multichannel spatiotemporal datasets: DynamicEarthNet (very-high-resolution Planet images), DeepExtremeCubes (high-resolution Sentinel-2 images), ERA5 (weather reanalysis data), and the SimpleS2 dataset (high-resolution multichannel Sentinel-2 images), achieving Peak Signal-to-Noise Ratios (PSNRs) of 55.86, 40.60, 46.58, and 43.23 dB at 0.1 bits per pixel per band (bpppb) and 65.91, 54.28, 62.90, and 55.04 dB at 1 bpppb. We are redistributing two of these datasets, DeepExtremeCubes (2.3 TB) and DynamicEarthNet (525 GB), in the machine-learning-ready and cloud-ready TACO format through HuggingFace at significantly reduced sizes (270 GB and 8.5 GB, respectively) without compromising quality (PSNRs of 55.77–56.65 dB and 60.15 dB, respectively). No performance loss is observed when the compressed versions of these datasets are used in their respective deep learning-based downstream tasks (next-step reflectance prediction and land cover segmentation). In conclusion, xarrayvideo presents an efficient solution for handling the rapidly growing size of Earth observation datasets, making advanced compression techniques accessible and practical to the Earth science community. The library is available at https://github.com/IPL-UV/xarrayvideo
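
The metrics quoted in the abstract can be made concrete with a short sketch. The function names are hypothetical, and the PSNR shown is the textbook definition for a given data range. The paper may compute it differently (for example per band or with a different peak value), so treat this as an assumption rather than the authors' exact procedure.

```python
import numpy as np

def psnr(original, reconstructed, data_range):
    """Textbook PSNR in dB: 10 * log10(data_range^2 / MSE)."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")  # identical inputs
    return 10.0 * np.log10(data_range**2 / mse)

def bpppb(compressed_bytes, n_frames, height, width, n_bands):
    """Bits per pixel per band: compressed bits spread over every stored sample."""
    return 8.0 * compressed_bytes / (n_frames * height * width * n_bands)

def compression_ratio(original_size, compressed_size):
    """Ratio of original to compressed size (both in the same unit)."""
    return original_size / compressed_size

# Dataset sizes quoted in the abstract, expressed in a common unit.
print(round(compression_ratio(2300, 270), 1))  # DeepExtremeCubes
print(round(compression_ratio(525, 8.5), 1))   # DynamicEarthNet
```

With the sizes from the abstract, the two ratios round to 8.5 and 61.8, which matches the figures given in the AI summary.
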
Problem

Research questions and friction points this paper is trying to address.

Compressing large Earth system datasets efficiently
Reducing storage size without losing data quality
Applying video compression to spatiotemporal climate data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Encodes multichannel spatiotemporal Earth system data as video
Leverages standard, well-optimized video codecs through ffmpeg
Achieves compression ratios of up to 250x while maintaining high fidelity
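
As a rough illustration of the ffmpeg step, the sketch below builds a command line that reads raw RGB frames from standard input and encodes them with H.265. The helper name, the default codec and CRF value, and the output path are assumptions chosen for the example. This is not how xarrayvideo invokes ffmpeg internally, only one conventional way to pipe raw frames into a standard codec.

```python
def build_ffmpeg_command(width, height, fps, output_path, codec="libx265", crf=23):
    """Return an ffmpeg argument list for encoding raw rgb24 frames read from stdin.

    Hypothetical helper for illustration; all flags are standard ffmpeg options.
    """
    return [
        "ffmpeg", "-y",              # overwrite the output file if it exists
        "-f", "rawvideo",            # input is headerless raw video
        "-pix_fmt", "rgb24",         # 3 channels, 8 bits each
        "-s", f"{width}x{height}",   # frame size must be given for raw input
        "-r", str(fps),              # input frame rate
        "-i", "-",                   # read frames from standard input
        "-c:v", codec,               # video encoder
        "-crf", str(crf),            # constant-quality setting (lower = higher quality)
        output_path,
    ]

cmd = build_ffmpeg_command(64, 48, 10, "out.mkv")
# With ffmpeg installed, uint8 frames shaped (time, 48, 64, 3) could be encoded via
# subprocess.run(cmd, input=frames.tobytes(), check=True).
```

Raising or lowering the CRF value trades file size against fidelity, which is the kind of rate-quality trade-off the bpppb and PSNR figures in the abstract describe.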