Kriging and Gaussian Process Interpolation for Georeferenced Data Augmentation

📅 2025-01-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Sparse occurrence records and limited georeferenced data hinder accurate spatial distribution modeling of *Commelina benghalensis* L. in sugarcane fields on Réunion Island. Method: We propose a spatial interpolation–based data augmentation framework that systematically compares Gaussian process regression (GPR) with RBF, Matérn, and a novel composite kernel (GP-COMB) against ordinary kriging with multiple variogram models. Spatial cross-validation is used to assess generalization performance. Contribution/Results: This study provides the first quantitative comparison of GPR and kriging for agricultural weed mapping, explicitly characterizing the trade-off between prediction accuracy and spatial consistency. GP-COMB improves the performance of downstream regression models while requiring fewer added samples, whereas kriging yields more spatially uniform synthetic samples at slightly lower average accuracy. The proposed framework establishes a reproducible, spatially aware data augmentation paradigm for small-sample geographic and ecological modeling.
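The summary does not say how the spatial cross-validation folds are built. As a rough illustration only, one common approach is to cut the study area into square grid blocks and hold out whole blocks, so that test points are not immediate neighbours of training points. The function name `spatial_block_folds`, the block size, and the synthetic coordinates below are all assumptions made for this sketch, not details from the paper.

```python
import numpy as np

def spatial_block_folds(coords, block_size, n_folds, seed=0):
    """Assign each point to a CV fold via the grid block that contains it.

    Hypothetical helper: grid blocking is one possible spatial CV scheme,
    not necessarily the one used in the paper.
    """
    rng = np.random.default_rng(seed)
    # Integer (column, row) index of the grid block containing each point.
    cells = np.floor((coords - coords.min(axis=0)) / block_size).astype(int)
    # One integer label per distinct occupied block.
    _, block_id = np.unique(cells, axis=0, return_inverse=True)
    block_id = block_id.ravel()
    n_blocks = block_id.max() + 1
    # Shuffle the blocks, then deal them round-robin into folds.
    fold_of_block = np.empty(n_blocks, dtype=int)
    fold_of_block[rng.permutation(n_blocks)] = np.arange(n_blocks) % n_folds
    return fold_of_block[block_id], block_id

# Synthetic coordinates standing in for georeferenced plots (illustrative only).
rng = np.random.default_rng(42)
coords = rng.uniform(0.0, 10.0, size=(200, 2))
folds, blocks = spatial_block_folds(coords, block_size=2.5, n_folds=4)

for k in range(4):
    test_mask = folds == k
    print(f"fold {k}: {test_mask.sum()} test points, {(~test_mask).sum()} train points")
```

Because every point in a block shares the block's fold, a model is always evaluated on areas it has not seen, which is the property that separates spatial from random cross-validation.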

📝 Abstract

Data augmentation is a crucial step in the development of robust supervised learning models, especially when dealing with limited datasets. This study explores interpolation techniques for the augmentation of geo-referenced data, with the aim of predicting the presence of Commelina benghalensis L. in sugarcane plots in La Réunion. Given the spatial nature of the data and the high cost of data collection, we evaluated two interpolation approaches: Gaussian processes (GPs) with different kernels and kriging with various variograms. The objectives of this work are threefold: (i) to identify which interpolation methods offer the best predictive performance for various regression algorithms, (ii) to analyze the evolution of performance as a function of the number of observations added, and (iii) to assess the spatial consistency of augmented datasets. The results show that GP-based methods, in particular with combined kernels (GP-COMB), significantly improve the performance of regression algorithms while requiring less additional data. Although kriging shows slightly lower performance, it is distinguished by a more homogeneous spatial coverage, a potential advantage in certain contexts.
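To make the idea of interpolation-based augmentation concrete, here is a minimal NumPy sketch of Gaussian process interpolation with a combined kernel. The abstract does not specify how GP-COMB is composed, so the sum of an RBF and a Matérn (ν = 3/2) kernel used here is an assumption, as are the fixed length scales, the noise level, the function names, and the synthetic data. A real workflow would also fit the kernel hyperparameters rather than fixing them.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def combined_kernel(a, b, ls_rbf=2.0, ls_matern=0.5):
    """RBF + Matern(nu=3/2) kernel; an assumed stand-in for GP-COMB."""
    d = pairwise_dist(a, b)
    rbf = np.exp(-0.5 * (d / ls_rbf) ** 2)
    s = np.sqrt(3.0) * d / ls_matern
    matern32 = (1.0 + s) * np.exp(-s)
    return rbf + matern32

def gp_predict(X_train, y_train, X_new, noise=1e-4):
    """GP posterior mean and standard deviation at X_new (fixed hyperparameters)."""
    y_mean = y_train.mean()
    K = combined_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = combined_kernel(X_new, X_train)
    L = np.linalg.cholesky(K)
    # Solve K alpha = (y - mean) through the Cholesky factor.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    mean = K_s @ alpha + y_mean
    v = np.linalg.solve(L, K_s.T)
    var = np.diag(combined_kernel(X_new, X_new)) - (v ** 2).sum(axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

# Synthetic observations standing in for georeferenced measurements.
rng = np.random.default_rng(0)
X_obs = rng.uniform(0.0, 10.0, size=(30, 2))
y_obs = np.sin(X_obs[:, 0]) + np.cos(X_obs[:, 1])

# Interpolate at new locations and append the predictions as synthetic samples.
X_new = rng.uniform(0.0, 10.0, size=(15, 2))
y_new, std_new = gp_predict(X_obs, y_obs, X_new)
X_aug = np.vstack([X_obs, X_new])
y_aug = np.concatenate([y_obs, y_new])
print(X_aug.shape, y_aug.shape)
```

The predictive standard deviation `std_new` is a useful by-product: synthetic points with high uncertainty could be filtered out before training a downstream regressor, although whether the study does this is not stated here.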

Problem

Research questions and friction points this paper is trying to address.

Kriging

Gaussian Process Interpolation

Commelina benghalensis Prediction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian Processes

Kriging

Spatial Data Enhancement

🔎 Similar Papers

Data augmentation with automated machine learning: approaches and performance comparison with classical data augmentation methods