On the use of adversarial validation for quantifying dissimilarity in geospatial machine learning prediction

📅 2024-04-19
🏛️ GIScience & Remote Sensing

🤖 AI Summary
This paper addresses the challenge of quantifying the dissimilarity between sample data and prediction locations in geospatial machine learning, a factor that strongly affects the reliability of cross-validation (CV). The authors propose dissimilarity quantification by adversarial validation (DAV), a method that trains a binary classifier in feature space to separate sample data from prediction locations and reports the dissimilarity on a 0–100% scale. Experiments on synthetic and real datasets with gradually increasing dissimilarity show that DAV quantifies dissimilarity across the entire range. The study also compares random CV (RDM-CV) with two geospatial methods, block CV (BLK-CV) and spatial+ CV (SP-CV), and finds a similar pattern in all datasets: RDM-CV gives the most accurate evaluations when dissimilarity is low (usually below 30%); geospatial CV methods, especially SP-CV, become more accurate and even outperform RDM-CV as dissimilarity increases; and no CV method gives accurate evaluations when dissimilarity is high (≥90%). These results can help researchers and practitioners select a CV method suited to their predictions.

📝 Abstract
Recent geospatial machine learning studies have shown that the results of model evaluation via cross-validation (CV) are strongly affected by the dissimilarity between the sample data and the prediction locations. In this paper, we propose a method to quantify this dissimilarity on a scale from 0 to 100% from the perspective of the data feature space. The proposed method is based on adversarial validation, an approach that checks whether sample data and prediction locations can be separated by a binary classifier. We call the method dissimilarity quantification by adversarial validation (DAV). To study the effectiveness and generality of DAV, we tested it in a series of experiments based on both synthetic and real datasets with gradually increasing dissimilarities. The results show that DAV effectively quantified dissimilarity across the entire range of values. In addition, we studied how dissimilarity affects the evaluations of CV methods by comparing the results of the random CV method (RDM-CV) with those of two geospatial CV methods, namely block CV and spatial+ CV (BLK-CV and SP-CV). Our results show that the evaluations follow similar patterns in all datasets and predictions: when dissimilarity is low (usually lower than 30%), RDM-CV provides the most accurate evaluation results. As dissimilarity increases, geospatial CV methods, especially SP-CV, become increasingly accurate and even outperform RDM-CV. When dissimilarity is high (≥90%), no CV method provides accurate evaluations. These results show the importance of considering feature space dissimilarity when working with geospatial machine learning predictions and can help researchers and practitioners select more suitable CV methods for evaluating their predictions.
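
The adversarial validation idea described in the abstract can be illustrated with a small, hedged sketch. It is not the authors' implementation: the abstract does not say which classifier DAV uses or how the classifier's performance is mapped to the 0–100% scale. The sketch below assumes a leave-one-out k-nearest-neighbour classifier and a linear mapping from AUC (0.5 maps to 0%, 1.0 maps to 100%); the function names and the toy data are hypothetical.

```python
import math
import random


def knn_scores(points, labels, k=15):
    """Leave-one-out k-NN: for each point, the fraction of its k nearest
    other points that come from the prediction set (label 1)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), labels[j]) for j, q in enumerate(points) if j != i
        )
        neighbours = dists[:k]
        scores.append(sum(lab for _, lab in neighbours) / len(neighbours))
    return scores


def auc(scores, labels):
    """Probability that a random label-1 point scores higher than a random
    label-0 point; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def dissimilarity_percent(sample_features, prediction_features, k=15):
    """Adversarial validation: label sample data 0 and prediction locations 1,
    then measure how well a classifier separates them in feature space."""
    points = list(sample_features) + list(prediction_features)
    labels = [0] * len(sample_features) + [1] * len(prediction_features)
    a = auc(knn_scores(points, labels, k), labels)
    # ASSUMPTION: linear AUC-to-percent mapping, not taken from the paper.
    return 100.0 * max(0.0, 2.0 * a - 1.0)


# Toy feature spaces (hypothetical data): 2-D Gaussian clouds.
rng = random.Random(0)


def cloud(center, n=150):
    return [(rng.gauss(center, 1.0), rng.gauss(center, 1.0)) for _ in range(n)]


sample = cloud(0.0)
low = dissimilarity_percent(sample, cloud(0.0))   # same distribution
high = dissimilarity_percent(sample, cloud(6.0))  # far-away distribution
print(f"similar: {low:.1f}%  distant: {high:.1f}%")
```

When the two sets come from the same distribution the classifier cannot do better than chance, so the score stays near the bottom of the scale; when they barely overlap the classifier separates them almost perfectly and the score approaches 100%.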
Problem

Research questions and friction points this paper is trying to address.

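Quantify the feature-space dissimilarity between sample data and prediction locations in geospatial machine learning.
Evaluate how this dissimilarity affects the accuracy of CV methods (RDM-CV, BLK-CV, and SP-CV).
Assess how well adversarial validation quantifies dissimilarity on synthetic and real datasets.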
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adversarial validation used to measure dissimilarity
DAV, a dissimilarity quantification method on a 0–100% scale
Comparison of random and geospatial CV methods across dissimilarity levels
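
The difference between random and block CV compared in the paper comes down to how samples are assigned to folds. The sketch below is only an illustration under stated assumptions: square blocks of a fixed, hypothetical size on planar coordinates, with whole blocks assigned to folds at random. The paper's actual block construction is not described in this summary, and SP-CV is not shown because its procedure is not given here. All function names are hypothetical.

```python
import math
import random


def random_folds(n_samples, n_folds, seed=0):
    """RDM-CV style: each sample gets a fold at random, ignoring location."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    folds = [0] * n_samples
    for rank, i in enumerate(order):
        folds[i] = rank % n_folds
    return folds


def block_id(x, y, block_size):
    """Index of the square grid cell (block) that contains the point (x, y)."""
    return (math.floor(x / block_size), math.floor(y / block_size))


def block_folds(coords, block_size, n_folds, seed=0):
    """BLK-CV style: whole spatial blocks are assigned to folds, so nearby
    samples are held out together."""
    blocks = sorted({block_id(x, y, block_size) for x, y in coords})
    random.Random(seed).shuffle(blocks)
    fold_of_block = {b: rank % n_folds for rank, b in enumerate(blocks)}
    return [fold_of_block[block_id(x, y, block_size)] for x, y in coords]


# Hypothetical sample locations in a 100 x 100 study area.
rng = random.Random(1)
coords = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(200)]

rdm_folds = random_folds(len(coords), n_folds=4)
blk_folds = block_folds(coords, block_size=25.0, n_folds=4)
```

With random folds, a held-out sample usually has close neighbours in the training folds; with block folds, every sample in a block is held out together, so the held-out data tend to lie farther from the training data.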
Yanwen Wang
Faculty of Geo-Information Science and Earth Observation (ITC), University of Twente, 7522NH Enschede, the Netherlands
Mahdi Khodadadzadeh
Faculty of Geo-Information Science and Earth Observation (ITC), University of Twente, 7522NH Enschede, the Netherlands
Raúl Zurita-Milla
Faculty of Geo-Information Science and Earth Observation (ITC), University of Twente, 7522NH Enschede, the Netherlands