🤖 AI Summary
Archaeological predictive modeling frequently suffers from sampling bias caused by spatially uneven survey intensity, which undermines model reliability. To address this, the authors propose a bias correction framework based on Shannon entropy: it first quantifies regional survey coverage, then weights pseudo-absence data in proportion to that coverage to reduce model bias in under-surveyed areas. The framework combines entropy-driven weighting with interpretable, bias-aware modeling techniques, including Bayesian spatial logistic regression (R-INLA), generalized additive models, MaxEnt, and random forests. The evaluation reports improved predictive accuracy and robustness in sparsely surveyed regions, and the weighting scheme applies across all four model families. The work is presented as a general, information-theoretic correction for systematic sampling bias in archaeological spatial prediction.
📝 Abstract
Predictive modeling in archaeology is essential for understanding past human behavior and for guiding heritage conservation. However, spatial sampling bias caused by uneven research effort can severely limit model reliability. This research describes a novel framework that integrates entropy-based corrections to measure and minimize such biases in archaeological predictive modeling. Using open-access data from the Grand Staircase-Escalante National Monument, we employ Shannon entropy to quantify survey coverage and assign corresponding weights to pseudo-absence points. We combine these weights with predictive models such as Bayesian Spatial Logistic Regression (via R-INLA), Generalized Additive Models, Maximum Entropy (MaxEnt), and Random Forests. Our findings indicate that entropy-aware models exhibit improved accuracy and robustness, especially in under-surveyed regions. This approach not only advances methodological transparency but also improves the interpretation of archaeological predictions under data uncertainty. The proposed framework offers a scalable, theoretically grounded strategy for addressing spatial bias in archaeological datasets.
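The abstract does not give the exact formulas, so the following is only a minimal sketch of how entropy-based pseudo-absence weighting could work, not the authors' implementation. It assumes each region is split into sub-cells with a recorded survey effort count, uses the normalized Shannon entropy of that effort as the region's coverage score, and applies a small floor so no weight reaches zero. The function names, the `floor` parameter, and the example regions are all hypothetical.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (in nats) of the distribution implied by non-negative counts."""
    total = sum(counts)
    if total <= 0:
        return 0.0  # no survey effort at all: treat as zero information
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in probs)


def normalized_entropy(counts):
    """Entropy divided by its maximum, log(k); 1.0 means perfectly even effort."""
    k = len(counts)
    if k < 2:
        return 0.0
    return shannon_entropy(counts) / math.log(k)


def pseudo_absence_weight(effort_counts, floor=0.05):
    """Assumed rule: weight equals the coverage score, bounded below by `floor`.

    Evenly surveyed regions get weights near 1, so their pseudo-absences are
    trusted; unevenly or never surveyed regions are down-weighted.
    """
    return max(floor, normalized_entropy(effort_counts))


# Hypothetical survey effort per sub-cell for three illustrative regions.
regions = {
    "plateau": [12, 11, 13, 12],  # evenly surveyed
    "canyon": [40, 1, 0, 0],      # effort concentrated in one sub-cell
    "bench": [0, 0, 0, 0],        # never surveyed
}

weights = {name: pseudo_absence_weight(effort) for name, effort in regions.items()}

for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

Weights of this kind could then be supplied as per-observation weights when fitting a model, for example through the `sample_weight` argument that scikit-learn estimators such as `RandomForestClassifier.fit` accept. One limitation of this simplified score is that it measures how evenly effort is spread, not how much effort there was, so a region surveyed evenly but very thinly would still score high. A fuller version would likely combine it with total survey intensity.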