🤖 AI Summary
To balance interpretability and discriminative power in spectral indices for vegetation classification from remote sensing data, this paper proposes a method for automatically constructing concise, illumination-invariant polynomial spectral indices. Starting from the normalized difference as the basic building block, the method applies polynomial expansion to generate physically meaningful candidate features. It then combines ANOVA-based filtering, recursive feature elimination, and L1-regularized SVM to select a sparse, statistically driven feature set. This is described as the first deep integration of polynomial index construction with data-driven feature selection. The resulting minimal indices (e.g., a product of two normalized differences built from red-edge bands such as b5 and b6) are derived exclusively from interactions among Sentinel-2 bands b4–b8. In the Kochia identification task, a single optimized index achieves 96.26% accuracy, while a set of eight indices attains 97.70%. All indices are lightweight and can be deployed directly on Google Earth Engine.
📝 Abstract
We introduce an automated way to find compact spectral indices for vegetation classification. The idea is to take all pairwise normalized differences from the spectral bands and then build polynomial combinations up to a fixed degree, which gives a structured search space that still keeps the illumination invariance needed in remote sensing. For a sensor with $n$ bands this produces $\binom{n}{2}$ base normalized differences, and the degree-2 polynomial expansion gives 1,080 candidate features for the 10-band Sentinel-2 configuration we use here. Feature selection methods (ANOVA filtering, recursive elimination, and $L_1$-regularized SVM) then pick out small sets of indices that reach the desired accuracy, so the final models stay simple and easy to interpret. We test the framework on Kochia (\textit{Bassia scoparia}) detection using Sentinel-2 imagery from Saskatchewan, Canada ($N = 2{,}318$ samples, 2022--2024). A single degree-2 index, the product of two normalized differences from the red-edge bands, already reaches 96.26% accuracy, and using eight indices only raises this to 97.70%. In every case the chosen features are degree-2 products built from bands $b_4$ through $b_8$, which suggests that the discriminative signal comes from spectral \emph{interactions} rather than individual band ratios. Because the indices involve only simple arithmetic, they can be deployed directly in platforms like Google Earth Engine. The same approach works for other sensors and classification tasks, and an open-source implementation (\texttt{ndindex}) is available.
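The candidate count quoted in the abstract can be checked with a short sketch. It is not the `ndindex` implementation: the function names, the zero-based band positions, and the sample reflectance values are all hypothetical. It assumes the degree-2 expansion consists of every single normalized difference plus every unordered product of two of them (squares included), which is one counting that reproduces the stated 1,080 features for 10 bands. It also shows why such products are illumination-invariant: scaling all bands by a common factor leaves each normalized difference unchanged.

```python
from itertools import combinations, combinations_with_replacement


def normalized_difference(a, b):
    """Return (a - b) / (a + b); a common scale factor on a and b cancels."""
    return (a - b) / (a + b)


def candidate_indices(n_bands, degree=2):
    """Enumerate candidate indices as tuples of band pairs.

    A tuple such as ((4, 5), (5, 6)) stands for ND(b4, b5) * ND(b5, b6).
    Assumption: monomials of degree 1..degree, no constant term.
    """
    pairs = list(combinations(range(n_bands), 2))
    candidates = []
    for d in range(1, degree + 1):
        candidates.extend(combinations_with_replacement(pairs, d))
    return pairs, candidates


def evaluate(index, reflectance):
    """Multiply together the normalized differences listed in `index`."""
    value = 1.0
    for i, j in index:
        value *= normalized_difference(reflectance[i], reflectance[j])
    return value


pairs, candidates = candidate_indices(n_bands=10, degree=2)
print(len(pairs), len(candidates))  # 45 1080

# Hypothetical reflectances for one pixel (10 bands), then the same pixel
# under half the illumination.
pixel = [0.03, 0.05, 0.04, 0.10, 0.25, 0.30, 0.32, 0.33, 0.20, 0.12]
dimmed = [0.5 * r for r in pixel]

index = ((4, 5), (5, 6))  # a degree-2 product of two normalized differences
print(abs(evaluate(index, pixel) - evaluate(index, dimmed)) < 1e-12)  # True
```

With 10 bands there are 45 base normalized differences, giving 45 degree-1 terms and 1,035 degree-2 products. A feature selector would then be applied to the resulting 1,080-column feature matrix.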