In Submission
This study is currently going through public release review and publisher submission. The estimated publication date is February 2025.
Code
The accompanying codebase for this study can be found at: Samplify
Abstract
Twenty-four years after Friedl et al. first identified the challenges of developing and assessing remote sensing models with small datasets, one key issue persists: the misuse of random sampling to generate training and testing data. This practice often introduces a high degree of correlation between the sets, leading to an overestimation of model generalizability. Despite the early recognition of this problem, few researchers have investigated its nuances or developed effective sampling techniques to address it. Our survey highlights that mitigation strategies to reduce this bias remain underutilized in practice, distorting the interpretation and comparison of results across the field. In this work, we introduce a set of desirable characteristics to evaluate sampling algorithms, with a primary focus on their tendency to induce correlation between training and test data, while also accounting for other relevant factors. Using these characteristics, we survey 146 articles, identify 16 unique sampling algorithms, and evaluate them. Our evaluation reveals two broad archetypes of sampling techniques that effectively mitigate correlation and are suitable for model development.
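To make the correlation concern in the abstract concrete, the following is a minimal sketch and is not taken from the Samplify codebase. It compares a per-pixel random split with a block-based split on a synthetic image grid. For each split it reports the fraction of test pixels that directly border a training pixel. Every function name, the grid size, the block size, and the adjacency measure itself are assumptions made for illustration. Adjacency is only a crude proxy for spatial correlation, and the block-based split is one common alternative, not necessarily one of the algorithms evaluated in the study.

```python
import numpy as np


def random_split(height, width, test_fraction, rng):
    """Assign each pixel to the test set independently at random (True = test)."""
    return rng.random((height, width)) < test_fraction


def block_split(height, width, block, test_fraction, rng):
    """Assign whole block x block tiles to the test set (True = test)."""
    tiles_h = -(-height // block)  # ceiling division
    tiles_w = -(-width // block)
    tile_is_test = rng.random((tiles_h, tiles_w)) < test_fraction
    # Expand each tile decision to pixel resolution, then crop to the grid.
    mask = np.repeat(np.repeat(tile_is_test, block, axis=0), block, axis=1)
    return mask[:height, :width]


def touching_fraction(test_mask):
    """Fraction of test pixels with at least one 4-connected training neighbour."""
    train = ~test_mask
    # Pad with False so pixels outside the grid never count as training data.
    padded = np.pad(train, 1, constant_values=False)
    has_train_neighbour = (
        padded[:-2, 1:-1]    # neighbour above
        | padded[2:, 1:-1]   # neighbour below
        | padded[1:-1, :-2]  # neighbour to the left
        | padded[1:-1, 2:]   # neighbour to the right
    )
    n_test = int(test_mask.sum())
    if n_test == 0:
        return 0.0
    return float((has_train_neighbour & test_mask).sum() / n_test)


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # fixed seed, illustrative only
    random_mask = random_split(100, 100, 0.3, rng)
    block_mask = block_split(100, 100, 10, 0.3, rng)
    print("random split:", touching_fraction(random_mask))
    print("block split: ", touching_fraction(block_mask))
```

Under these assumptions, nearly every test pixel in the random split sits next to a training pixel, whereas in the block split only pixels on tile borders can. This is the kind of train/test proximity that can inflate apparent generalizability.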
Disclaimer
The views expressed in this research are those of the authors and do not reflect the official policy or position of the United States government, the Department of Defense, or the U.S. Air Force.