A machine learning technique for spatial interpolation of solar
radiation observations
Abstract
This paper applies statistical methods to interpolate missing values in
a dataset of radiative energy fluxes at the Earth's surface. We fit
Random Forest (RF) and seven conventional spatial interpolation models
to a global Surface Solar Radiation (SSR) dataset, using three
categories of predictors: climatic, spatial, and time series variables.
Although climatic variables are the most commonly used in research, our
study shows that the spatial and time series variables are the best
suited to predicting the response. In fact, the best neighboring
variable is almost 40 times better than the best climatic variable in
predicting
SSR. Furthermore, our analysis shows that RF achieves a Mean Absolute
Error (MAE) of 10.2 on average, with a standard deviation of 1.5,
whereas the conventional methods have an average MAE of 21.3, with an
average standard deviation of 6.4. This highlights the benefits of
using machine learning in
environmental research.
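
As a minimal sketch of the error measure reported above, the following Python snippet computes the Mean Absolute Error and then summarizes a set of MAE values by their mean and standard deviation. The function names and all numbers are hypothetical illustrations, not taken from the study, and the sketch assumes that the reported average and standard deviation are computed over a collection of per-run MAE values, which the abstract does not specify.

```python
from statistics import mean, stdev


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def summarize(mae_values):
    """Mean and sample standard deviation of a list of MAE values."""
    return mean(mae_values), stdev(mae_values)


# Hypothetical toy values (assumption: not data from the paper).
observed = [180.0, 200.0, 150.0, 220.0]
predicted = [175.0, 210.0, 155.0, 200.0]
mae = mean_absolute_error(observed, predicted)

# Hypothetical MAE values from three separate evaluation runs.
avg_mae, sd_mae = summarize([9.0, 10.0, 11.0])
```

A lower average MAE indicates predictions that are closer to the observations, and a lower standard deviation indicates that the error is more stable across runs.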