
High-resolution mapping of SOC at different subsidence stages in high groundwater level mining areas using machine learning and model fusion techniques
  • Lingtong Meng,
  • Xiangyu Min,
  • Shuzhen Cao,
  • Z Hu,
  • Kexin Liu,
  • Haoyu Wang,
  • Qian Sun,
  • Ju LI,
  • Dongyun Xu
Lingtong Meng
China University of Mining and Technology School of Environment Science and Spatial Informatics
Xiangyu Min
Shandong Agricultural University
Shuzhen Cao
Yantai Transportation Service Center
Z Hu
China University of Mining and Technology School of Environment Science and Spatial Informatics
Kexin Liu
Shandong Agricultural University
Haoyu Wang
Shandong Agricultural University
Qian Sun
Shandong Agricultural University
Ju LI
Shandong Agricultural University
Dongyun Xu
Shandong Agricultural University

Corresponding Author: [email protected]


Abstract

Accurately estimating the spatial distribution of soil organic carbon (SOC) in coal mining regions is crucial for soil quality restoration and for understanding global carbon cycling. Because the mechanisms influencing SOC in coal mining areas are complex, research on dynamic, high-precision digital analysis of SOC content before and after subsidence and reclamation in high groundwater mining sites remains limited. In this study, we employed four machine learning algorithms, namely Cubist, Random Forest (RF), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost), together with a model fusion technique to analyze SOC content across three subsidence stages in high groundwater mining areas: control land (CL), subsided land (SL), and reclaimed land (RL). By integrating high-resolution imagery from China’s GF-1 satellite, we generated a predictive map of surface SOC content. Additionally, we used an optimal parameter-based geographical detector (OPGD) model to quantitatively identify the key factors driving SOC spatial variation within the study area. Our results indicate that the fusion model combining RF and Cubist outperformed the others, achieving a coefficient of determination (R²) of 0.73, a root mean square error (RMSE) of 0.73 g/kg, and a ratio of performance to interquartile distance (RPIQ) of 2.50. The predictive map shows that high SOC concentrations in the mining area occur predominantly in reclaimed land. Organism-related factors emerged as the strongest explanatory variables for SOC content in these areas and constituted the most important group of predictors in our model development. This cost-effective, high-efficiency approach offers valuable insights for SOC research and informs strategies for soil remediation in mining-affected lands.
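The three accuracy metrics named in the abstract (R², RMSE, RPIQ) and the idea of fusing two models can be illustrated with a minimal sketch. The abstract does not state how the RF and Cubist predictions were fused, so the weighted average in `fuse` is an assumption made only for illustration, as are all function names and the toy numbers, which are not data from the study.

```python
import math
import statistics


def rmse(observed, predicted):
    """Root mean square error, in the same units as the observations (e.g. g/kg)."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rpiq(observed, predicted):
    """Ratio of performance to interquartile distance: IQR of observations / RMSE."""
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    return (q3 - q1) / rmse(observed, predicted)


def fuse(pred_a, pred_b, weight_a=0.5):
    """Hypothetical fusion rule (assumption): weighted average of two models' predictions."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


# Toy SOC values in g/kg, invented for illustration only.
observed = [10.0, 12.0, 14.0, 16.0, 18.0]
pred_model_a = [11.0, 13.0, 15.0, 17.0, 19.0]  # stands in for one base model
pred_model_b = [9.0, 11.0, 13.0, 15.0, 17.0]   # stands in for the other base model

fused = fuse(pred_model_a, pred_model_b, weight_a=0.5)

print("Model A  R2:", r_squared(observed, pred_model_a))
print("Model A  RMSE:", rmse(observed, pred_model_a))
print("Model A  RPIQ:", rpiq(observed, pred_model_a))
print("Fused    RMSE:", rmse(observed, fused))
```

In this toy case the two base models err in opposite directions, so averaging cancels their errors; real fused models would show a smaller gain that depends on how correlated the base models' errors are.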