
Sayantan Majumdar

and 11 more

High-resolution mapping and monitoring of global inland surface water bodies are critical to addressing challenges in sustainable water management practices. Planet currently operates the largest constellation of Earth Observation satellites and collects images at very high spatial (0.5 m to 5 m) and temporal (near-daily) resolutions. Here, we use PlanetScope data (resampled to 3 m) to develop a holistic and fully automated pipeline running on the Google Cloud Platform for monitoring global inland surface water. We incorporate the openly available Global Reservoir and Dam (GRanD) data set into a three-stage supervised learning approach that begins with an unsupervised label-generation step consisting of k-means clustering and near-infrared (NIR)-based thresholding. We then rank the labels generated from these steps and the water labels extracted from the latest ESRI 10 m land cover data based on image contours. The best (noisy) labels, those with the fewest contours from this unsupervised learning stage, are bootstrapped to train a deep-learning-based semantic segmentation model (U-Net) on a Kubeflow pipeline. We subsequently create a new refined dataset by using these model predictions as labels, which are passed to a Stochastic Gradient Descent (SGD)-based multi-temporal supervised label refinement stage (an SGD classifier running on the same label for multiple input scenes). Finally, we iterate over the SGD-based supervised and U-Net-based label refinement steps to successively denoise the bootstrapped data until we obtain an acceptable test accuracy (F1 score > 0.9). Visual inspection of the results obtained over different climatic regions, terrains, and seasons across the globe shows that our approach works well. We also aggregate these predictions to detect temporal changes in surface water area. However, the model predictions exhibit high uncertainty in agricultural areas and complex terrains characterized by hill shadows and clouds.
This issue could potentially be mitigated using hard-negative mining. Nevertheless, with the near-daily imaging capability of Planet, the high-fidelity surface water maps developed using this proposed supervised learning approach could be beneficial to the global water community for dealing with water security issues as part of the UN Sustainable Development Goals.
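
The label-ranking idea in this abstract can be illustrated with a minimal sketch. It assumes that the number of 4-connected water regions is an acceptable proxy for the contour count, and that water pixels can be flagged by low NIR reflectance. The function names, the threshold value, and the toy arrays are hypothetical and are not taken from the authors' pipeline.

```python
import numpy as np

def nir_threshold_labels(nir, threshold):
    """Water absorbs NIR strongly, so low NIR reflectance is labelled water."""
    return np.asarray(nir, dtype=float) < threshold

def count_regions(mask):
    """Count 4-connected foreground regions with an iterative flood fill.

    Used here as a simple stand-in for counting image contours.
    """
    mask = np.asarray(mask, dtype=bool)
    seen = np.zeros(mask.shape, dtype=bool)
    rows, cols = mask.shape
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r, c] or seen[r, c]:
                continue
            regions += 1
            seen[r, c] = True
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        stack.append((ny, nx))
    return regions

def pick_least_fragmented(candidates):
    """Return the name of the candidate mask with the fewest regions."""
    return min(candidates, key=lambda name: count_regions(candidates[name]))

# Toy 3x3 scene: a dark (low NIR) water body in the top-left corner.
nir = np.array([[0.05, 0.05, 0.40],
                [0.05, 0.05, 0.40],
                [0.40, 0.40, 0.40]])
nir_mask = nir_threshold_labels(nir, threshold=0.1)  # hypothetical threshold

# A hypothetical noisy candidate, e.g. from a poor clustering run.
noisy_mask = np.array([[1, 0, 1],
                       [0, 0, 0],
                       [1, 0, 1]], dtype=bool)

best = pick_least_fragmented({"kmeans": noisy_mask, "nir": nir_mask})
print(best)
```

A production version would more likely use a dedicated contour or connected-component routine from an image-processing library; the flood fill is used here only to keep the sketch self-contained.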

Sayantan Majumdar

and 4 more

Groundwater is the largest source of Earth’s liquid freshwater and plays a critical role in global food security. With the rising global demand for drinking water and increased agricultural production, overuse of groundwater resources is a major concern. Because groundwater withdrawals are not monitored in most regions with the highest use, methods are needed to monitor withdrawals at a scale suitable for implementing sustainable management practices. In this study, we combine publicly available datasets into a machine learning framework for estimating groundwater withdrawals over the state of Arizona. This extends a previous study in which we estimated groundwater withdrawals in Kansas, where the climatic conditions and aquifer characteristics are significantly different. Datasets used in our model include energy-balance (SSEBop) and crop coefficient evapotranspiration estimates, precipitation (PRISM), land use (USDA-NASS Cropland Data Layer), and a watershed stress metric. Random forests, a widely used machine learning algorithm, are employed to predict groundwater withdrawals from 2002-2018 at 5 km spatial resolution. We used in-situ groundwater withdrawals available over the Arizona Active Management Area (AMA) and Irrigation Non-Expansion Area (INA), with 2002-2010 for training and 2011-2018 for validating the model. The results show high training (R² ≈ 0.98) and good testing (R² ≈ 0.82) scores with low normalized mean absolute error ≈ 0.28 and root mean square error ≈ 1.28 for the AMA/INA region. Using this method, we are able to spatially extend estimates of groundwater withdrawals to the whole state of Arizona. We also observed that land subsidence in Arizona occurs predominantly in areas with high yearly groundwater withdrawals of at least 100 mm per unit area.
Our model shows promising results in sub-humid and semi-arid (Kansas) and arid (Arizona) regions, which demonstrates the robustness and extensibility of our integrated approach combining remote sensing and machine learning into a holistic, automated, and fully reproducible workflow. The success of this method indicates that it could be extended to areas with more limited groundwater withdrawal data under different climatic conditions and aquifer properties.
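
The temporal hold-out described in this abstract (earlier years for training, later years for validation) might be sketched as follows. This assumes scikit-learn's `RandomForestRegressor`; the synthetic predictors, the target, and the hyperparameters are placeholders rather than the study's data or settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# Synthetic stand-in data: one row per (pixel, year) sample.
years = np.repeat(np.arange(2002, 2019), 20)      # 2002..2018, 20 samples each
X = rng.uniform(0.0, 1.0, size=(years.size, 3))   # hypothetical predictors
y = 2.0 * X[:, 0] + X[:, 1] + rng.normal(0.0, 0.05, size=years.size)

# Temporal hold-out: 2002-2010 trains the model, 2011-2018 validates it.
train = years <= 2010
test = ~train

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[train], y[train])

train_r2 = r2_score(y[train], model.predict(X[train]))
test_r2 = r2_score(y[test], model.predict(X[test]))
print(f"train R2 = {train_r2:.2f}, test R2 = {test_r2:.2f}")
```

Splitting by year rather than at random keeps every validation sample strictly later than the training period, so the score reflects how the model generalizes to years it has not seen.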

Sayantan Majumdar

and 6 more

In this study, we improve estimates of groundwater usage across the Mississippi Alluvial Plain (MAP) in support of an ongoing USGS effort to model the groundwater resources of the region. Previously, the USGS developed a lookup table based on flowmeter data that estimates water use from the average water use for each crop type, for specific regions and precipitation amounts. The latest iteration of this model is known as the Aquaculture and Irrigation Water-Use Model (AIWUM) 1.1, and we refer to our method as AIWUM 2.0. Here, we apply gradient boosted trees (GBT) to predict groundwater use across the MAP from 2014-2019. The predictor variables include well locations (latitude and longitude), crop type, precipitation, maximum temperature (the average daily maximum from April to September), total evapotranspiration estimated with MOD16, and surface run-off (TerraClimate). The existing flowmeter data over the Mississippi Delta were randomly split into training (80%) and validation (20%) data. The following parameters in the GBT algorithm were tuned: the number of estimators, learning rate, maximum tree depth, the objective function, and the percentage of training data that are randomly sampled in the training process. We observe very little model overfitting: the training error metrics are R² = 0.58, mean absolute error (MAE) = 0.30 ft, and root mean square error (RMSE) = 0.51 ft, and the corresponding test metrics are R² = 0.49, MAE = 0.32 ft, and RMSE = 0.51 ft. This is an improvement over AIWUM 1.1, where the corresponding R², MAE, and RMSE were 0.27, 0.40 ft, and 0.67 ft. These water use estimates will result in an improved ability to accurately model groundwater flow in this aquifer, which accounts for roughly 20% of the total groundwater pumping in the United States.

Sayantan Majumdar

and 3 more

Effective monitoring of groundwater withdrawals is necessary to help mitigate the negative impacts of aquifer depletion. In this study, we develop a holistic approach that combines water balance components with a machine learning model to estimate groundwater withdrawals. We use both multi-temporal satellite and modeled data from sensors that measure different components of the water balance at varying spatial and temporal resolutions. These remote sensing products include evapotranspiration, precipitation, and land cover. Due to the inherent complexity of integrating these data sets and subsequently relating them to groundwater withdrawals using physical models, we apply random forests, a state-of-the-art machine learning algorithm, to overcome such limitations. Here, we predict groundwater withdrawals per unit area over a highly monitored portion of the High Plains aquifer in the central United States at 5 km resolution for the years 2002-2019. Our modeled withdrawals had high accuracy on both training and testing datasets (R² ≈ 0.99 and R² ≈ 0.93, respectively) during leave-one-out (year) cross-validation with low Mean Absolute Error (MAE) ≈ 4.26 mm and Root Mean Square Error (RMSE) ≈ 13.57 mm for the year 2014. Moreover, we found that even for the extreme drought year of 2012, we have a satisfactory test score (R² ≈ 0.79) with MAE ≈ 10.34 mm and RMSE ≈ 27.04 mm. Therefore, the proposed hybrid water balance and machine learning approach can be applied to similar regions for proactive water management practices.
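
Leave-one-out (year) cross-validation, as mentioned in this abstract, holds out all samples from one year at a time and trains on the remaining years. A minimal sketch follows, assuming scikit-learn's `LeaveOneGroupOut` with the year as the grouping variable. The data and model settings are synthetic placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(1)

# Synthetic stand-in data: 15 samples for each year from 2002 to 2019.
years = np.repeat(np.arange(2002, 2020), 15)
X = rng.uniform(0.0, 1.0, size=(years.size, 3))   # hypothetical predictors
y = 3.0 * X[:, 0] + X[:, 2] + rng.normal(0.0, 0.05, size=years.size)

# Each fold holds out every sample from exactly one year.
scores = {}
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=years):
    held_out_year = int(years[test_idx][0])
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    scores[held_out_year] = r2_score(y[test_idx], model.predict(X[test_idx]))

for year, score in sorted(scores.items()):
    print(year, round(score, 2))
```

Reporting the score per held-out year makes it possible to single out unusual years, such as a drought year, in the way the abstract does.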

Sayantan Majumdar

and 3 more

Groundwater plays a crucial role in sustaining global food security but is being over-exploited in many basins of the world. Despite its importance and finite availability, local-scale monitoring of groundwater withdrawals required for sustainable water management practices is not carried out in most countries, including the United States. In this study, we combine publicly available datasets into a machine learning framework for estimating groundwater withdrawals over the state of Arizona. Here we include evapotranspiration, precipitation, crop coefficients, land use, well density, and watershed stress metrics for our predictions. We employ random forests to predict groundwater withdrawals from 2002-2020 at a 2 km spatial resolution using in-situ groundwater withdrawal data available for Arizona Active Management Areas (AMA) and Irrigation Non-Expansion Areas (INA), with 2002-2009 for training and 2010-2020 for validating the model. The results show high training (R² ≈ 0.86) and good testing (R² ≈ 0.69) scores with normalized mean absolute error (NMAE) ≈ 0.64 and normalized root mean square error (NRMSE) ≈ 2.36 for the AMA/INA region. Using this method, we spatially extrapolate the existing groundwater withdrawal estimates to the entire state and observe the co-occurrence of groundwater withdrawals and land subsidence in South-Central and Southern Arizona. Our model predicts groundwater withdrawals in regions where production wells are present on agricultural lands and subsidence is observed from Interferometric Synthetic Aperture Radar (InSAR), but withdrawals are not monitored. By performing a comparative analysis over these regions using the predicted groundwater withdrawals and InSAR-based land subsidence estimates, we observe a varying degree of subsidence for similar volumes of withdrawals in different basins.
The performance of our model on validation datasets and its favorable comparison with independent water use proxies such as InSAR demonstrate the effectiveness and extensibility of our combined remote sensing and machine learning-based approach.
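
The abstract reports NMAE and NRMSE but does not state how they are normalized. The sketch below assumes normalization by the mean of the observed values, which is one common convention; other conventions, such as dividing by the observed range, would give different numbers. The function names and toy values are illustrative only.

```python
import numpy as np

def nmae(y_true, y_pred):
    """Mean absolute error divided by the mean observed value (assumed convention)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true)) / np.mean(y_true))

def nrmse(y_true, y_pred):
    """Root mean square error divided by the mean observed value (assumed convention)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)) / np.mean(y_true))

# Toy example: two exact predictions and one large miss.
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 6.0]

print(nmae(observed, predicted))   # MAE = 1.0, mean = 2.0
print(nrmse(observed, predicted))  # RMSE = sqrt(3), mean = 2.0
```

Under this convention both metrics are unitless, and values above 1 are possible when typical errors exceed the mean observed value. NRMSE is never smaller than NMAE and is more sensitive to a few large errors.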