Because of its substantial role in the Earth’s biogeochemical cycles and in human health, nitrogen is recognized as one of the major water quality indicators under Sustainable Development Goal 6.3.2. Quantifying the potential impacts of nitrogen at large spatial scales remains a grand challenge because distributed, physically based global models are computationally demanding and require intensive data for calibration and validation. The former prevents a comprehensive analysis of the full spectrum of model behavior under different conditions, and the latter undermines the reliability of model-based inference. To tackle this problem, we developed a data-driven model using a spatio-temporal Random Forest algorithm to predict nitrogen levels worldwide at 0.5-degree spatial resolution from 1992 to 2010. Variables representing livestock, climate, hydrology, topography, and other factors were selected as predictors. The response variable was nitrate–nitrite, which is associated with a high risk of infant methemoglobinemia. Our results indicate that changes in nitrogen concentration are mainly driven by cattle and sheep populations, fertilizer application, precipitation, and temperature variability, implying that livestock population, climate change, and anthropogenic forces can be important risk factors for global water quality deterioration. Furthermore, using the predicted nitrogen levels, we characterized large-scale water quality patterns and identified a few major water quality ‘hot spots’. The proposed model can also help assess the potential impacts of future scenarios (e.g., changes in livestock production or land use) on global water quality, supporting the development of effective policy strategies.
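As an illustration of the general approach summarized above, the following minimal sketch fits a Random Forest regressor to gridded predictors and ranks them by importance. It is not the model used in this study: the data are synthetic, the predictor names are hypothetical placeholders, and treating latitude, longitude, and year as ordinary input columns is only one assumed way of giving a Random Forest spatio-temporal information. It uses scikit-learn's `RandomForestRegressor` and its impurity-based `feature_importances_`.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_samples = 2000  # each row stands for one (grid cell, year) pair

# Hypothetical predictor names; the actual predictor set is not listed here.
feature_names = [
    "cattle_density", "sheep_density", "fertilizer", "precipitation",
    "temperature", "elevation", "latitude", "longitude", "year",
]
X = rng.random((n_samples, len(feature_names)))

# Synthetic response (assumption, for illustration only): driven by
# cattle density, fertilizer, and precipitation, plus small noise.
y = (3.0 * X[:, 0] + 2.0 * X[:, 2] - 1.5 * X[:, 3]
     + rng.normal(0.0, 0.1, n_samples))

# Simple hold-out split: first 1500 rows for training, the rest for testing.
X_train, X_test = X[:1500], X[1500:]
y_train, y_test = y[:1500], y[1500:]

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Coefficient of determination on the held-out rows.
r2 = model.score(X_test, y_test)

# Impurity-based importances sum to 1; rank predictors from most to least important.
importances = dict(zip(feature_names, model.feature_importances_))
ranked = sorted(importances, key=importances.get, reverse=True)

print(f"held-out R^2: {r2:.3f}")
for name in ranked:
    print(f"{name:>15s}  {importances[name]:.3f}")
```

With real data, a hold-out scheme that separates whole regions or years may give a more honest estimate of predictive skill than the simple row split assumed here, because neighbouring grid cells and consecutive years tend to be correlated.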