Current machine learning methods for discharge prediction often employ aggregated basin-wide hydrometeorological data (lumped modeling) for parametric and non-parametric training. This approach may overlook the spatial heterogeneity of river systems and its impact on discharge patterns. We hypothesize that integrating spatiotemporal hydrologic knowledge into the data modeling process (distributed/disaggregated modeling) can improve the performance of discharge prediction models. To test this hypothesis, we designed experiments comparing the performance of identical Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) models forced with either lumped or distributed features. We gathered meteorological forcings and static attributes for the Mackenzie basin in Canada, a large and hydrologically unique basin. Importantly, discharge prediction performance is assessed out of sample with k-fold replication across gauges. Results reveal a 9.6% improvement in mean Nash-Sutcliffe Efficiency (NSE) and a 4.6% improvement in mean Kling-Gupta Efficiency (KGE) when the LSTMs are trained with distributed information. Notably, the models exhibit consistently unbiased behavior, with negligible relative bias (RBias ≈ 0.0) across all predictions. These experiments and results demonstrate the importance of integrating topologically guided geomorphologic and hydrologic information (distributed modeling) in data-driven discharge prediction.
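For reference, the evaluation metrics cited above can be written in their conventional forms; this is a sketch assuming the standard definitions of NSE, KGE, and relative bias, and the exact variants used in the study may differ slightly. Here $Q_{o,t}$ and $Q_{s,t}$ denote observed and simulated discharge at time $t$, $\bar{Q}_o$ the mean observed discharge, $r$ the linear correlation between simulations and observations, and $\alpha$ and $\beta$ the ratios of simulated to observed standard deviation and mean, respectively:

\[
\mathrm{NSE} = 1 - \frac{\sum_t \left(Q_{o,t} - Q_{s,t}\right)^2}{\sum_t \left(Q_{o,t} - \bar{Q}_o\right)^2}, \qquad
\mathrm{KGE} = 1 - \sqrt{(r-1)^2 + (\alpha-1)^2 + (\beta-1)^2}, \qquad
\mathrm{RBias} = \frac{\sum_t \left(Q_{s,t} - Q_{o,t}\right)}{\sum_t Q_{o,t}}
\]

Under these definitions, NSE and KGE both equal 1 for a perfect simulation, and RBias ≈ 0 indicates that simulated flow volumes match observed volumes with no systematic over- or under-prediction.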