Abstract
The use of the parameters associated with the “best-fit” criterion to
represent a calibrated hydrological model is inadequate. Furthermore,
assessing the goodness of model calibration or validation based on
performance criteria, such as NSE, R², or PBIAS, is
misleading because they only compare two signals, i.e., measurement and
the best-fit simulation (i.e., simulation with the best objective
function value). The reason is that the calibrated model’s best
objective function value is usually not significantly different from the
next best value or the value after that. This non-uniqueness of the
objective function causes a problem because the best solution’s
parameters are always significantly different from the next best
parameters. Therefore, only using the best simulation parameters as the
calibrated model’s sole parameters to interpret the watershed processes
or perform further model analyses could lead to erroneous results.
Furthermore, most watersheds are increasingly changing due to human
activities. The lack of pristine watersheds makes the task of
watershed-scale calibration increasingly challenging. Subjective
thresholds of acceptable performance criteria suggested by some
researchers to rate the goodness of calibration are based on the
comparison of the two signals, and in most cases, the thresholds are not
achievable. Hence, to obtain a satisfactory fit, researchers and
practitioners are forced to massage and manipulate the input or
simulated data, compromising the science behind their work. This article
discusses the fallacy in using the “best-fit” solution in hydrologic
modeling. It introduces a two-factor statistic to assess the goodness
of calibration/validation while taking model output uncertainty into
account.
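
As a minimal illustration of what "comparing two signals" means for the criteria named above, the sketch below computes NSE, PBIAS, and R² for a single observed series and a single simulated series. The definitions are the commonly used ones (the sign convention for PBIAS varies between sources; observed minus simulated is assumed here), and the four-point series are made-up values chosen only for illustration, not data from this article.

```python
# Common two-signal performance criteria, written with the standard library only.
# Assumptions: PBIAS uses (observed - simulated); R2 is the squared Pearson
# correlation; the example series below are invented for illustration.

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread

def pbias(obs, sim):
    """Percent bias: 100 * sum(obs - sim) / sum(obs)."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)

def r2(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)

# Hypothetical observed and simulated values (e.g., discharge), illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.5, 2.5, 3.5, 4.5]

print(f"NSE   = {nse(obs, sim):.2f}")
print(f"PBIAS = {pbias(obs, sim):.2f} %")
print(f"R2    = {r2(obs, sim):.2f}")
```

Each function takes exactly one simulated series, so none of these values carries any information about how many other parameter sets would score nearly the same, or about the uncertainty in the model output.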