2.1 Search methodology
We reviewed journal papers published in the last two decades (2001–2021) that use MLSMs for WDSs and UDSs. We established two main search criteria: surrogate modelling and water networks. Since both topics are known by several names, each was represented by a set of keywords. For surrogate modelling, the search terms were: “Surrogate model*”, “Metamodel*”, “Response surface”, “model emulation”, and “hybrid model”. For water networks, the search terms covered both water distribution and drainage systems, along with popular software for their analysis: “Water distribution”, “Water supply”, “Drinking water”, “Urban drainage”, “Wastewater”, “Sewer”, “Sewerage”, “EPANET”, “WaterCAD”, “SWMM”, “SOBEK”, and “Urban water”.
We ran the search in the Scopus database. Intersecting the two sets of search terms yielded an initial set of 64 papers, which we then filtered to include only ML applications, leaving 31 papers to review. Next, we searched the citations of the selected papers and of other relevant papers in the field (i.e., Maier et al., 2014; Maier & Dandy, 2000; Razavi et al., 2012b) for further references. This step found no additional papers, since the original set already contained the cited ones; the final results are therefore equivalent to those of the keyword search. This supports the thoroughness of the original search and makes the methodology more replicable by avoiding arbitrarily selected papers.
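To make the intersection of the two keyword sets concrete, the following minimal Python sketch assembles a boolean query string of the form (surrogate terms joined by OR) AND (water network terms joined by OR), restricted to 2001-2021. The exact query submitted to the database is not reported here, so the field codes `TITLE-ABS-KEY` and `PUBYEAR`, the quoting of each term, and the helper names `or_group` and `build_query` are assumptions made for illustration only. The restriction to journal papers and the subsequent filtering for ML applications are not encoded.

```python
# Keyword sets as listed in the search methodology.
SURROGATE_TERMS = [
    "Surrogate model*", "Metamodel*", "Response surface",
    "model emulation", "hybrid model",
]
WATER_TERMS = [
    "Water distribution", "Water supply", "Drinking water", "Urban drainage",
    "Wastewater", "Sewer", "Sewerage", "EPANET", "WaterCAD", "SWMM",
    "SOBEK", "Urban water",
]


def or_group(terms):
    """Join the terms into a parenthesised OR group, quoting each term."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(surrogate_terms, water_terms, start_year=2001, end_year=2021):
    """Intersect the two OR groups and restrict the publication years.

    Assumption: the search targets title, abstract, and keywords
    (TITLE-ABS-KEY) and bounds the years with PUBYEAR; the original
    query may have been configured differently.
    """
    topic = (
        f"TITLE-ABS-KEY({or_group(surrogate_terms)} "
        f"AND {or_group(water_terms)})"
    )
    # Strict inequalities, so shift the bounds by one year on each side.
    years = f"PUBYEAR > {start_year - 1} AND PUBYEAR < {end_year + 1}"
    return f"{topic} AND {years}"


query = build_query(SURROGATE_TERMS, WATER_TERMS)
print(query)
```

Keeping the two keyword lists as data and generating the query from them makes it straightforward to rerun or extend the search, for example by adding a synonym to one list without touching the rest of the query.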
This list of papers may not be exhaustive, since some studies do not use the formal terminology of surrogate modelling, as indicated by Razavi et al. (2012b). Nevertheless, the purpose of this paper is to depict the recent state of the art, identify gaps in knowledge, and propose future research directions. We believe that the selected set of papers is sufficient to achieve this goal.