At a smaller evaluation budget, we observe that U-NSGA-III consistently maintains a large pool of near-optimal solutions, as the bright region lies nearer to the PF, while reporting a lower mean HV than qNEHVI in Figures 8a) and e). Figure 8b) for the Thin Film problem also corroborates our finding that qNEHVI proposes many non-optimal solutions, as seen by the bright region away from the PF, which indicates a higher probability of solutions occurring there.
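For reference, the HV values compared above can be computed with pymoo's HV indicator, as in the minimal sketch below; the objective values and reference point are illustrative placeholders, not data from our runs.

```python
import numpy as np
from pymoo.indicators.hv import HV

# Hypothetical final objective values (minimisation) from two optimiser runs.
F_unsga3 = np.array([[0.2, 0.8], [0.5, 0.5], [0.8, 0.2]])
F_qnehvi = np.array([[0.1, 0.9], [0.4, 0.45], [0.9, 0.1]])

# The reference point must be dominated by (worse than) every observed
# point; the value here is purely illustrative.
hv = HV(ref_point=np.array([1.1, 1.1]))

print("U-NSGA-III HV:", hv(F_unsga3))
print("qNEHVI HV:    ", hv(F_qnehvi))
```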
Interestingly, in Figure 8d) for the Concrete Slump problem, we observe that qNEHVI consistently converges to a specific region of objective space, while the U-NSGA-III search follows the pattern of Figure 8b), with solutions concentrated in the near-optimal region close to the PF. We hypothesize that qNEHVI's performance on this problem is influenced by how the underlying GP surrogate model learns the function, strongly biasing solutions towards that specific region. We provide further evidence in SI 2, where we illustrate the expected PF given by the GP surrogate model.
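As a sketch of that check (assuming a BoTorch SingleTaskGP surrogate, with synthetic training data standing in for the real observations), the surrogate's expected PF can be obtained by predicting posterior means over candidate inputs and keeping the non-dominated subset:

```python
import torch
from botorch.models import SingleTaskGP
from botorch.models.transforms.outcome import Standardize
from botorch.fit import fit_gpytorch_mll
from botorch.utils.multi_objective.pareto import is_non_dominated
from gpytorch.mlls import ExactMarginalLogLikelihood

# Hypothetical observed data: 20 points in a 3-d input space, 2 objectives.
train_X = torch.rand(20, 3, dtype=torch.double)
train_Y = torch.stack([train_X.sum(-1), (1 - train_X).sum(-1)], dim=-1)

# Fit a GP surrogate to the observations (one output per objective).
model = SingleTaskGP(train_X, train_Y, outcome_transform=Standardize(m=2))
mll = ExactMarginalLogLikelihood(model.likelihood, model)
fit_gpytorch_mll(mll)

# Posterior mean predictions on a dense set of candidate inputs.
cand_X = torch.rand(2048, 3, dtype=torch.double)
with torch.no_grad():
    mean_Y = model.posterior(cand_X).mean

# is_non_dominated assumes maximisation; the surviving points form
# the PF that the surrogate currently "believes" in.
expected_pf = mean_Y[is_non_dominated(mean_Y)]
print(expected_pf.shape)
```

If the surrogate's expected PF collapses onto a narrow region, an acquisition function built on it, such as qNEHVI, will keep proposing candidates there, which is consistent with the clustering seen in Figure 8d).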
In contrast, both problems here indicate that U-NSGA-III benefits more from larger batch sizes, as seen by the green line, which differs from what we observed in Figure 6 for the synthetic problems. Our hypothesis is that the modelled datasets present a more mathematically difficult optimisation problem, with various ‘obstacles’ that inhibit the evolution of solutions towards the PF. We support this by referring to our discussion of Figures 7c) and d) for Concrete Slump regarding local optima, and by noting a sizeable blank region of objective space that U-NSGA-III fails to flesh out in Figure 7a) for the Thin Film problem. Overall, the results reported here suggest that, given state-of-the-art implementations in HT experiments, a small batch size with MOBO is the right strategy for rapid convergence; a sketch of the batch-size trade-off follows below.
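To make the batch-size trade-off concrete: under a fixed evaluation budget, a larger batch corresponds to more offspring per generation but fewer generations in total. The sketch below sweeps the batch size for pymoo's U-NSGA-III; the test problem (ZDT1), budget, and batch sizes are placeholders, not the settings of our study.

```python
import numpy as np
from pymoo.algorithms.moo.unsga3 import UNSGA3
from pymoo.optimize import minimize
from pymoo.problems import get_problem
from pymoo.util.ref_dirs import get_reference_directions
from pymoo.indicators.hv import HV

problem = get_problem("zdt1")  # stand-in for the modelled datasets
ref_dirs = get_reference_directions("das-dennis", 2, n_partitions=6)
hv = HV(ref_point=np.array([1.1, 1.1]))
budget = 960  # illustrative total evaluation budget

# Batch size == evaluations per generation; fewer generations at larger batch.
for batch in (8, 24, 48):
    algo = UNSGA3(ref_dirs=ref_dirs, pop_size=batch)
    res = minimize(problem, algo,
                   termination=("n_evals", budget), seed=1, verbose=False)
    print(f"batch={batch:3d}  HV={hv(res.F):.4f}")
```

On rugged landscapes with local optima, the larger population per generation maintains more diversity, which is one plausible mechanism for the batch-size behaviour observed here.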