We propose a framework for assessing monthly GCM precipitation and temperature simulations, with the aim of obtaining robust annual and seasonal climate projections. The approach is based on a Past Performance Index (PPI), inspired by the Kling-Gupta Efficiency (KGE), that accounts for climatological averages, interannual variability, the seasonal cycle, the monthly probability distribution and the spatial patterns of climatological means. The PPI formulation is flexible enough to include additional evaluation metrics and to weight them differently, enabling the diagnosis and classification of GCMs in a simple diagram that shows joint performance for precipitation and temperature. We demonstrate the utility of this approach by evaluating 27 CMIP6 models and constraining the spread of projections in five regions with very different climates across continental Chile. We also examine the degree of correspondence between the ensemble of models classified as ‘satisfactory’ based on the PPI and the ability of GCMs to reproduce teleconnection responses to the El Niño-Southern Oscillation and the Southern Annular Mode. The results show that the approach is useful for identifying models that do not reproduce the seasonal precipitation cycle and for narrowing the spread of projected annual and seasonal changes. The best models according to the PPI do not necessarily coincide with those that replicate historically observed teleconnections, suggesting that the latter criterion complements our GCM assessment framework. Finally, we show that model features that can be improved through bias correction can be excluded from the model evaluation process to avoid culling models that reproduce historically observed teleconnections.
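
For orientation, the standard KGE that inspires the PPI combines three components into a single Euclidean distance from the ideal point. The first expression below is the usual KGE definition. The second is only an illustrative sketch of how a KGE-like index could take additional, differently weighted metrics: the symbols $m_i$, $m_i^{*}$, $w_i$ and $N$ are assumptions introduced here and are not taken from the PPI formulation itself.

\[
\mathrm{KGE} = 1 - \sqrt{(r - 1)^2 + (\alpha - 1)^2 + (\beta - 1)^2},
\qquad
\alpha = \frac{\sigma_{\mathrm{sim}}}{\sigma_{\mathrm{obs}}},
\qquad
\beta = \frac{\mu_{\mathrm{sim}}}{\mu_{\mathrm{obs}}}
\]

\[
I = 1 - \sqrt{\sum_{i=1}^{N} w_i \left(m_i - m_i^{*}\right)^2}
\]

Here $r$ is the linear correlation between simulated and observed values, $\sigma$ and $\mu$ are the standard deviation and mean of each series, $m_i$ is a hypothetical evaluation metric with ideal value $m_i^{*}$, and $w_i \ge 0$ is a hypothetical weight. With $N = 3$, unit weights, and $(m_1, m_2, m_3) = (r, \alpha, \beta)$, all with ideal value 1, the sketch reduces to the standard KGE.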