The projected response of the atmospheric circulation to increasing greenhouse gas concentrations is highly uncertain. A primary reason is that the state-of-the-art models used to investigate this response struggle to represent basic features of the midlatitude circulation such as storm tracks, jets and blocking. These biases also degrade predictive skill for dynamically driven fields at climate prediction time scales of seasons to decades. Despite this, physical understanding of the controls on these features and of the drivers of their biases remains limited. Here we investigate a hierarchy of large-ensemble climate reanalysis and hindcast simulations performed with the Norwegian Earth System Model (NorESM). Each ensemble has 30 members and covers 1985-2010. The reanalysis runs used three data-assimilation strategies: SST only; SST plus hydrographic profiles; and SST plus hydrographic profiles plus sea-ice concentration. Assimilation was performed monthly, after which the model ran freely. The reanalyses are compared with both free runs and AMIP-style simulations, with ERA-Interim serving as ground truth. We evaluate the North Pacific and North Atlantic jets in winter and summer, and we identify where the observations lie within the predictive distribution of each ensemble. Results show that the North Atlantic jet is too zonal, extends too far into Europe and is shifted northwards. Observations over virtually the entire North Atlantic sector lie outside the predictive distribution of the ensemble, and performance degrades in the simulations with tighter constraints on the assimilation. By contrast, the North Pacific jet is better represented, with respect to both the pattern and the magnitude of the biases. This is likely due to better-represented teleconnections between the tropical and extratropical Pacific.
Comparison of these ensembles with the AMIP simulations suggests that the errors in the midlatitude circulation reside in the atmospheric component of the model. We also present results from hindcast simulations in which the Norwegian Climate Prediction Model (NorCPM) was initialized at different times of the year and then run forward for 12 months. We discuss the causes and implications of the differing behavior among the ensembles, as well as future prospects.
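The check of where the observations lie within the predictive distribution of an ensemble can be illustrated with a minimal sketch. It is not the analysis code used in this study: the function names `observation_rank` and `outside_envelope`, the array layout (members on the first axis) and the use of the simple ensemble minimum and maximum as the bounds of the distribution are all assumptions made for illustration.

```python
import numpy as np


def observation_rank(ensemble, obs):
    """Count, at each grid point, how many members lie strictly below obs.

    ensemble: array of shape (n_members, ...), e.g. (30, lat, lon)  [assumed layout]
    obs:      array of shape (...), e.g. a reanalysis field on the same grid
    Returns an integer array with values from 0 to n_members.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Broadcast obs against the member axis and count members below it.
    return (ensemble < obs[np.newaxis, ...]).sum(axis=0)


def outside_envelope(ensemble, obs):
    """True where obs falls below the ensemble minimum or above its maximum."""
    ensemble = np.asarray(ensemble, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return (obs < ensemble.min(axis=0)) | (obs > ensemble.max(axis=0))


if __name__ == "__main__":
    # Synthetic stand-in data only: 30 members on a small 4 x 5 grid.
    rng = np.random.default_rng(0)
    members = rng.normal(loc=0.0, scale=1.0, size=(30, 4, 5))
    reference = rng.normal(loc=0.0, scale=1.0, size=(4, 5))
    print(observation_rank(members, reference))
    print(outside_envelope(members, reference).mean())
```

A rank of 0 or of the full member count marks grid points where the reference is at or beyond the edge of the ensemble spread; the fraction of such points over a sector gives a simple summary of how much of that sector lies outside the ensemble.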