Understanding terrestrial ecosystems and their response to anthropogenic climate change requires quantification of land-atmosphere carbon exchange. However, top-down and bottom-up estimates of large-scale land-atmosphere fluxes, including the northern extratropical growing season net flux (GSNF), show significant discrepancies. We develop a data-driven metric for the GSNF using atmospheric carbon dioxide concentration observations collected during the High-Performance Instrumented Airborne Platform for Environmental Research (HIAPER) Pole-to-Pole Observations (HIPPO) and Atmospheric Tomography Mission (ATom) flight campaigns. This aircraft-derived metric is bias-corrected using three independent atmospheric inversion systems. We estimate the northern extratropical GSNF to be 5.7 ± 0.2 Pg C and use it to evaluate net biosphere productivity from the Coupled Model Intercomparison Project phases 5 and 6 (CMIP5 and CMIP6) models. While the model-to-model spread in the GSNF has decreased in CMIP6 models relative to that of the CMIP5 models, the models still disagree on the magnitude and timing of seasonal carbon uptake, with most models underestimating the GSNF and overestimating the length of the growing season relative to the observations. We also use an emergent constraint approach to estimate annual northern extratropical gross primary productivity to be 56 ± 15 Pg C, heterotrophic respiration to be 25 ± 11 Pg C, and net primary productivity to be 28 ± 10 Pg C. The flux inferred from these aircraft observations provides an additional constraint on large-scale, gross fluxes in prognostic Earth system models that may ultimately improve our ability to accurately predict carbon-climate feedbacks.