Seasonal strength of terrestrial net ecosystem CO2 exchange from North America is underestimated in global inverse modeling

We evaluate terrestrial net ecosystem-atmosphere exchange (NEE) of CO2 from nine global inversion systems that inferred fluxes from four CO2 observational sources. We use 98 flights in the central and eastern U


Introduction
Accurate, spatially and temporally resolved carbon flux estimation is essential for improving climate projections and informing carbon management and policy (e.g., Arora et al., 2020; Millar et al., 2017). A thorough knowledge of the biological CO2 fluxes from a variety of ecosystems across different geographic locations facilitates total carbon flux estimation and the establishment of national and state implementation plans (e.g., Pan et al., 2011; Tan et al., 2015; J. B. Miller et al., 2020; Wang et al., 2020; California's Natural and Working Lands (NWL) Implementation Plan: https://ww2.arb.ca.gov/our-work/programs/natural-and-working-lands). Ecosystem carbon-stock inventories and terrestrial biogeochemical models are commonly used to provide biospheric carbon fluxes for policy planning (e.g., Tan et al., 2015; California's NWL Inventory: https://ww2.arb.ca.gov/nwl-inventory). Atmospheric inversion of CO2 mole fraction observations to estimate biospheric CO2 fluxes is an important and complementary avenue for independent evaluation of ecosystem carbon flux estimates (e.g., Ciais et al., 2010; Chevallier, 2021). These methods have benefited from the expansion of long-term atmospheric observing systems, including ground-based, airborne, and space-based platforms (Crisp et al., 2008; Andrews et al., 2014; Sweeney et al., 2015; Karion et al., 2020).
An atmospheric inversion of CO2 mole fractions optimizes CO2 fluxes in such a way that simulated atmospheric CO2 mole fractions agree better with observations (e.g., Rayner et al., 2019). Gridded global CO2 fluxes are available from several multi-year atmospheric inversions, many of which are frequently updated to quantify CO2 surface fluxes (e.g., CarbonTracker, https://www.esrl.noaa.gov/gmd/ccgg/carbontracker/, or the Copernicus Atmosphere Monitoring Service, https://ads.atmosphere.copernicus.eu/cdsapp#!/dataset/cams-global-greenhouse-gas-inversion). The inversion models use prior CO2 flux estimates of different source components, including fossil fuel, biosphere, fire, and ocean. In general, most global inversion systems optimize the magnitude of the land biospheric and oceanic CO2 flux terms while leaving fossil fuel emissions "fixed" to derive the optimal solution. These CO2 flux inversions estimate fluxes across the globe at a variety of spatial resolutions. Accurate regional flux information has the potential to inform policy planning and carbon management. To date, regional flux estimates within global inversions have shown large differences (Peylin et al., 2013; Crowell et al., 2019). Rigorous evaluation of current CO2 flux inversion products in time and space is needed to improve atmospheric inversions to the point of being a sound, verified source of information for regional carbon accounting.
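The optimization idea described above can be illustrated with a toy linear Bayesian inversion. This is only a minimal sketch under stated assumptions, not the algorithm of any OCO-2 v9 MIP system: the transport operator `H`, the covariances `B` and `R`, and all numbers are synthetic and hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_flux, n_obs = 4, 12
# Hypothetical linear transport operator (mole fraction response per unit flux).
H = rng.uniform(0.0, 1.0, size=(n_obs, n_flux))
x_true = np.array([-2.0, 1.0, 0.5, -1.0])            # synthetic "true" fluxes
x_prior = np.zeros(n_flux)                           # prior flux estimate
y_obs = H @ x_true + rng.normal(0.0, 0.05, n_obs)    # synthetic noisy observations

B = np.eye(n_flux) * 4.0       # assumed prior flux error covariance
R = np.eye(n_obs) * 0.05**2    # assumed model-data mismatch covariance

# Analytical posterior mean of the linear-Gaussian problem:
# x_post = x_prior + B H^T (H B H^T + R)^-1 (y_obs - H x_prior)
innovation = y_obs - H @ x_prior
gain = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
x_post = x_prior + gain @ innovation

# The optimized fluxes reproduce the observations better than the prior fluxes.
prior_misfit = np.linalg.norm(y_obs - H @ x_prior)
post_misfit = np.linalg.norm(y_obs - H @ x_post)
```

In this toy setting the posterior misfit cannot exceed the prior misfit, which is the sense in which the simulated mole fractions "agree better with observations".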
Aircraft field campaigns are well suited for regional flux evaluation. They have been deployed in many different regions to investigate CO2 NEE surface fluxes, including the CO2 Budget and Rectification Airborne study over temperate North America (COBRA) (Gerbig et al., 2003), the Arctic-Boreal Vulnerability Experiment in boreal North America (ABoVE) (C. E. Miller et al., 2019), and the Atmospheric Carbon and Transport-America Earth Venture Suborbital mission (ACT-America) (Davis et al., 2021). Several studies have evaluated global CO2 flux inversions using independent aircraft CO2 measurements above the atmospheric boundary layer (ABL), focusing on large domains such as the global or continental scale (Liu & Bowman, 2016; Chevallier et al., 2019; Gaubert et al., 2019; Liu et al., 2021). To date, few studies have evaluated the seasonal and sub-continental estimates of the global CO2 flux inversions. ACT-America is the largest carbon-centric aircraft mission conducted in any midlatitude, continental environment. The multi-seasonal ACT-America campaigns were held in the central and eastern United States (U.S.) during Summer 2016, Winter 2017, Fall 2017, Spring 2018, and Summer 2019 (Davis et al., 2021; Wei et al., 2021). Over 1140 flight hours of data, roughly 45% of which were within the ABL, were collected over the course of 121 research flights distributed across the central and eastern United States. The ACT-America flights sampled CO2 mole fractions from the ABL to the upper free troposphere and were oriented to capture synoptic weather passages typical of each season and region (Pal et al., 2020; Wei et al., 2021). This multi-seasonal, weather-oriented aircraft campaign provides a unique opportunity to assess inverse estimates of regional CO2 NEE.
Global CO2 flux inversions can be based on ground-based CO2 monitoring or satellite-based retrievals of the total column CO2 (XCO2) mole fractions. These observing systems provide complementary temporal and spatial representativeness. The Orbiting Carbon Observatory-2 (OCO-2) satellite was launched in July 2014 and was designed to quantify sources and sinks of CO2 across the globe (Eldering et al., 2017). The OCO-2 v9 model intercomparison project (MIP) (Peiro et al., 2021) produced a suite of multiyear (2015-2019) gridded global CO2 flux inversion products, including the NEE of CO2. The OCO-2 v9 MIP includes 10 global CO2 data assimilation systems and is designed to assimilate both CO2 in situ data and the OCO-2 v9 column CO2 data, individually or collectively. We take advantage of the large spatial coverage and multi-seasonal sampling of ACT-America to evaluate the OCO-2 v9 MIP CO2 NEE of temperate North America by comparing observed ABL CO2 mole fractions to the corresponding CO2 mole fractions simulated using the series of OCO-2 v9 MIP CO2 flux inversion products. We apply two evaluation metrics to quantify the errors in CO2 NEE from commonly used global CO2 inversion systems (applied in the OCO-2 v9 MIP) with respect to the independent airborne observations at sub-continental and ecoregional scales. The results are presented in Section 3, after the description of our data and methods in Section 2. The discussion and conclusions are given in Section 4.

CO2 NEE flux inversion products
The OCO-2 v9 MIP released a suite of ten gridded CO2 flux inversion products at the global scale encompassing the years 2015-2018. The different inversion systems are standardized in the sense that they are required to assimilate the same four sets of atmospheric observations and use the same fossil fuel CO2 emissions as part of the inversion system inputs. The ten global CO2 data assimilation systems are described by Peiro et al. (2021) and Zhang et al. (2021), and some additional information is given in Text S1. The four observational data sources include the CO2 mole fraction measurements from 1) in situ data (IS) compiled in the GLOBALVIEW+ 5.0 (Cooperative Global Atmospheric Data Integration Project, 2019) and NRT v5.1 (CarbonTracker Team, 2019) ObsPack products; 2) the land nadir/land glint (LNLG) retrievals of column-integrated CO2 from OCO-2 v9; 3) OCO-2 ocean glint (OG) v9 retrievals; and 4) a combination of the in situ and satellite data (LNLGOGIS). The suite of multiyear gridded CO2 flux inversions consists of monthly averaged products (https://gml.noaa.gov/ccgg/OCO2v9mip/). In this study, ancillary gridded global CO2 NEE products at 3-hourly resolution from nine members of the OCO-2 v9 MIP (Text S1) were created for the four ACT-America campaign periods (summer 2016, winter 2017, fall 2017, and spring 2018). All models in the OCO-2 v9 MIP were required to use the same fossil fuel inventory, the 2018 version of the Open-source Data Inventory for Anthropogenic CO2 (ODIAC), but were not limited in their choice of biospheric, oceanic, and fire prior fluxes. The prior flux inputs for the biospheric, oceanic, and fire components are listed in Table S2. Overall, these inversion systems use 7 different prior estimates of the NEE of CO2, 6 different prior estimates of the oceanic CO2 fluxes, and 4 different prior estimates of fire CO2 emissions.

Influence functions
We established the source-receptor relationship between CO2 NEE fluxes and atmospheric CO2 enhancement/depletion along the flight tracks using the Lagrangian particle dispersion modeling technique (e.g., Cui et al., 2021). We aggregated the ACT-America CO2 measurements in the ABL, excluding take-off and landing portions, into 10-minute intervals to match the spatial resolution of the transport simulations in the global inversion systems. The ABL determination is described in Pal et al. (2020) and Davis et al. (2021). Each 10-minute interval (roughly 60-70 km at typical flight speeds) is treated as a receptor; we release 1000 particles per receptor and simulate their backward transport for 10 days using FLEXPART v10.4 ("FLEXible PARTicle dispersion model") (Pisso et al., 2019). The FLEXPART model was driven by the ERA-Interim reanalysis data (0.75° × 0.75°, 6-hourly).
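The aggregation of along-track samples into 10-minute receptors can be sketched as below. This is a hedged illustration only: the function name `aggregate_receptors`, the dictionary layout, and the synthetic flight leg are assumptions, and the actual processing (including ABL filtering and removal of take-off and landing portions) is not reproduced here.

```python
import numpy as np

def aggregate_receptors(time_s, lat, lon, co2_ppm, window_s=600.0):
    """Average along-track samples into fixed time windows (default: 10 minutes)."""
    time_s = np.asarray(time_s, dtype=float)
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    co2_ppm = np.asarray(co2_ppm, dtype=float)
    # Integer window index of every sample, counted from the first sample.
    window = np.floor((time_s - time_s[0]) / window_s).astype(int)
    receptors = []
    for w in np.unique(window):
        sel = window == w
        receptors.append({
            "time_s": time_s[sel].mean(),
            "lat": lat[sel].mean(),
            "lon": lon[sel].mean(),      # plain mean; not valid across the date line
            "co2_ppm": co2_ppm[sel].mean(),
            "n_samples": int(sel.sum()),
        })
    return receptors

# Synthetic 30-minute flight leg sampled every 5 s (hypothetical values).
t = np.arange(0.0, 1800.0, 5.0)
receptors = aggregate_receptors(t, 40.0 + 0.0 * t, -90.0 + 0.001 * t, 400.0 + 0.01 * t)
```

Each returned entry would then serve as one release location and time for the backward particle simulation.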

Background values
To determine the background values, we sampled the CO2 mole fraction field at the times and locations where the particle trajectories' 10-day backward simulations terminated. The CO2 mole fraction fields are from the long-term forward simulation of each OCO-2 v9 MIP model with the optimized fluxes from each experiment. The total number of CO2 mole fraction fields used here is 35 (9 models × 4 experiments, minus one because the CSU model did not implement the LNLGOGIS experiment).
Specifically, we use the FLEXPART option that outputs the spatially and temporally resolved sensitivity field (dimensionless, ranging from 0 to 1) of each receptor to the initial conditions, and combine it with the CO2 mole fraction fields when and where particles are terminated to determine the background value for each receptor (Text S1 and Figure S1).

Evaluation metrics
We convolve each CO2 NEE flux product with the influence functions to obtain atmospheric mole fractions along the ACT-America ABL flight tracks and compare them with the enhancement or depletion levels of the NEE-related CO2 mole fractions within the ABL observed by ACT-America. The enhancement/depletion levels of the CO2 mole fractions sampled by ACT-America flights reflect total CO2 influenced by different CO2 sources. The influence of biological sources dominates the aircraft data because the flights were designed to fly over the ecosystems of the central and eastern U.S. We obtain the enhancement/depletion levels of the NEE-related CO2 mole fractions along the flights by subtracting from the total CO2 measurements the portions influenced by fossil fuels, fire, and ocean, as well as the regional background values determined as described in Section 2.3 (Cui et al., 2021).
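A minimal numerical sketch of the convolution and the subtraction is given below. The array shapes, the function names `convolve` and `nee_related_signal`, and all values are illustrative assumptions; the footprint units are assumed to be mole fraction per unit flux so that the product sums directly to ppm.

```python
import numpy as np

def convolve(footprint, flux):
    """Sum footprint * flux over time, latitude, and longitude for each receptor.

    footprint: (n_receptor, n_time, n_lat, n_lon), assumed ppm per unit flux
    flux:      (n_time, n_lat, n_lon), surface flux on the same grid
    """
    return np.einsum("rtyx,tyx->r", footprint, flux)

def nee_related_signal(y_total, y_bkg, y_fossil, y_fire, y_ocean):
    """Observed NEE-related enhancement/depletion after removing other terms."""
    return y_total - y_bkg - y_fossil - y_fire - y_ocean

# Hypothetical example: 2 receptors, 3 time steps, 2 x 2 grid.
footprint = np.full((2, 3, 2, 2), 0.1)
fossil_flux = np.full((3, 2, 2), 0.5)
y_fossil = convolve(footprint, fossil_flux)          # 12 cells * 0.1 * 0.5 each

y_total = np.array([410.0, 405.0])                   # measured CO2 (ppm)
y_bkg = np.array([408.0, 407.0])                     # background (ppm)
y_bio_obs = nee_related_signal(y_total, y_bkg, y_fossil,
                               y_fire=np.zeros(2), y_ocean=np.zeros(2))
```

The same `convolve` call applied to a posterior NEE flux field would give the simulated NEE-related mole fractions that are compared with `y_bio_obs`.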
The influences of fossil fuels, fire, and ocean are calculated by convolving their surface fluxes with the influence functions over the 10-day span. We use the fossil fuel CO2 emission estimates from the ODIAC 2018 emission inventory and fire emissions from the GFED v4.1s wildfire emission inventory for all cases. The ocean CO2 influence is derived from the convolution of the influence function with the monthly averaged posterior oceanic CO2 flux estimates from each experiment of the individual models in the OCO-2 v9 MIP. In this study, we use only the boundary-layer CO2 mole fractions of the ACT-America flights in the evaluation. Numerical estimates in Cui et al. (2021) show that the fire and ocean fluxes make very small contributions to the ABL mole fractions. Fossil fuel sources have a more significant but still moderate impact. Cui et al. (2021) used the root-mean-square error (RMSE) metric (equation 2) to evaluate inversion products of the CarbonTracker model, one of the OCO-2 v9 MIP ensemble members, based on comparisons between the simulated and ACT-America referenced NEE-related CO2 mole fractions. In this study, we apply the RMSE metric to nine models of the OCO-2 v9 MIP. Furthermore, we focus more on the mean bias error (MBE) metric (equation 3) in CO2 mole fraction space to investigate the bias of each inversion case in the OCO-2 v9 MIP.
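$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y^{\mathrm{mod}}_{\mathrm{bio},i} - y^{\mathrm{ACT}}_{\mathrm{bio},i}\right)^{2}} \qquad (2)$$

$$\mathrm{MBE} = \frac{1}{N}\sum_{i=1}^{N}\left(y^{\mathrm{mod}}_{\mathrm{bio},i} - y^{\mathrm{ACT}}_{\mathrm{bio},i}\right) \qquad (3)$$

where $i$ denotes each receptor, and $N$ denotes the number of receptors. More details of $y^{\mathrm{mod}}_{\mathrm{bio},i}$ and $y^{\mathrm{ACT}}_{\mathrm{bio},i}$ are described in Cui et al. (2021).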
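The two metrics can be sketched numerically as follows. The sign convention (simulated minus observed) is an assumption chosen to match the text's description of a positive summer bias as insufficiently negative NEE; the input values are synthetic.

```python
import numpy as np

def rmse(y_mod, y_obs):
    """Root-mean-square error between simulated and observed mole fractions."""
    diff = np.asarray(y_mod, dtype=float) - np.asarray(y_obs, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def mbe(y_mod, y_obs):
    """Mean bias error, assumed here to be simulated minus observed."""
    diff = np.asarray(y_mod, dtype=float) - np.asarray(y_obs, dtype=float)
    return float(np.mean(diff))

# Synthetic NEE-related mole fractions (ppm) at four receptors.
y_mod = np.array([-3.0, -1.0, 2.0, 0.5])
y_obs = np.array([-5.0, -2.0, 1.0, 0.5])

bias = mbe(y_mod, y_obs)     # differences are [2, 1, 1, 0]
error = rmse(y_mod, y_obs)
```

Unlike the RMSE, the MBE keeps the sign of the mismatch, which is why it is the more useful of the two for diagnosing a systematic seasonal over- or underestimate.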

Ecoregion-based evaluation framework
To evaluate fluxes by ecoregion, we group the receptors by ecoregion and calculate the MBE values between the simulated and observed biological CO2 mole fractions for each group. The ecoregion-based MBE analyses are subsets of the overall MBE analysis. We present "zoom-in" maps to investigate the spatial origins of the MBE values and show the maximum MBE value for each ecoregion associated with the corresponding inversion case.
We attribute the receptors along the flight tracks to different ecoregions, taking advantage of the source-receptor relationship obtained from the Lagrangian framework. Specifically, we attribute each receptor to the one ecoregion that contributes the largest influence function for that receptor (Text S1 and Figure S2). We group the segments of CO2 mole fractions along the flight tracks into different ecoregions and apply the MBE analysis to each group to investigate the associated seasonal bias levels across the ecoregions of temperate North America. The overall spatial coverage of the ACT-America influence functions is shown in Cui et al. (2021). We focus on ecoregions 1-9 in this study, which contribute the largest influence on the enhancement/depletion of CO2 mole fractions along the ACT-America ABL flight tracks.

Figure 1 (caption, continued). The open circles denote the IS experiments, and the solid circles denote the LNLG experiments. The TM5 group (CT, OU, and TM5-4DVAR) is colored in red, the GEOS-Chem group (Ames, CMS-Flux, UT, and CSU) is colored in blue, the Baker model is in black, and the CAMS model is in yellow. The pink lines are linear regressions of all cases for each season.

The inversion products from each model are only required to use the same fossil fuel emissions and the same observational datasets, leaving many potential differences among the inversion systems, including prior fluxes, transport, and inversion algorithms. Therefore, some of the performance differences among the inversion systems are caused by differences in these model framework components, which enables only limited diagnosis of the causes of the MBEs. Overall, the TM5-4DVAR model has the best performance across the different seasons. The TM5 group shows the best performance among the transport models, with smaller MBEs than the other transport models across the four seasons. The OCO-2 v9 land nadir/land glint experiment yields an MBE level that is similar to, or better than (e.g., in winter), that of the in situ data experiment. We have used one transport model to create the influence functions that link NEE of CO2 to ABL CO2 mole fractions (see Section 2); thus we compare all of the systems on an equivalent basis. It is possible, however, that a bias in our influence functions contributes to the MBE in Figure 1 and yields incorrect rankings among these inversions.
In summary, we find the NEE of CO2 in central and eastern North America estimated by nearly all of these inversion systems to be positively biased in summer and negatively biased in the other three seasons, with the degree of bias varying across the inversion systems. Therefore, the magnitude of the seasonal cycle of NEE of CO2 across central and eastern temperate North America is likely to be underestimated across the models in the OCO-2 v9 MIP. The overall annual bias from these systems is not clear, since the seasonal flux biases change sign and will cancel out over the course of a year to a degree that cannot be determined from this analysis.
A number of broad patterns emerge when the MBE is evaluated for each ecoregion (Figure 2). During the summertime, we identify large positive biases in Appalachian forests (ecoregion 5), central crops and forests (ecoregion 6), the corn belt (ecoregion 7), and northern crops (ecoregion 8). The UT and Baker-mean models contain many of the peak positive biases across these ecoregions. The TM5-4DVAR model shows the smallest MBE.

The impact of this apparent underestimate in the seasonal cycle of fluxes on annually integrated NEE of CO2 of North America is not clear but deserves additional investigation. It is also possible that this seasonal bias could directly impact, or is indicative of features of these inversions that could impact, NEE estimates in other regions of the globe. The finding that the TM5-based inversions appear on average to have smaller seasonal biases than the GEOS-Chem-based inversions is also potentially consistent with the findings of Schuh et al. (2019). Schuh et al. (2019) suggested that TM5 mixes more vigorously in the vertical than GEOS-Chem. This could lead to TM5-based inversions requiring stronger NEE of CO2 to match ABL CO2 observations, since seasonal fluxes would be diluted within a larger atmospheric mixing volume. Schuh et al. (2019) showed that, globally, these differences in atmospheric mixing led to large differences in inverse estimates of annual NEE of CO2. We suggest that continued understanding of the causes of the biases at sub-continental scales found in this study will enable increased confidence not just in regional, seasonal NEE, but in global, annual NEE estimates.
The background CO2 mole fractions ($C_{\mathrm{bkg}}$) for each receptor are determined by combining the sensitivity of each receptor to the initial condition ($m$, prior to the backward 10 days) and the OCO-2 v9 MIP global optimized CO2 mole fraction fields ($C_{\mathrm{CO_2}}$) (Equation S1).

$$C_{\mathrm{bkg}}(n) = \sum_{i,j,z} m(n,i,j,z)\, C_{\mathrm{CO_2}}(i,j,z) \qquad (\mathrm{S1})$$

where $m$ is the spatially and temporally resolved sensitivity field of the receptors to the initial conditions (dimension: n × i × j × z; n denotes the receptors, and i, j, and z are latitude, longitude, and altitude, respectively), $C_{\mathrm{CO_2}}$ is the corresponding inversion-optimized CO2 mole fraction field (dimension: i × j × z), and $C_{\mathrm{bkg}}$ is the resulting background value for each receptor (dimension: n × 1). An example of the background determination is shown in Figure S1.
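The background determination can be sketched as a weighted sum over the grid, using the dimensions stated above. This is an illustrative reading rather than the authors' code: the function name `background` is hypothetical, and the normalization of each receptor's sensitivities to sum to one is an assumption added so that the result is a weighted average in ppm.

```python
import numpy as np

def background(m, c_field):
    """Sensitivity-weighted background mole fraction per receptor.

    m:       (n, i, j, z) sensitivity of each receptor to the initial conditions
    c_field: (i, j, z) optimized CO2 mole fraction field (ppm)
    """
    # Assumption: normalize so that each receptor's weights sum to one.
    weights = m / m.sum(axis=(1, 2, 3), keepdims=True)
    return np.einsum("nijz,ijz->n", weights, c_field)

# Hypothetical example: 2 receptors on a 1 x 2 x 1 grid.
c_field = np.array([[[400.0], [410.0]]])        # shape (1, 2, 1)
m = np.zeros((2, 1, 2, 1))
m[0, 0, 0, 0] = 1.0                             # receptor 0: all particles end in cell 0
m[1, 0, 0, 0] = 0.5                             # receptor 1: split evenly between cells
m[1, 0, 1, 0] = 0.5
c_bkg = background(m, c_field)
```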
We define 12 ecoregions in the study (Figure S2) and calculate the influence functions for these ecoregions. For each receptor, the ecoregion associated with the largest contribution to the influence functions is tagged as the representative region (Figure S2).
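The tagging rule and the per-ecoregion bias calculation can be sketched as follows. The function names, the 1-based region numbering, the simulated-minus-observed sign convention, and the example numbers are all assumptions for illustration.

```python
import numpy as np

def tag_ecoregion(influence_by_region):
    """Return, per receptor, the 1-based index of the ecoregion with the largest influence.

    influence_by_region: (n_receptor, n_region) influence summed within each ecoregion
    """
    return np.argmax(influence_by_region, axis=1) + 1

def mbe_by_ecoregion(tags, y_mod, y_obs):
    """Mean bias (simulated minus observed, assumed) for each tagged ecoregion."""
    diff = np.asarray(y_mod, dtype=float) - np.asarray(y_obs, dtype=float)
    return {int(r): float(diff[tags == r].mean()) for r in np.unique(tags)}

# Hypothetical example: 3 receptors, 3 ecoregions.
influence = np.array([[0.1, 0.7, 0.2],
                      [0.6, 0.3, 0.1],
                      [0.2, 0.5, 0.3]])
tags = tag_ecoregion(influence)
regional_mbe = mbe_by_ecoregion(tags,
                                y_mod=[1.0, -2.0, 3.0],
                                y_obs=[0.0, -1.0, 1.0])
```

Assigning each receptor to a single dominant ecoregion keeps the groups disjoint, at the cost of ignoring influence from neighboring regions.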
August 18, 2021, 3:16am : X - 3
We calculate the seasonal NEE flux budget (PgC/yr) for the shaded areas (Figure S3) and analyze the relationships between the regional flux strength and the estimates of mean bias error based on the ACT-America aircraft campaigns.
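A regional flux budget of this kind can be sketched by integrating a gridded flux over grid-cell areas inside a region mask. The input units (gC m-2 yr-1), the spherical-Earth radius, the grid, and the function names are assumptions for illustration; the actual products may use different units and grids.

```python
import numpy as np

EARTH_RADIUS_M = 6.371e6  # mean Earth radius, spherical approximation

def cell_areas(lat_edges_deg, lon_edges_deg):
    """Area (m^2) of each latitude-longitude cell on a sphere, shape (n_lat, n_lon)."""
    lat = np.deg2rad(np.asarray(lat_edges_deg, dtype=float))
    lon = np.deg2rad(np.asarray(lon_edges_deg, dtype=float))
    band = np.sin(lat[1:]) - np.sin(lat[:-1])
    width = lon[1:] - lon[:-1]
    return EARTH_RADIUS_M ** 2 * np.outer(band, width)

def regional_budget_pgc_per_yr(flux_gc_m2_yr, areas_m2, mask):
    """Integrate flux (assumed gC m-2 yr-1) over masked cells; 1 PgC = 1e15 gC."""
    return float(np.sum(flux_gc_m2_yr * areas_m2 * mask) / 1e15)

# Hypothetical 10-degree global grid with a uniform flux, masked to all cells.
lat_edges = np.linspace(-90.0, 90.0, 19)
lon_edges = np.linspace(-180.0, 180.0, 37)
areas = cell_areas(lat_edges, lon_edges)
flux = np.full(areas.shape, 100.0)
budget = regional_budget_pgc_per_yr(flux, areas, mask=np.ones(areas.shape))
```

With a regional mask (1 inside the shaded area, 0 outside), the same call would return the budget for that region only.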
The maps of averaged CO2 NEE during the ACT-America campaign months from the inversion products are shown in Figure S4, Figure S5, Figure S6, and Figure S7.
As mentioned in the main text, a suite of gridded global CO2 NEE products at 3-hourly resolution from nine members of the OCO-2 v9 MIP (Table S1) was created for the four ACT-America campaign periods (summer 2016, winter 2017, fall 2017, and spring 2018).

Figure 1
Figure 1 shows seasonal mean bias error (MBE) levels in relation to the seasonal NEE estimates of the OCO-2 v9 MIP members. We focus here on the flux estimates from the in situ ("IS") and the OCO-2 v9 land nadir/land glint ("LNLG") experiments, which Cui et al. (2021) suggest are the most reliable NEE estimates for the central and eastern U.S. We find correlations between the OCO-2 v9 MIP seasonal NEE estimates and the seasonal MBE. The correlation coefficients (p-values) for the four campaigns (summer, winter, fall, and spring) are 0.4 (p=0.15), 0.7 (p=0.001), 0.6 (p=0.009), and 0.5 (p=0.02), respectively. The correlations are statistically significant for the winter, fall, and spring months. Figure 1 shows that posterior estimates of NEE of CO2 are underestimated in the IS and LNLG experiments compared to observations during winter, fall, and spring. Posterior estimates of NEE of CO2 are overestimated (not sufficiently negative) during the summer. The TM5-4DVAR and OU models have the best performance during the winter and fall seasons. The TM5-4DVAR and CT models within the LNLG experiment have the best performance during the summer.
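The relationship between seasonal NEE and MBE across inversion cases can be examined with a Pearson correlation and a linear regression, sketched below on synthetic numbers that are not taken from the study. The function name and inputs are hypothetical; the p-value is not computed here and would follow from the t statistic with n - 2 degrees of freedom using a Student-t distribution.

```python
import numpy as np

def correlation_and_fit(nee, mbe):
    """Pearson r, least-squares slope and intercept, and t statistic for r."""
    nee = np.asarray(nee, dtype=float)
    mbe = np.asarray(mbe, dtype=float)
    r = np.corrcoef(nee, mbe)[0, 1]
    slope, intercept = np.polyfit(nee, mbe, 1)
    n = nee.size
    # t statistic for testing r = 0 with n - 2 degrees of freedom.
    t_stat = r * np.sqrt((n - 2) / (1.0 - r ** 2))
    return float(r), float(slope), float(intercept), float(t_stat)

# Synthetic seasonal NEE (arbitrary units) and MBE (ppm) for five inversion cases.
nee = [-1.0, -0.5, 0.0, 0.5, 1.0]
mbe = [-2.0, -1.5, 0.5, 0.5, 2.5]
r, slope, intercept, t_stat = correlation_and_fit(nee, mbe)
```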

In all seasons (Figure 2), the patterns of ecoregion MBEs change relatively little as a function of the data source used in the inversion. Summer and fall have the largest overall MBEs. The large MBEs are located in the Appalachian forests (ecoregion 5), central crops and forest (ecoregion 6), the corn belt (ecoregion 7), and the northern crops (ecoregion 8). More pronounced MBE levels in both the positive and negative directions are found in the Baker and UT models, which may imply that a smaller model-data-mismatch covariance is specified in these models than in the others. The OU model MBE most often diverges in sign from the other models during the dormant season, and the Ames and CMS-Flux models often have the largest negative MBEs in the dormant seasons, especially when limiting the discussion to the IS and LNLG inversions.

Figure 2. Mean Bias Error (MBE, ppm) for 9 different ecoregions in Central and Eastern Temperate North America. The largest magnitude of MBE for each ecoregion is written onto the cell. A warm color denotes a positive bias, and a cold color denotes a negative bias. The ecoregions are defined in Figure 2. Shaded areas denote no data.

Figure S1. The upper panel shows the background CO2 mole fractions for each receptor

Figure S2. The left panel displays the spatial patterns of ecoregions in Temperate North

The global atmospheric CO2 inversion models are driven by different prior flux components, including fossil fuel, NEE, fire, and ocean fluxes. All models used the same fossil fuel flux products from the ODIAC 2018 version (https://gmao.gsfc.nasa.gov/gmaoftp/sourish/ODIAC/2018/distrib/), and the prior fluxes for the NEE, fire, and ocean components are listed in Table S2.

Table S1. Basic information of the nine global inversion systems evaluated in the study.