A long-standing challenge in studying the global carbon cycle has been understanding the factors that control the inter-annual variation (IAV) of carbon fluxes related to vegetation photosynthesis and respiration, and improving their representation in existing biogeochemical models. Here, we compared an optimality-based mechanistic model and a semi-empirical light use efficiency model to understand how current models can be improved to simulate the IAV of gross primary production (GPP). Both models simulated hourly GPP and were parameterized for (1) each site–year, (2) each site with an additional constraint on IAV (CostIAV), (3) each site, (4) each plant functional type, and (5) globally. We then performed forward runs with the calibrated parameters and evaluated the models at different temporal scales across 198 eddy covariance sites. Both models performed better at the hourly scale than at the annual scale for most sites. The mechanistic model improved substantially when drought stress was explicitly included. Most of the variability in model performance was attributable to model type and parameterization strategy. The semi-empirical model produced statistically better hourly simulations than the mechanistic model, and site–year parameterization yielded better annual performance for both models. Annual model performance did not improve even when the models were parameterized using CostIAV. Furthermore, both models underestimated the peaks of diurnal GPP in each site–year, suggesting that improving predictions of peaks could yield comparatively better annual model performance. Both models simulated GPP better at forest sites than at grassland or savanna sites. Our findings reveal current model deficiencies in representing the IAV of carbon fluxes and can guide further model development.
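As a rough illustration of evaluating the same simulation at two temporal scales, the sketch below aggregates hourly GPP to annual totals, expresses IAV as deviations from the multi-year mean, and scores both scales. It is a minimal sketch under stated assumptions, not the procedure used in this study: the Nash-Sutcliffe efficiency is an assumed metric, the anomaly definition of IAV is an assumed convention, all function names are hypothetical, and the numbers are synthetic toy values rather than site data.

```python
import numpy as np


def annual_totals(hourly_gpp, years):
    """Sum hourly GPP within each year; returns (unique_years, totals)."""
    hourly_gpp = np.asarray(hourly_gpp, dtype=float)
    years = np.asarray(years)
    unique_years = np.unique(years)
    totals = np.array([hourly_gpp[years == y].sum() for y in unique_years])
    return unique_years, totals


def iav_anomalies(annual_values):
    """Assumed IAV definition: each year's deviation from the multi-year mean."""
    annual_values = np.asarray(annual_values, dtype=float)
    return annual_values - annual_values.mean()


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency (assumed metric): 1 is a perfect match,
    values at or below 0 are no better than predicting the observed mean."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    residual = np.sum((observed - simulated) ** 2)
    spread = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - residual / spread


# Synthetic toy data: three "years" of four time steps each (not real fluxes).
years = np.repeat([2001, 2002, 2003], 4)
observed = np.array([0, 2, 4, 2, 0, 3, 6, 3, 0, 1, 2, 1], dtype=float)

# A hypothetical simulation that matches the observations except at the peaks.
simulated = np.minimum(observed, 3.0)

_, observed_annual = annual_totals(observed, years)
_, simulated_annual = annual_totals(simulated, years)

print("observed IAV anomalies:", iav_anomalies(observed_annual))
print("simulated IAV anomalies:", iav_anomalies(simulated_annual))
print("hourly NSE:", nse(observed, simulated))
print("annual NSE:", nse(observed_annual, simulated_annual))
```

In this toy series, capping the peaks lowers the annual score more than the hourly one, because the missed peak flux accumulates most in the year with the largest observed total. This shows only the arithmetic of the two-scale comparison and says nothing about the magnitude of the effect at real sites.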