Evaluating the influence of anthropogenic emissions changes on air quality requires accounting for meteorological variability. Statistical methods such as multiple linear regression (MLR) models with basic meteorological variables are often used to remove meteorological variability and estimate trends in measured pollutant concentrations attributable to emissions changes. However, the ability of these widely used statistical approaches to correct for meteorological variability remains unknown, limiting their usefulness in real-world policy evaluations. Here, we quantify the performance of MLR and other quantitative methods using two scenarios simulated by a chemical transport model, GEOS-Chem, as a synthetic dataset. Focusing on the impacts of anthropogenic emissions changes in the US (2011 to 2017) and China (2013 to 2017) on PM2.5 and O3, we show that widely used regression methods do not perform well in correcting for meteorological variability and identifying long-term trends in ambient pollution related to changes in emissions. The estimation errors, characterized as the differences between meteorology-corrected trends and emission-driven trends under constant-meteorology scenarios, can be reduced by 30% to 42% using a random forest model that incorporates both local- and regional-scale meteorological features. We further design a correction method based on GEOS-Chem simulations with constant emission input and quantify the degree to which emissions and meteorological influences are inseparable due to their process-based interactions. We conclude by providing recommendations for evaluating the effectiveness of emissions reduction policies using statistical approaches.
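
To make the error metric concrete, the following is a minimal sketch of an MLR meteorological correction and of the estimation error defined above (meteorology-corrected trend minus emission-driven trend). It does not use GEOS-Chem output or reproduce the study's models. The synthetic series, the two covariates (`temp`, `wind`), all coefficients, and the helper names `linear_trend` and `mlr_corrected_trend` are hypothetical choices made for illustration. Because the synthetic concentration is built as a purely linear function of meteorology, MLR recovers the trend well here; that is an artifact of the toy setup, not evidence about real-world performance.

```python
import numpy as np


def linear_trend(t, y):
    """OLS slope of y against time t (the uncorrected trend)."""
    design = np.column_stack([np.ones_like(t), t])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


def mlr_corrected_trend(t, y, met):
    """Slope on t when y is regressed on time plus meteorological covariates."""
    design = np.column_stack([np.ones_like(t), t, met])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


rng = np.random.default_rng(0)
n_days = 5 * 365
t = np.arange(n_days) / 365.0  # time in years

# Hypothetical meteorology: seasonal temperature with a slow drift, random wind.
temp = 15.0 - 10.0 * np.cos(2 * np.pi * t) + 1.0 * t + rng.normal(0.0, 2.0, n_days)
wind = 3.0 + rng.normal(0.0, 1.0, n_days)

# Hypothetical emission-driven trend (concentration units per year).
true_emission_trend = -2.0

# Synthetic concentration: emission trend plus linear meteorological effects.
conc = (40.0 + true_emission_trend * t
        + 0.4 * temp - 1.5 * wind
        + rng.normal(0.0, 1.0, n_days))

raw_trend = linear_trend(t, conc)
corrected_trend = mlr_corrected_trend(t, conc, np.column_stack([temp, wind]))

# Estimation error: estimated trend minus the emission-driven trend.
raw_error = raw_trend - true_emission_trend
corrected_error = corrected_trend - true_emission_trend

print(f"raw trend error:       {raw_error:+.3f}")
print(f"corrected trend error: {corrected_error:+.3f}")
```

In this toy case the temperature drift biases the uncorrected trend, and including the covariates removes most of that bias. With nonlinear or regional-scale meteorological effects, the same linear correction would be expected to leave a larger residual error.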