CARMA Notes: Data Accuracy
Although CARMA incorporates all known major public disclosure databases, the majority of the site’s data is necessarily estimated using statistical models. This will hopefully change in the future as governments and companies become more open about the source of global warming pollution, but for now estimates are unavoidable. So, how accurate are CARMA’s model estimates?
This question is addressed in detail in a technical paper describing the CARMA methodology and results. Here I want to share some of the main findings and highlight important caveats related to use of the data.
As detailed elsewhere, CARMA’s models are fit to a high-resolution dataset of U.S. plant performance. The models then predict the electricity production and CO2 emissions of plants outside the U.S. for which publicly disclosed data is not available. As part of the CARMA technical paper, an analysis was undertaken to estimate the likely accuracy of the model output. Overall, the models do a better job of predicting the carbon intensity of a given plant (kg CO2 per MWh) and have more difficulty accurately predicting total electricity generation. For example, it is estimated that, for CO2-emitting plants with estimated values, CARMA reports CO2 intensity that is within 20% of the true value about 60% of the time. But for electricity generation, the reported value is within 20% of the true value only about 40% of the time.
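To make the "within 20% of the true value" measure concrete, here is a minimal Python sketch of how such a share could be computed. The function name, the handling of zero values, and the five-plant dataset are illustrative assumptions, not CARMA's actual code or data.

```python
def share_within_tolerance(predicted, actual, tolerance=0.20):
    """Fraction of plants whose predicted value is within `tolerance`
    (as a relative error) of the true value."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    hits = 0
    counted = 0
    for p, a in zip(predicted, actual):
        if a == 0:
            # Relative error is undefined for a true value of zero;
            # skipping these plants is an assumption of this sketch.
            continue
        counted += 1
        if abs(p - a) / abs(a) <= tolerance:
            hits += 1
    return hits / counted if counted else float("nan")


# Hypothetical CO2 intensities (kg CO2 per MWh) for five plants.
actual_intensity = [1000.0, 900.0, 450.0, 800.0, 600.0]
predicted_intensity = [950.0, 1150.0, 430.0, 820.0, 300.0]

share = share_within_tolerance(predicted_intensity, actual_intensity)
print(f"{share:.0%} of plants within 20% of the true value")
```

With this made-up data, three of the five predictions fall inside the 20% band, so the share is 60%. The same function applies to generation figures by passing MWh values instead of intensities.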
Why is predicting the amount of electricity generated by a given plant in a given year so difficult? The short answer is that utilization of many plants jumps around from year to year (i.e., high inter-annual variability) for reasons that cannot be easily observed or modeled. For example, the CARMA technical paper analyzes how annual generation changed between 2009 and 2010 for ~5,000 U.S. power plants that showed no change in engineering characteristics. Nearly 50% of the plants saw annual generation change by at least 20% between 2009 and 2010, and about 30% saw a change of at least 40%. Remember, the variables that CARMA’s models are able to use have not changed for these plants – but generation is still jumping around from year to year. This variability makes it fundamentally difficult to detect clear patterns or “rules” that the models can use to precisely predict performance when public data is not available.
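The year-over-year comparison described above can be sketched in a few lines of Python. This is an illustration only: the plant IDs and generation figures are invented, and measuring change relative to the earlier year is an assumption, since the exact definition used in the technical paper is not given here.

```python
def share_with_large_change(gen_prev, gen_next, threshold=0.20):
    """Fraction of plants whose annual generation changed by at least
    `threshold`, measured relative to the earlier year (an assumption)."""
    changes = []
    for plant_id, prev in gen_prev.items():
        nxt = gen_next.get(plant_id)
        if nxt is None or prev <= 0:
            # Skip plants missing in the later year or with no baseline output.
            continue
        changes.append(abs(nxt - prev) / prev)
    if not changes:
        return float("nan")
    return sum(c >= threshold for c in changes) / len(changes)


# Hypothetical annual generation (MWh) for four plants in two consecutive years.
gen_2009 = {"A": 1000.0, "B": 500.0, "C": 200.0, "D": 4000.0}
gen_2010 = {"A": 1100.0, "B": 250.0, "C": 270.0, "D": 4100.0}

share_20 = share_with_large_change(gen_2009, gen_2010, threshold=0.20)
share_40 = share_with_large_change(gen_2009, gen_2010, threshold=0.40)
print(f"changed by >= 20%: {share_20:.0%}, changed by >= 40%: {share_40:.0%}")
```

In this toy example, two of the four plants change by at least 20% and one changes by at least 40%, even though nothing about the plants themselves is assumed to differ between the two years.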
When we consider these difficulties, the CARMA models are actually performing reasonably well. For example, an “ideal” model, given the range of variables available to CARMA and accounting for inter-annual variability, would likely predict annual generation to within 20% of the true value in about 55% of cases. The evidence suggests that the CARMA v3.0 models currently achieve that level of accuracy for slightly more than 40% of plants. And whereas an ideal model could be expected to be within 40% of the true value for about 70% of plants, CARMA does so in more than 60% of cases. Overall, that’s pretty decent model performance.
It’s also clear that accuracy depends on the type of power plant in question. In general, larger plants are easier to predict than smaller ones. And coal power plants – owing to their predominant and more consistent use as base-load electricity providers – exhibit greater model accuracy than other fuel types. Conversely, smaller and/or gas- or oil-based units are likely to see higher prediction errors. Hydroelectric power plants are a mixed bag since performance in any given year is highly dependent on local weather conditions that are not observed by CARMA’s models.
On the plus side, CARMA’s estimates can be fairly interpreted as reasonable long-term performance metrics. CARMA’s models show no evidence of systematic bias, so while estimates for any particular year may exhibit significant error, the long-term performance of most plants is likely consistent with the model predictions. This is especially true of larger plants. Measures of typical, long-term performance for larger facilities (existing and planned) are, perhaps, the most relevant information for many real-world applications of CARMA. In addition, prediction of CO2 intensity – an equally useful metric for many CARMA users – is shown to be quite feasible and exhibits relatively low error.
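The point about long-term performance can be illustrated with a small Python sketch. The numbers are entirely hypothetical: one plant, a single model estimate, and five years of invented "true" generation that fluctuates around a stable average. It shows how an estimate can miss badly in individual years yet sit close to the multi-year mean.

```python
def relative_error(predicted, actual):
    """Absolute error of the prediction as a fraction of the true value."""
    return abs(predicted - actual) / actual


# Hypothetical model estimate of annual generation (MWh) for one plant.
predicted_generation = 1040.0

# Hypothetical true generation (MWh) over five years; values are invented.
actual_by_year = {1: 800.0, 2: 1250.0, 3: 950.0, 4: 1300.0, 5: 700.0}

# Error of the single estimate against each individual year.
yearly_errors = {
    year: relative_error(predicted_generation, actual)
    for year, actual in actual_by_year.items()
}

# Error of the same estimate against the multi-year average.
long_run_mean = sum(actual_by_year.values()) / len(actual_by_year)
long_run_error = relative_error(predicted_generation, long_run_mean)

for year, err in yearly_errors.items():
    print(f"year {year}: error {err:.1%}")
print(f"multi-year mean: error {long_run_error:.1%}")
```

Here the single-year errors range from under 10% to nearly 50%, while the error against the five-year average is 4%. That is only what one would expect when year-to-year swings are not systematically in one direction; it is not evidence about any particular CARMA estimate.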