* Corey Dethier is a postdoctoral researcher at the Minnesota Center for Philosophy of Science. He is interested in climate change, data visualization, and scientific testimony. This post is based on an article that is now available (open access!) at Noûs. He writes…
In testimony before Congress in 2015 and 2016, the contrarian climate scientist John Christy argued that the consensus view within the science— that the earth is warming and we’re responsible— is false.
Central to his argument was a series of graphs, such as this one:
There’s a lot going on here. For now, notice the gap between the big red line— the model predictions— and the blue and green dots, which represent observations. What Christy’s graph shows, allegedly, is that there’s a big gap between models of climate and empirical observations.
Christy’s testimony was highly controversial for a number of reasons. For one thing, the graph above focused on one layer of the atmosphere— the mid-troposphere— and the models were much better at predicting anything else. So Christy cherry-picked the worst dataset to focus on. There were also concerns about the dataset and Christy’s treatment of it, as documented by Benjamin Santer and colleagues almost a decade earlier (see Santer et al. 2008, 2009).
But the graph itself was also controversial. To see why, it’s helpful to look at another graph, created by climate scientist Gavin Schmidt using essentially the same data.
Here the shaded area is the confidence interval given by the models; the colored lines are Christy’s favored datasets. In Schmidt’s graph, unlike Christy’s, the two almost entirely overlap. There are only a couple of points where the observations fall outside of the confidence interval— and we’d expect a couple of extreme years in an almost 40-year period, so that’s not too surprising.
How is this possible? How can we have two graphs created using the same data, but that appear to show entirely different things? Are these just different “perspectives” on the data? Is one of them right? (What does it mean to say that a graph is “right”?) If the other one is wrong, is it false, or just misleading?
These are the kinds of questions that philosophers are interested in. Traditionally, however, philosophers have paid very little attention to graphs when asking them. (Some exceptions include Perini 2005, Kulvicki 2010, Irving 2011, and a forthcoming paper of my own.) I think this is too bad, because there’s valuable work that philosophers can do here, including by bringing clarity to this particular debate.
The content of a graph
To display information graphically, we need to do two things. First, we need to turn the relevant information into a mathematical object or set of mathematical objects— objects like coordinate pairs, lines, or regions. Second, we need to fix a “frame” that allows us to see the original message in the mathematical object. This is all easier with an example. Here’s a graph I created using data from the National Centers for Environmental Information (NCEI):
This graph displays surface temperatures relative to the 20th century average between May 1850 and April 2024. We can think of the process of creating this graph in the following way. I started with a list of sentences:
Between May 1850 and April 1851, the mean global temperature was 0.16 degrees Celsius below the 20th century average.
…
Between May 2023 and April 2024, the mean global temperature was 1.29 degrees Celsius above the 20th century average.
Then I turned this list of sentences into coordinate pairs:
(1851, -0.16)
…
(2024, 1.29)
Finally, I chose an x-axis and a y-axis and plotted all of the coordinate pairs, using columns to indicate the location of each.
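For the programmatically inclined, here’s a minimal sketch of that pipeline in Python with matplotlib. Only the first and last coordinate pairs are real (they’re the ones quoted above); the intervening years are omitted:

```python
import matplotlib.pyplot as plt

# One coordinate pair per sentence: (year the May-April window ends,
# anomaly in degrees Celsius relative to the 20th century average).
# Only the two pairs quoted in the post appear here; the rest are omitted.
data = [
    (1851, -0.16),
    (2024, 1.29),
]

years = [year for year, _ in data]
anomalies = [anomaly for _, anomaly in data]

fig, ax = plt.subplots()
ax.bar(years, anomalies)                     # columns mark each pair's location
ax.axhline(0, linewidth=0.8, color="black")  # the 20th century baseline
ax.set_xlabel("Year (May-April)")
ax.set_ylabel("Anomaly vs. 20th century average (°C)")
plt.show()
```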
The result is a kind of picture that allows you to (quite literally) see the original facts just by looking at it— that’s what’s great about graphs. And this basic example suggests a principle, also essentially endorsed by Laura Perini (2005), for understanding what counts as the content or “meaning” of a graph: in simple cases like this one, at least, the meaning of the graph is just the same as the meaning of the list of sentences that I started with.
This might seem obvious, but it’s quite a fruitful observation. It allows us to say, for example, that the following graph is simply false:
Why? Because it has the same meaning or content as a list of sentences, one of which says that the mean global temperature between May 2023 and April 2024 was 0.20 degrees Celsius below the 20th century average. Since that sentence is false, and (part of) our graph has the same content as it, the graph is false too.
But there are other ways that a graph can go wrong. Here’s an example:
If you look carefully, you’ll see that all of the data are correctly plotted— if you translated this graph into a set of sentences, you’d get only true sentences. So this graph isn’t false. But it’s still misleading: it makes it look like temperatures are falling when they’re really going up. (This example is based on an infamous graph of gun deaths in Florida.) So a graph can go wrong if it “frames” the information in the wrong kind of way.
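The Florida graph achieved its effect by flipping the y-axis. Here’s a sketch of the trick, with made-up numbers; the point is that the very same true data can be framed so that a rising series looks like a falling one:

```python
import matplotlib.pyplot as plt

# Made-up, steadily rising anomalies (hypothetical numbers).
years = [2000, 2005, 2010, 2015, 2020]
anomalies = [0.2, 0.35, 0.5, 0.7, 0.9]

fig, (plain, flipped) = plt.subplots(1, 2)

plain.plot(years, anomalies)
plain.set_title("Conventional frame: rising")

flipped.plot(years, anomalies)  # exactly the same true data...
flipped.invert_yaxis()          # ...but with the y-axis flipped
flipped.set_title("Inverted axis: apparently falling")

plt.show()
```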
Now it turns out that the difference between the Christy and Schmidt graphs is (mostly) a difference about frames. Let’s go back to our original example to see this.
The debate over “baselines”
We need just a little bit of background on climate science to understand the difference between Christy and Schmidt. The key fact to know is that climate scientists measure temperatures in relative terms, by way of what are called anomalies.
There are two main reasons for this. First, a hot summer in Minneapolis is not the same as a hot summer in Phoenix. If we want to know whether it’s generally getting hotter across the U.S. (let alone the world), we need to correct for these differences. So rather than measuring in absolute terms, we measure relative to local averages. The second reason is that the main predictions generated by climate models are about these same anomalies.
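In code, the move from absolute temperatures to anomalies is just a subtraction. A sketch, with invented numbers for the two cities:

```python
import numpy as np

# Hypothetical July mean temperatures (°C) for two cities:
minneapolis = np.array([22.1, 23.0, 22.5, 24.1, 24.8])
phoenix     = np.array([34.0, 34.6, 34.2, 35.5, 36.1])

# In absolute terms the two series aren't comparable. As anomalies,
# i.e. departures from each city's own average, they are: both now
# answer "how much hotter than usual was it *here*?"
minneapolis_anomalies = minneapolis - minneapolis.mean()
phoenix_anomalies     = phoenix - phoenix.mean()
```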
This fact is important for the following reason. If you want to graph temperature anomalies, you can’t just use absolute temperatures as your y-axis and throw the data up there— after all, each set of data is given in anomalies, not absolute temperatures. So if we’re going to graph either set of data, we need the y-axis to be relative temperatures, and if we’re going to graph both on the same y-axis, we need both sets of temperatures to be relative to a shared reference point, called a “baseline.”
Here’s a very simple example. One way we could determine the shared baseline is by taking the average anomaly in both the models and observations and setting both averages to 0. Here’s a very simple graph of temperature trends in the mid-troposphere using that baseline:
We don’t have to do things that way, of course. Maybe we instead want to set the baseline by using the first five or last five years in the data set. The results are the following graphs:
Notice something about these last two graphs— they seem to say very different things! The graph on the left appears to say that the models used to be very accurate but they’re getting much worse. The one on the right appears to say that the models used to be very inaccurate but they’re getting much better.
These are illusions. Or, more precisely, they’re artefacts. These data don’t tell us whether the models are getting better or worse. The lines on these graphs are just trend lines; what we’re seeing are the differences in the rate of change of the model predictions as opposed to the observations. The same is true of both Christy’s and Schmidt’s graphs: we can learn something about the relationship between the models and observations by looking at differences in the patterns, but the gap or space between the two lines at any particular point is a function of our choices about what baseline to use, not the data itself.
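To see the artefact in miniature, here’s a sketch with two synthetic series standing in for the models and the observations. The warming rates (0.03 and 0.02 °C per year) are invented for illustration; the point is that re-baselining just subtracts a constant from each series, shifting the curves up or down without changing their slopes:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
years = np.arange(1979, 2017)

# Synthetic stand-ins (invented numbers): a "model" warming at
# 0.03 °C/yr and "observations" warming at 0.02 °C/yr.
model = 0.03 * (years - 1979) + rng.normal(0, 0.05, years.size)
obs   = 0.02 * (years - 1979) + rng.normal(0, 0.05, years.size)

fig, (left, right) = plt.subplots(1, 2, sharey=True)
for ax, ref, title in ((left, slice(0, 5), "Baseline: first five years"),
                       (right, slice(-5, None), "Baseline: last five years")):
    # Re-baselining = subtracting a constant; the slopes are untouched.
    ax.plot(years, model - model[ref].mean(), label="models")
    ax.plot(years, obs - obs[ref].mean(), label="observations")
    ax.set_title(title)
    ax.legend()
plt.show()
```

Run it and you’ll see the left panel’s gap “open up” over time and the right panel’s gap “close,” even though the underlying series are identical in both.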
The main reason (more on this below) why Christy’s graph seems to show a large gap while Schmidt’s shows a dramatic overlap is this kind of visual artefact. And this is also the heart of the critique that Schmidt and others level at Christy: when Christy presents his graph in support of the claim that the models are wrong, the graph appears to support that claim much better than it actually does, due largely to artefacts— artefacts that Christy steers into rather than away from.
In philosophical terminology, in other words, Schmidt’s complaint is about the pragmatics of Christy’s graph rather than the semantics; it’s about what the graph communicates rather than what it (literally) means.
Of course, Christy and his supporters have their own arguments. As they point out, using a baseline close to the middle of the time series can give the (false!) impression that the errors cancel out. So if you look back up at my graph again, it looks like half of the errors involve the models being too cold, and half involve them being too hot. On the whole, then, they’re about right. This is just as much of an artefact as the appearance of convergence or divergence.
Editorializing for a moment, I don’t find this latter argument as compelling as Schmidt’s. For one thing, Schmidt and others don’t rely on the appearance of canceling out in the same way that Christy relies on the appearance of divergence. For another, if your goal is to avoid the appearance of errors canceling out, the graph that appears to show convergence is just as good as the one that appears to show divergence. The fact that contrarians consistently use the latter rather than the former suggests that the misleading aspects of the graph are to a large extent the point.
The mislabeled axis
As I said, the phenomenon identified above is the main source of the divergence between the graphs we started with. But it’s not the whole story— it can’t be, because Christy doesn’t actually use a baseline, at least not in the traditional sense. Let’s look at his graph again:
Notice the box on the left side of the graph, which says “The linear trend (based on 1979–2015 only) of all time series intersects at zero in 1979.” To me, this looks like an empirical claim. It’s not. It’s a methodological stipulation.
Essentially, Christy’s approach is the following. He finds the slopes of the relevant lines— for the models, the average thereof, the observations, etc.— starts the virtual analogue of his pen at (1979,0), and then draws the lines starting from there. Or, more simply: he finds the line of best fit and then shifts his data so that the line intersects the x-axis in 1979.
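In code, the recipe would look something like the following. To be clear, this is my reconstruction of the procedure as described, not Christy’s own implementation:

```python
import numpy as np

def trend_zeroed_at(years, series, anchor_year=1979):
    """Shift a series so that its linear trend line passes through
    zero at anchor_year (a reconstruction of the described method,
    not Christy's actual code)."""
    slope, intercept = np.polyfit(years, series, deg=1)
    trend_at_anchor = slope * anchor_year + intercept
    return series - trend_at_anchor

# Note that trend_at_anchor is the trend line's value in 1979, which
# in general is NOT the temperature actually recorded in 1979.
```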
Schmidt notices this, commenting “To my knowledge this is a unique technique and I’m not even clear on how one should label the y-axis.” As the title of this subsection indicates, I want to go further: I think the y-axis is mislabeled and the graph is thus false.
Recall the quick sketch of graphical content that I gave above, according to which the content of a graph is just the same as the content of a set of sentences. We can find the relevant set of sentences by “de-coding” the graph, working backwards from the graph to a description of the world according to the frame that’s provided by the axes, labels, legend, etc. Here’s a very simple example that should make the point clear:
In this graph, we have three points plotted: (1979,0), (1980,.239), and (1981,.092). Our baseline is given by the temperature recorded in 1979— the y-axis tells us how far a temperature recording departs from the temperature recorded in that year. So: the point at (1980,.239) means that in the year 1980, there was a .239°C departure from the temperature in 1979. The same is true with other baselines. If our baseline were the average of the period 1979–2023, placing a point at (1980,.239) would mean that in 1980 there was a .239°C departure from the average temperature in the period 1979–2023.
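The decoding rule is mechanical enough to write down. A toy version, with the baseline supplied explicitly as part of the frame:

```python
def decode(point, baseline):
    """Translate one plotted point back into the sentence it encodes,
    given the baseline fixed by the graph's frame. (A toy illustration,
    not a full theory of graphical content.)"""
    year, value = point
    direction = "above" if value >= 0 else "below"
    return f"In {year}, the temperature was {abs(value)}°C {direction} {baseline}."

print(decode((1980, .239), "the temperature recorded in 1979"))
# In 1980, the temperature was 0.239°C above the temperature recorded in 1979.
```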
The same is not true with Christy’s method. If we adopt his approach, placing a point at (1980,.239) means that in 1980 there was a .239°C departure from what is essentially an arbitrary constant— namely, the value taken on by the trend line in 1979. Importantly, the value taken on by the trend line in 1979 is not the same as the temperature actually measured in 1979; the two have only a tenuous relationship. In effect, adopting Christy’s method leads us to a graph that depicts what philosophers would call “counterfactual” temperatures: when we use this method, placing a point at (1980,.239) means: if in 1979 we had observed the temperature taken on by the trend line in 1979, then in 1980 we would have observed a .239°C departure from the temperature in 1979.
Now, while that’s what Christy’s method yields, that’s not what his graph says. Christy gives no indication that what’s pictured is essentially a fictional universe in which the recorded temperatures are different from the temperatures that were actually recorded. Instead, the axes of the graph are labeled in temperatures and years; any normal reader will naturally interpret those as actual temperatures and actual years rather than fictional or counterfactual ones. So when we “de-code” Christy’s graph according to the axes as he labeled them, we end up with false claims about the relationship between actual years and actual temperatures. Hence my claim: the axes are simply mislabeled, and (as such) the graph is simply false.
Conclusion
Ultimately, the point I’m making in this last section is a fairly technical one. Christy’s graph is false, but it’s false for largely pedantic reasons; there isn’t really that much difference to be seen in the resulting graph between the approach Christy actually uses and a baseline of (say) 1979. The point made by Schmidt is much more important: even if Christy’s graph had accurate content, it would still be seriously misleading.
Nevertheless, the result is interesting. For one thing, we’ve seen that Schmidt’s offhand puzzlement about axis labels was on point: Christy’s y-axis really is mislabeled, because the values he plots aren’t really temperatures, or at least not temperatures recorded in the real world. And we’ve also seen something else, namely how just a little bit of philosophy can go a long way in helping us understand graphs and how they work.
References
Christy, J. 2015. Testimony before Congress. url: https://www.commerce.senate.gov/2015/12/data-or-dogma-promoting-open-inquiry-in-the-debate-over-the-magnitude-of-human-impact-on-earth-s-climate
Dethier, C. In press. “How do you Assert a Graph? Towards an account of depictions in scientific testimony.” Noûs. https://onlinelibrary.wiley.com/doi/10.1111/nous.12529
Irving, Z. 2011. “Style, but Substance: An Epistemology of Visual versus Numerical Representation in Scientific Practice.” Philosophy of Science. doi: 10.1086/662567
Kulvicki, J. 2010. “Knowing with Images: Medium and Message.” Philosophy of Science. doi: 10.1086/651321
Perini, L. 2005. “The Truth in Pictures.” Philosophy of Science. doi: 10.1086/426852
Santer, B. et al. 2008. “Consistency of Modeled and Observed Temperature Trends in the Tropical Troposphere.” International Journal of Climatology. doi: 10.1002/joc.1756
Santer, B. et al. 2009. “Incorporating Model Quality Information in Climate Change Detection and Attribution Studies.” Proceedings of the National Academy of Sciences. doi: 10.1073/pnas.0901736106
Schmidt, G. 2014. “Absolute Temperatures and Relative Anomalies.” url: https://www.realclimate.org/index.php/archives/2014/12/absolute-temperatures-and-relative-anomalies/
Schmidt, G. 2016. “Comparing Models to the Satellite Datasets.” url: https://www.realclimate.org/index.php/archives/2016/05/comparing-models-to-the-satellite-datasets/ (visited on 04/02/2023)
For more recent Extinct content on climate change:
Federica Bocchi: “Sizing up the Biodiversity Crisis”
Aja Watkins: “Is Contemporary Climate Change Really Unprecedented?”
Joe Wilson: “Stable Isotopes in Unstable Times: Harold Urey’s Paleothermometer and the Nature of Proxy Measurement”