External identification requires the following information:
- Graph title describing the graph objective, and optionally a subtitle describing the data context.
- Caption acknowledging the data source, any data exclusions, and how to decode unfamiliar aspects of the graph.
- Axes titles describing the variable and its unit of measurement.
- Axes labels describing the scale, the minimum and maximum, plus any threshold values and baselines.
- Added text summarising key results.
In fact, graphs without external identification do not meet the American Statistical Association (ASA) standards for graphical presentation:
ASA standard: The title of a diagram should be made as clear and complete as possible. Sub-titles or descriptions should be added if necessary to insure clearness.
ASA standard: Display axes titles to explain the form of measurement and the unit of measurement.
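The identification elements listed above can be sketched in code. The following is a minimal illustration in Python with matplotlib, not a prescribed template: the data, the function name `identified_plot`, the 40 MWh target line, and all wording on the figure are invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from statistics import mean

# Hypothetical data, invented purely for illustration.
MONTHS = list(range(1, 13))
ENERGY_MWH = [52, 51, 53, 50, 52, 51, 38, 36, 35, 37, 36, 34]


def identified_plot(months, energy):
    """Draw a line chart that carries all five identification elements."""
    fig, ax = plt.subplots(figsize=(7, 4.5))
    fig.subplots_adjust(top=0.82, bottom=0.24)  # room for title and caption
    ax.plot(months, energy, marker="o")

    # 1. Graph title (objective) and subtitle (data context), neutrally worded.
    fig.suptitle("Monthly energy use before and after the retrofit",
                 x=0.125, ha="left")
    ax.set_title("Hypothetical office building, January to December",
                 loc="left", fontsize=9)

    # 2. Axes titles: the variable and its unit of measurement.
    ax.set_xlabel("Month of year")
    ax.set_ylabel("Energy use (MWh)")

    # 3. Axes labels: scale, minimum and maximum, plus a threshold value.
    ax.set_xticks(months)
    ax.set_ylim(0, 60)
    ax.axhline(40, linestyle="--", linewidth=1)
    ax.text(12, 41, "Target: 40 MWh", ha="right", va="bottom", fontsize=8)

    # 4. Added text summarising the key result, computed from the data.
    before, after = mean(energy[:6]), mean(energy[6:])
    ax.text(1, 10,
            f"Mean use: {before:.1f} MWh (months 1-6), "
            f"{after:.1f} MWh (months 7-12)",
            fontsize=8)

    # 5. Caption: data source, exclusions, and how to decode the dashed line.
    fig.text(0.125, 0.04,
             "Source: invented data. Exclusions: none. "
             "The dashed line marks the target.",
             ha="left", fontsize=8)
    return fig, ax


fig, ax = identified_plot(MONTHS, ENERGY_MWH)
```

Calling `fig.savefig("identified_plot.png")` afterwards would write the figure to disk. Note that the title and the added text describe what is plotted without telling the reader what to conclude.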
Lastly, it is absolutely critical to avoid loaded narratives.
The audience naturally seeks explanation from the graph’s supporting identification. If the identification presents ‘loaded’ words, with priming or framing statements, then decoding will surely be biased.
Here is a good example of a loaded narrative:
The graph is taken from Messerli (2012) “Chocolate Consumption, Cognitive Function, and Nobel Laureates”, which claims that “Chocolate consumption enhances cognitive function, which is a sine qua non for winning the Nobel Prize, and it closely correlates with the number of Nobel laureates in each country” (p. 3).
I do not plan to delve into the validity of this argument, the author's apparent disregard for confounding variables, or the fact that the graph is incomplete and cherry-picks its data (many other chocolate-consuming countries, and many other countries with Nobel laureates, are not shown). The point I want to discuss here is that the graph's identification misleads for two reasons.
First, the note to the graph primes the reader to look for a correlation even where there is none. The author should have left it to the viewer to decide whether a correlation is visible.
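Second, the graph reports a statistical test for the Pearson correlation coefficient with a probability value less than 0.0001. This reinforces the misconception that the relation is strongly linear. A small probability value is only evidence against a zero correlation; it does not measure how strong or how linear the relation is, and in this graph the relation is clearly not strongly linear.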
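To see why a tiny probability value should not be read as a strong relation, here is a small simulation in standard-library Python. It is only a sketch under stated assumptions: the data are synthetic (not Messerli's), the slope of 0.1 and the sample size are arbitrary choices, the helper name `pearson_with_p` is hypothetical, and the probability value uses the Fisher z-transform normal approximation rather than the exact t-test.

```python
import math
import random


def pearson_with_p(x, y):
    """Return Pearson's r and an approximate two-sided p-value.

    The p-value uses the Fisher z-transform (a large-sample normal
    approximation), not the exact t-distribution.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    z = math.atanh(r) * math.sqrt(n - 3)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return r, p


random.seed(1)  # fixed seed so the run is repeatable
n = 10_000
x = [random.gauss(0, 1) for _ in range(n)]
# Synthetic y: a weak linear dependence on x, buried in unit-variance noise.
y = [0.1 * xi + random.gauss(0, 1) for xi in x]

r, p = pearson_with_p(x, y)
print(f"r = {r:.3f}, r squared = {r * r:.3f}, p = {p:.2g}")
```

With a sample this large, the probability value falls far below 0.0001 even though x explains only about one percent of the variance in y. Reporting the probability value alone therefore tells the viewer very little about how closely the points follow a line.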