Establishing the statistical context of the graphical analysis is the most significant step in any data graph exercise, because that context guides the choice of an appropriate data graph estimator from a taxonomy of tools.
A taxonomy of context
Cleveland (1993, Visualizing Data) suggests thinking about the dimensions of variation when defining the graph objective, classifying the statistical context as univariate, multivariate, hypervariate or multiway variation. Univariate questions suggest quantile plots, distribution plots and box plots, or bar charts when the data are categorical. Multivariate questions suggest scatter plots with smoothing estimators, together with careful use of retinal variables (visual encodings such as colour, size and shape). Hypervariate questions need more advanced encoding strategies, such as small multiples and parallel coordinate plots. Multiway questions arise with categorical data and cross-classifications.
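The mapping from context to candidate graphs described above can be written down as a simple lookup. The following is only a minimal sketch: the names `GRAPH_CANDIDATES` and `suggest_graphs` are hypothetical, the entries are taken directly from the paragraph above, and the multiway entry is left empty because the text names no specific graph type for it.

```python
from typing import Dict, List

# Candidate graph types per statistical context, as listed in the text above.
# The dictionary name and layout are illustrative assumptions, not an established API.
GRAPH_CANDIDATES: Dict[str, List[str]] = {
    "univariate": [
        "quantile plot",
        "distribution plot",
        "box plot",
        "bar chart (categorical data)",
    ],
    "multivariate": ["scatter plot", "scatter plot with smoother"],
    "hypervariate": ["small multiples", "parallel coordinate plot"],
    # The text associates multiway questions with categorical data and
    # cross-classifications but names no graph type, so none is listed here.
    "multiway": [],
}


def suggest_graphs(context: str) -> List[str]:
    """Return the candidate graph types for a statistical context."""
    key = context.strip().lower()
    if key not in GRAPH_CANDIDATES:
        raise ValueError(f"unknown statistical context: {context!r}")
    # Return a copy so callers cannot modify the lookup table by accident.
    return list(GRAPH_CANDIDATES[key])
```

Calling `suggest_graphs("hypervariate")` returns the two encoding strategies named in the text. The lookup is a starting point for choosing a graph, not a substitute for stating the graph objective.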
It is also useful to think about which dimensions are involved in the univariate, multivariate or hypervariate questions. For example, defining the graph objective as a temporal question (i.e. asking when?) requires a time variable observed at some frequency. Defining the graph objective as a spatial question (i.e. asking where?) may require geographical or topographical maps.
Below, I provide a taxonomy of statistical contexts that can help in specifying the graph objective and choosing appropriate data graph tools:
- Distributional analysis of a single variable
- Correlation analysis between variables
- Comparative analysis of magnitudes and differences
- Compositional analysis of how components make up a total
- Relational connectivity analysis
- Temporal analysis
- Stock-and-flow analysis on how flows accumulate into stocks
- Spatial analysis
The proposed taxonomy is certainly not exhaustive. Data graphs often combine several contexts; for example, an aggregation graph may also have a temporal dimension, as in the analysis of the US federal tax revenue.
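To make one entry of the taxonomy concrete, the stock-and-flow context concerns how flows accumulate into stocks, and it naturally combines with the temporal context when flows are recorded per period. The sketch below assumes net flows per period and an opening stock. The function name `stock_series` and the numbers in the usage note are made up for illustration.

```python
from itertools import accumulate
from typing import List, Sequence


def stock_series(initial_stock: float, flows: Sequence[float]) -> List[float]:
    """Return the stock at the end of each period.

    Each stock equals the previous stock plus that period's net flow,
    starting from the opening stock (which is not included in the result).
    """
    running = accumulate(flows, initial=initial_stock)
    # Drop the first element, which is the opening stock itself.
    return list(running)[1:]
```

For an opening stock of 100 and illustrative net flows of 10, -5 and 20, `stock_series(100, [10, -5, 20])` returns `[110, 105, 125]`. Plotting the flows as bars and the stock as a line over the same time axis is one way to show both in a single graph.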
One graph for every objective
Regardless, always remember that data graphing is not meant to be a story-telling odyssey. A data graph must have a focused objective and convey a sharp message concisely. If you want to tell many stories, make many graphs.