Establishing the statistical context of the graphical analysis is the most significant step in any data graph exercise. The statistical context guides the choice of an appropriate data graph estimator from a taxonomy of tools.
For example, Cleveland (1993, Visualizing Data) suggests thinking about the dimensions of variation when defining the graph objective, classifying the context as univariate, multivariate, hypervariate or multiway statistical variation.
Univariate questions suggest the use of quantile plots, normal probability plots, box plots, kernel density estimation and histograms, or bar charts if the data is categorical. Multivariate questions suggest the use of scatter plots and smoothing estimators, together with clever use of retinal variables. Hypervariate questions need more advanced encoding strategies, such as small multiples and parallel coordinate plots. Multiway questions arise with categorical data and cross-classifications.
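The mapping from type of variation to candidate graphs described above can be written down as a small lookup table. The sketch below is only an illustration: the names `CANDIDATE_GRAPHS` and `suggest_graphs` are hypothetical, and treating bar charts as the sole suggestion for categorical univariate data is a simplifying assumption. No tool is listed for multiway questions because the text does not name one.

```python
from __future__ import annotations

# Hypothetical lookup table. The tool names come from the paragraph above;
# the data structure itself is an illustrative assumption.
CANDIDATE_GRAPHS = {
    "univariate": [
        "quantile plot",
        "normal probability plot",
        "box plot",
        "kernel density estimate",
        "histogram",
    ],
    "multivariate": ["scatter plot", "smoothing estimator", "retinal variables"],
    "hypervariate": ["small multiples", "parallel coordinate plot"],
    "multiway": [],  # the text names no specific tool for multiway questions
}


def suggest_graphs(variation: str, categorical: bool = False) -> list[str]:
    """Return candidate graph types for a given type of statistical variation."""
    key = variation.lower()
    if key not in CANDIDATE_GRAPHS:
        raise ValueError(f"unknown type of variation: {variation!r}")
    # Assumption: categorical univariate data is summarised with a bar chart.
    if key == "univariate" and categorical:
        return ["bar chart"]
    return list(CANDIDATE_GRAPHS[key])


if __name__ == "__main__":
    print(suggest_graphs("univariate"))
    print(suggest_graphs("univariate", categorical=True))
    print(suggest_graphs("hypervariate"))
```

Such a table is a starting point for a shortlist, not a substitute for judgement about the graph objective.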
A taxonomy of context
It is useful to think about which types of dimensions are involved in the univariate, multivariate or hypervariate questions. For example, defining the graph objective as a temporal question (i.e. asking when?) requires the use of a time frequency variable. Alternatively, the graph objective can be a spatial question (i.e. asking where?), which may require the use of geographical or topographical maps.
Below I provide a taxonomy of context that has helped me conceptualise the specification of the graph objective and choose appropriate data graph tools:
- Distributional analysis of a single variable
- Correlation analysis between variables
- Comparative analysis of magnitudes and differences
- Compositional analysis of how components make up a total
- Relational connectivity analysis
- Temporal analysis
- Stock-and-flow analysis on how flows accumulate into stocks
- Spatial analysis
The proposed taxonomy is certainly not exhaustive. Often data graphs combine different contexts, e.g. an aggregation graph may have a temporal dimension.
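One way to record that a single graph can belong to more than one context is to treat the taxonomy as a set of combinable flags. The following is a minimal sketch under that assumption; the `GraphContext` class and the `describe` helper are hypothetical names introduced for illustration only.

```python
from __future__ import annotations

from enum import Flag, auto


class GraphContext(Flag):
    """The eight contexts of the taxonomy above, as combinable flags."""

    DISTRIBUTIONAL = auto()
    CORRELATION = auto()
    COMPARATIVE = auto()
    COMPOSITIONAL = auto()
    RELATIONAL = auto()
    TEMPORAL = auto()
    STOCK_AND_FLOW = auto()
    SPATIAL = auto()


def describe(context: GraphContext) -> list[str]:
    """List the individual contexts contained in a (possibly combined) context."""
    return [member.name.lower() for member in GraphContext if member in context]


if __name__ == "__main__":
    # A graph showing how a total is composed and how that changes over time.
    combined = GraphContext.COMPOSITIONAL | GraphContext.TEMPORAL
    print(describe(combined))
```

Writing the contexts down explicitly makes it easier to notice when a planned graph is trying to serve too many of them at once.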
One graph for every objective
Regardless, always remember that data graphing is not meant to be a story-telling odyssey. Each data graph must focus on a single objective and convey one sharp message concisely. If you want to tell many stories, make many graphs.