The home page of www.graphworkflow.com offers a random palette of graphs for browsing content by type. This unstructured approach suits readers who only want to find out how to make one of these graphs: click on a graph, read the approach, and download the Stata code at the end of the page.

For structured learning, you must master the *Graph Workflow* model. Start by reading this page, then follow the links provided at the end of the page for further instructions.

## The model

All graphs featured on the www.graphworkflow.com homepage have been constructed by following the *Graph Workflow* model for data graphing, as described in the flowchart below. This flowchart is the essence of this website:

The *Graph Workflow* model for data graphing is inspired by work in graph theory and data visualisation, and is informed by experimental and cognitive psychology on visual perception. The workflow is generalisable to any visualisation function (not only data graphs), but is presented here with a strict focus on constructing powerful statistical data graphs, sometimes also referred to as ‘data plots’ or ‘data charts’.

Although the *Graph Workflow* is a conceptual framework for how one should approach data graphing, it retains a strictly pragmatic implementation. It brings structure by reconciling theory with practical tools and can be applied systematically across a wide range of data graphing questions. Adhering to the *Graph Workflow* ensures consistency in constructing high-quality data graphs that support accurate, fast and confident visual decoding.

## Credits

The *Graph Workflow* model is inspired primarily by Jacques Bertin’s *Sémiologie Graphique* (English translation: *Semiology of Graphics*), which has itself inspired much of the work on data graph design.

My notes are also heavily influenced by the lifetime work of John Tukey, William Cleveland, Howard Wainer, Leland Wilkinson, and Edward Tufte. Other notable work is cited throughout the material.

For working with graphs in Stata, anything written by Nick Cox is well worth reading; I have learned a great deal from his work.

## Key nomenclature

The terms *data visualisation* and *data graphing* are often used interchangeably. However, data visualisation is a broader class of visual information that subsumes data graphs, infographics, dashboards, storyboards and much more. Data graphing is the specific practice of producing a single statistical graph.

*Encoding* describes the process and choices made for displaying the data in a graph. *Decoding* describes the process of interpreting or perceiving the information that has been encoded in a data graph. Different encoding approaches lead to differences in how the information is decoded.

## Data graph definition

The definition of a data graph adopted in the *Graph Workflow model* abides by two key principles.

First, a data graph utilises information that is characterised by some degree of stochasticity. In other words, not everything is known about the way the data behaves, so graphing is employed as a means of enhancing our understanding of the variation in the data. Data graphs aid description and exploration of the data, and provide inferential analysis and diagnosis (see the end rule of the workflow). Put simply, we need graphs to turn complex data into actionable information. The definition of a data graph therefore excludes deterministic visualisation functions, such as process diagrams, organisational flow-charts, structures and layouts. Likewise, the graphing of known mathematical forms, e.g. a logit function, is only considered when overlaid with data. That is, the graph of a theoretical relation is not itself classified as a data graph, given its deterministic form, but it may be useful as a reference against which the data can be read.
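To illustrate the overlay idea, here is a minimal Stata sketch; it is not code from this site. The data are simulated, and the seed, the coefficients (0.5 and 1.2) and the variable names are assumptions chosen purely for demonstration. It overlays a theoretical logit curve on stochastic 0/1 outcomes, so the deterministic function serves only as a reference for the data.

```stata
* Simulated data for illustration only (all values are assumptions)
clear
set seed 12345
set obs 200
generate x = rnormal()
* Binary outcome drawn with a probability that follows a logit function of x
generate y = runiform() < invlogit(0.5 + 1.2*x)

* Overlay the known (deterministic) logit curve on the stochastic data points
twoway (scatter y x, jitter(5) msymbol(oh))                    ///
       (function y = invlogit(0.5 + 1.2*x), range(-3 3)),      ///
       ytitle("y and Pr(y = 1)") xtitle("x")                   ///
       legend(order(1 "Simulated data" 2 "Theoretical logit curve"))
```

Drawn on its own, the second plot would be a deterministic curve rather than a data graph under the definition above; it earns its place only by giving the scattered points a reference.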

Second, every data graph begins by encoding data coordinates on a plane, so the point visual implantation (i.e. a coordinate dot point) is considered the embryonic step towards encoding any data as a data graph. The *Graph Workflow model* does not cover visualisation functions where data is encoded without any underlying coordinate relational meaning, such as semiotic representations of isotypes, or pictorial representations mingled with data in no particular coordinate structure, as is the case in infographic journalism. There is no restriction on the form of the coordinate system, and this website demonstrates examples of coordinate space in standard Cartesian form, polar planes, parallel coordinates and geographical coordinate planes.

Network visualisations, or what network mathematicians call a ‘graph’, exist in abstract coordinate space where nodes and links can be reorganised independently of physical position. Network graphs meet the definition of a data graph but require special treatment and extreme care in encoding. In this respect, network visualisation tools such as those described in Manuel Lima’s *Visual Complexity* certainly qualify as data graphs for visualising properties of complex relational data. However, several of those examples would fail the quality control proposed by the *Graph Workflow model*, because they place the quality of Design well above all other desired qualities of data graphs.

Cartographic maps with overlaid data are a special class of data graphs where longitude and latitude act as the coordinate system (of geographical location). Topographic maps that encode surface features such as terrain relief, vegetation, hydrography or man-made features are not considered data graphs unless data is overlaid and the topographic detail is used only to support the context of that data. For example, John Snow’s seminal topographic map of London’s 1854 Soho cholera outbreak meets the definition of a data graph because it accurately encodes the density of cholera cases by precise geographical coordinates.

Dashboards qualify as collections of interactive data graphs.

## Master the Graph Workflow

The beginning rule of the *Graph Workflow model* is the all-defining Graph Objective, informed by the Qualities of Data Graphics. Data Management then acts as the first step towards implementing the objective. Exploratory Data Analysis is instrumental in learning the properties of the data and transforming the data into a form that is suitable for visualisation. The encoding of every data graph begins with a choice from Visual Implantations, starting with the point implantation, followed by a choice of Retinal Variables for distinguishing different forms of data signals. Graph Identification is necessary for resolving uncertainty during the decoding stage. Graph Enhancement sharpens the focus on the graph objective. All encoding choices must abide by the Standards and Principles of visualisation theory and graphical design, as discussed throughout. Visual Decoding applies a user perspective by taking into consideration the audience’s capacity to decode, and the strengths and limitations of Visual Perception. The workflow is iterated until decoding is relevant, complete, unambiguous and efficient. The final output is applied for descriptive, exploratory or inferential analysis.
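One possible reading of these steps in Stata is sketched below. The mapping of commands to workflow stages is an assumption made for illustration, not the site’s own implementation, and the objective (how mileage relates to vehicle weight for domestic and foreign cars) is hypothetical. The sketch uses the `auto` example dataset shipped with Stata.

```stata
* Graph Objective (hypothetical): how does mpg vary with weight, by car origin?

* Data Management: load the example dataset shipped with Stata
sysuse auto, clear

* Exploratory Data Analysis: inspect the variables before graphing
summarize mpg weight
tabulate foreign

* Visual Implantation: points; Retinal Variables: marker shape and colour
* Graph Identification: axis titles and legend; Graph Enhancement: title
twoway (scatter mpg weight if foreign == 0, msymbol(O) mcolor(navy))     ///
       (scatter mpg weight if foreign == 1, msymbol(T) mcolor(maroon)),  ///
       xtitle("Weight (lbs.)") ytitle("Mileage (mpg)")                   ///
       legend(order(1 "Domestic" 2 "Foreign"))                           ///
       title("Mileage by weight and car origin")
```

In practice the graph would be revised and redrawn, for example by changing markers or labels, until it decodes cleanly for the intended audience.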

To master the *Graph Workflow* approach, follow the road map provided in the LEARN THE WORKFLOW MODEL menu, shown in the top right-hand corner of the www.graphworkflow.com homepage.

Back to About ⟵ ⟶ Continue to Information system