Data properties

The properties of the data inform the selection of appropriate encoding tools, and can be described through a taxonomy of data variation. Jacques Bertin (1967) proposes three broad classes of variation, which he calls levels of organisation.

Knowing the data properties directly determines the choice of retinal variables.

Qualitative-level variation

Qualitative-level variables have nominal categories that follow no particular order, e.g. ethnicity, gender, industrial classes. For example, we cannot say that, for gender, female is greater or bigger than male.

The visual encoding of qualitative-level variation must not give the illusion of any order and the categories should be displayed as equidistant, i.e. their decoding must bear the same perceived weight. The qualitative level describes data which must be perceived in terms of group association or group differentiation.

Binary classification is a special case of nominal variation that can only be perceived in terms of differentiation. For example, when visualising differences between ethnicities or between genders, the encoding tools chosen should not favour the decoding of one category over the other.
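As a minimal sketch of what the qualitative level permits, the Python snippet below uses a made-up list of industry labels (the data and variable names are illustrative assumptions, not from the text). Counting and testing for equality are meaningful; ranking the labels is not.

```python
from collections import Counter

# Hypothetical nominal observations; the labels carry no rank.
industries = ["retail", "mining", "retail", "finance", "mining", "retail"]

# Counting category membership is meaningful for nominal data.
counts = Counter(industries)
most_common_label, most_common_count = counts.most_common(1)[0]

# The order in which observations are listed is arbitrary:
# reversing it leaves the counts unchanged.
same_counts = Counter(reversed(industries)) == counts

print(most_common_label, most_common_count, same_counts)
```

The only summaries used here are frequencies and the most frequent category, which is consistent with reading the data in terms of group association or differentiation.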

Ordered level

Ordered level variables have discrete categories that follow a specific ranked order, from highest/largest to lowest/smallest, but have no fixed interval distance between the categories.

In the ordered level, the categories are ranked but we cannot say by what magnitude they differ. For example, we can say that an executive is at a higher level of management than a senior manager, a junior manager or a supervisor, but we cannot say that the difference in qualification or managerial competence is the same at each step up the hierarchy.

The same holds for survey Likert scales where, for example, we cannot say that agreeing is twice as large as disagreeing. Such ordinal variation must be encoded to convey this sense of order but must also make it clear that there is no constant scale in the order.
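The distinction between rank and magnitude can be sketched in Python. The five-point scale, the sample responses and the helper names below are hypothetical, chosen only for illustration. Rank comparisons and a rank-based median are meaningful, while ratios or averages of the rank positions are not.

```python
from statistics import median_low

# Hypothetical five-point Likert scale, listed from lowest to highest.
LIKERT = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
RANK = {label: position for position, label in enumerate(LIKERT)}

def is_higher(a, b):
    """Return True if response a ranks above response b."""
    return RANK[a] > RANK[b]

def median_response(responses):
    """Median by rank; median_low always returns an observed category."""
    return LIKERT[median_low(RANK[r] for r in responses)]

responses = ["agree", "neutral", "strongly agree", "disagree", "agree"]
print(is_higher("agree", "disagree"))   # rank comparison is valid
print(median_response(responses))       # rank-based summary is valid
# RANK["agree"] / RANK["disagree"] would be arithmetic on arbitrary
# positions: the scale has no constant interval, so it is not computed.
```

The integer positions serve only as a sorting key; any other increasing set of numbers would give the same comparisons and the same median.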

Quantitative interval-ratio level

Quantitative interval-ratio level variables have continuous variation that follows a specific order and has a fixed ratio underlying the difference between measurements, e.g. income, weight, distance.

Quantitative interval-ratio level data describes continuous variation that is typically described in terms of density and differences in density. For example, income data in dollar precision is ordered from highest to lowest and its values advance in continuous increments of one dollar. We can therefore say that $10,000 is ten times as large as $1,000, or half as large as $20,000.
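The income comparison can be checked with a short Python sketch. The variable names are illustrative assumptions; the figures are the ones used in the example above.

```python
# Income figures from the example, in whole dollars.
low, mid, high = 1_000, 10_000, 20_000

# Ratios are meaningful at the interval-ratio level.
ratio_mid_to_low = mid / low     # 10.0: ten times as large
ratio_mid_to_high = mid / high   # 0.5: half as large

# Differences are meaningful too, and share the unit of measurement.
difference = high - mid          # 10_000 dollars

# Changing the unit (dollars to cents) leaves the ratios unchanged.
ratio_in_cents = (mid * 100) / (low * 100)

print(ratio_mid_to_low, ratio_mid_to_high, difference, ratio_in_cents)
```

That the ratio survives a change of unit is what separates this level from the ordered level, where only the ranking is preserved.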


Demetris Christodoulou