Lecture 7 | Theory of Data Graphics II

Max Pellert

IS 616: Large Scale Data Analysis and Visualization

The first chart devotes too much of its ink to graphical apparatus, with elaborate grid lines and detailed labels

In the second, much of the non-data detail is eliminated

That leads to a cleaner design that focuses attention on the time series itself

A labeled, shaded bar in a bar chart unambiguously locates the altitude in six separate ways

  1. Height of the left line
  2. Height of shading
  3. Height of right line
  4. Position of top horizontal line
  5. Position (not content) of number at bar’s top
  6. The number itself

Any five of the six can be erased and the sixth will still indicate the height

Sometimes you don’t need symmetry

And you can save space by removing redundant halves

Redundant Data-Ink

Redundant data-ink can also serve a purpose in some cases

If there is a cyclical time dimension, for example, repeating part of the data shows a full cycle without a break

And, similarly, map plots can repeat regions to go “once around the world”

Redesigning

Some example ways of increasing the data-ink ratio:

Doing away with too many grid lines (or making them thinner)

Not plotting axes beyond the data range

Not plotting unnecessarily many axis ticks

Do you need axes at all?

Can you integrate the legend into the plot, should you need one?

Can you have multifunctioning graphical elements in your plot?
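To make the effect of such a redesign concrete, here is a minimal sketch of the data-ink ratio (data-ink divided by the total ink used for the graphic). The function name `data_ink_ratio` and all ink amounts are hypothetical illustration values in arbitrary units, not measurements from any chart in this lecture.

```python
def data_ink_ratio(data_ink: float, non_data_ink: float) -> float:
    """Share of a graphic's total ink that presents data."""
    total_ink = data_ink + non_data_ink
    if total_ink <= 0:
        raise ValueError("a graphic must use some ink")
    return data_ink / total_ink


# Hypothetical ink budgets (arbitrary units, e.g. dark pixels); assumed for illustration.
DATA_INK = 40.0
before = {"grid lines": 35.0, "axes and ticks": 15.0, "legend box": 10.0}
# After the redesign: thinner grid, range-limited axes, legend replaced by direct labels.
after = {"grid lines": 5.0, "axes and ticks": 5.0, "legend box": 0.0}

ratio_before = data_ink_ratio(DATA_INK, sum(before.values()))
ratio_after = data_ink_ratio(DATA_INK, sum(after.values()))
print(f"before: {ratio_before:.2f}, after: {ratio_after:.2f}")
```

The data-ink stays the same in both versions; only the non-data ink shrinks, which is what raises the ratio.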

Data Density

Simple charts often have a (very) low data density

Consider, for example, a bar chart of two classes with one value each (a 2 × 2 data matrix, 4 entries in total) that takes up considerable space when visualized

Maps, on the other hand, usually have a very high data density

Or other area plots, for example direct visualizations of matrices
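Data density can be computed as the number of entries in the data matrix divided by the area of the graphic. The sketch below compares a two-bar chart with a matrix visualization of the same printed size; the function name `data_density`, the figure dimensions, and the 50 × 50 matrix are assumptions chosen only for illustration.

```python
def data_density(n_rows: int, n_cols: int, width: float, height: float) -> float:
    """Entries in the data matrix per unit area of the graphic."""
    area = width * height
    if area <= 0:
        raise ValueError("the graphic must have a positive area")
    return (n_rows * n_cols) / area


# Hypothetical graphics, both printed at 10 cm x 8 cm.
bar_chart = data_density(2, 2, width=10, height=8)    # two classes, one value each
heatmap = data_density(50, 50, width=10, height=8)    # direct view of a 50 x 50 matrix

print(f"bar chart: {bar_chart:.2f} entries per cm^2")
print(f"heatmap:   {heatmap:.2f} entries per cm^2")
```

At the same size, the matrix plot carries several hundred times as many entries per square centimeter as the bar chart.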

Small Multiples (aka facet, panel, lattice, grid or trellis charts)

Well-designed small multiples are

  • Inevitably comparative

  • Deftly multivariate

  • Shrunken, high-density graphics

  • Usually based on a large data matrix

  • Drawn almost entirely with data-ink

  • Efficient in interpretation

  • Often narrative in content, showing shifts in the relationship between variables as the index variable changes (thereby revealing interaction or multiplicative effects)
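The data preparation behind small multiples can be sketched as follows: split the data matrix into one panel per value of the index variable, and compute axis limits shared by all panels so that they stay directly comparable. The helper `small_multiples` and the records (years, doses, responses) are made up for illustration.

```python
from collections import defaultdict


def small_multiples(records, index_key, x_key, y_key):
    """Group records into one panel per index value and compute shared axis limits."""
    if not records:
        raise ValueError("need at least one record")
    panels = defaultdict(list)
    for rec in records:
        panels[rec[index_key]].append((rec[x_key], rec[y_key]))
    xs = [rec[x_key] for rec in records]
    ys = [rec[y_key] for rec in records]
    # The same limits are used in every panel so that positions can be compared.
    shared_limits = {"x": (min(xs), max(xs)), "y": (min(ys), max(ys))}
    return dict(sorted(panels.items())), shared_limits


# Hypothetical data: the index variable is the year.
records = [
    {"year": 2020, "dose": 1, "response": 2.0},
    {"year": 2020, "dose": 2, "response": 3.0},
    {"year": 2021, "dose": 1, "response": 2.5},
    {"year": 2021, "dose": 2, "response": 5.0},
]

panels, limits = small_multiples(records, "year", "dose", "response")
for year, points in panels.items():
    print(year, points, limits)
```

In a plotting library, each panel would then be drawn on its own small axes with these common limits; in matplotlib, for instance, `plt.subplots(..., sharex=True, sharey=True)` enforces shared scales across panels.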