Lecture 6 | Theory of Data Graphics I

Max Pellert

IS 616: Large Scale Data Analysis and Visualization

Theory of Data Graphics

Graphical displays should:

  • Show the data

  • Induce the viewer to think about the substance rather than about methodology, graphic design, the technology of graphic production, or something else

  • Avoid distorting what the data have to say

  • Present many numbers in a small space

  • Make large data sets coherent

  • Encourage the eye to compare different pieces of data

  • Reveal the data at several levels of detail, from a broad overview to the fine structure

  • Serve a reasonably clear purpose: description, exploration, tabulation, or decoration

  • Be closely integrated with the statistical and verbal descriptions of a data set

Principles of graphical excellence

The principles

Avoid distorting what the data have to say: The Lie Factor

Distortions of data

“A graphic does not distort if the visual representation of the data is consistent with the numerical representation.”

“At any rate, given the perceptual difficulties, the best we can hope for is some uniformity in graphics (if not in the perceivers) and some assurance that perceivers have a fair chance of getting the numbers right.”

Using areas to depict one-dimensional data, with a Lie Factor of 2.8

Here an increase of 454% is depicted as an increase of 4,280%, for a Lie Factor of 9.4
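Tufte defines the Lie Factor as the size of the effect shown in the graphic divided by the size of the effect in the data; a value near 1 indicates a faithful graphic. A minimal sketch of that arithmetic, using the two percentages quoted above (the function names `relative_change` and `lie_factor` are made up for illustration):

```python
def relative_change(first: float, second: float) -> float:
    """Relative change from first to second (0.5 means a 50% increase)."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return (second - first) / abs(first)


def lie_factor(shown_effect: float, data_effect: float) -> float:
    """Size of effect shown in the graphic / size of effect in the data."""
    if data_effect == 0:
        raise ValueError("data effect must be non-zero")
    return shown_effect / data_effect


# Both effects are given in percent on the slide, so the units cancel.
slide_example = lie_factor(shown_effect=4280, data_effect=454)
print(f"Lie Factor: {slide_example:.1f}")  # prints "Lie Factor: 9.4"
```

When only the raw measurements are known (for example two bar lengths and the two data values they stand for), `relative_change` can be applied to each pair first and the results passed to `lie_factor`.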

Show the data & induce the viewer to think about the substance: avoid Chartjunk

Much of the “winter” in data graphics from the early 20th century until roughly 1970 is due to the strong assumptions then…

This led to some bizarre ornaments and other embellishments that designers added to scientific visualizations

Luckily, this is no longer the most pressing issue: the now-predominant computerized production of graphics has largely pushed the separate profession of such “chart designers” out of the way. Still, many other unnecessary elements can count as chartjunk today

Chartjunk

A minimalistic preset theme (for example theme_bw in ggplot2) is often a quick fix that already gets rid of some chartjunk

But it usually takes some tinkering, both programmatically and with external tools (such as Inkscape), to remove everything that is unimportant

However, chartjunk has also attracted some academic interest, particularly regarding its effect on memorability and engagement.

Bateman, S., Mandryk, R. L., Gutwin, C., Genest, A., McDine, D., & Brooks, C. (2010). Useful junk?: The effects of visual embellishment on comprehension and memorability of charts. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2573–2582. https://doi.org/10.1145/1753326.1753716

Borkin, M. A., Vo, A. A., Bylinskii, Z., Isola, P., Sunkavalli, S., Oliva, A., & Pfister, H. (2013). What Makes a Visualization Memorable? IEEE Transactions on Visualization and Computer Graphics, 19(12), 2306–2315. https://doi.org/10.1109/TVCG.2013.234

Context is essential for graphical integrity

https://twitter.com/reina_sabah/status/1291509085855260672

Missing context leads to a lack of graphical integrity

Connected to cherry-picking the data

Be very careful with truncating axes!

Even the best graphic cannot make up for selective reporting
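The warning about truncated axes can be made concrete with the Lie Factor (effect shown in the graphic divided by effect in the data). The sketch below assumes a bar chart whose bar heights are drawn from the axis start rather than from zero; the function name `truncated_bar_lie_factor` and the values 50, 55 and 45 are hypothetical, chosen only for illustration.

```python
def truncated_bar_lie_factor(first: float, second: float,
                             axis_start: float = 0.0) -> float:
    """Lie Factor of a two-bar comparison when the axis starts at axis_start.

    The drawn height of a bar is its value minus the axis start, so a
    truncated axis inflates the visible relative difference.
    """
    if first == 0 or first <= axis_start:
        raise ValueError("first must be non-zero and above the axis start")
    if first == second:
        raise ValueError("the two values must differ")
    data_effect = (second - first) / first
    shown_effect = (second - first) / (first - axis_start)
    return shown_effect / data_effect


# Hypothetical values: the data rise by 10% (from 50 to 55).
honest = truncated_bar_lie_factor(50, 55, axis_start=0)
truncated = truncated_bar_lie_factor(50, 55, axis_start=45)
print(f"axis from 0:  Lie Factor {honest:.1f}")
print(f"axis from 45: Lie Factor {truncated:.1f}")
```

With the axis starting at 45, the second bar is drawn twice as tall as the first although the data differ by only 10%, which is why the Lie Factor jumps from 1 to 10 in this made-up case.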

Tufte's minimalist perspective advocates plain and simple charts that maximize the proportion of data-ink (the ink in the chart used to represent data): the data-ink ratio
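Tufte's data-ink ratio is the data-ink divided by the total ink used to print the graphic. A rough sketch of the calculation follows; the per-element "ink" amounts (for instance pixel counts) and the choice of which elements count as data are invented for illustration, and `data_ink_ratio` is a hypothetical helper.

```python
def data_ink_ratio(ink_by_element: dict[str, float],
                   data_elements: set[str]) -> float:
    """Share of the total ink that is spent on elements representing data."""
    total_ink = sum(ink_by_element.values())
    if total_ink <= 0:
        raise ValueError("total ink must be positive")
    data_ink = sum(ink for name, ink in ink_by_element.items()
                   if name in data_elements)
    return data_ink / total_ink


# Hypothetical ink amounts for one chart (arbitrary units).
chart = {"data points": 300, "axis ticks": 60,
         "gridlines": 240, "background fill": 600}
before = data_ink_ratio(chart, {"data points"})

# Erase non-data ink that carries no information for the reader.
cleaned = {name: ink for name, ink in chart.items()
           if name not in {"gridlines", "background fill"}}
after = data_ink_ratio(cleaned, {"data points"})

print(f"before: {before:.2f}, after: {after:.2f}")
```

In this made-up example, dropping the gridlines and the background fill raises the ratio without touching the data points themselves.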

Acknowledgements

https://yy.github.io/dviz-course/

https://yyahn.com/dviz-course/m05-design/class/