Week 3 Reading Post — The Truthful Art: Chapter 4 and Chapter 5
Chapter 4: Of Conjectures and Uncertainty
Cairo begins this chapter by defining science as a “set of methods, a body of knowledge, and the means to communicate it,” and then goes over the scientific discovery algorithm:
1. You form a plausible conjecture about some curious phenomenon; the conjecture is designed to describe, explain, or predict this phenomenon’s behavior, but it is just a hunch.
2. You create a formal and testable hypothesis from your conjecture.
3. You study and measure the phenomenon (under controlled conditions, if possible) and these measurements become data to use in testing your hypothesis.
4. You draw conclusions based on the evidence you’ve obtained; these data and tests may result in a rejection of your hypothesis, or a tentative acceptance.
5. After repeated tests and evidence, and after multiple interrelated hypotheses that describe/explain/predict the phenomenon have been reviewed by the scientific community, you may be able to assign a theory to this phenomenon.
Cairo then goes through an example of time on Twitter in relation to productivity (a made-up example, but I have actually read The Shallows by Nicholas Carr as well, and I find changes in brain wiring and decreases in attention and productivity due to extensive Internet and technology use quite believable and likely). Through this example Cairo explains that conjectures must make sense, be testable, and be plausible, even if they turn out to be wrong. Conjectures must also be falsifiable; if they can’t be refuted, then they can’t be good conjectures. In addition, a conjecture needs several components that depend on one another for it to make sense. Cairo then describes the different kinds of variables (independent and dependent) as well as the different classifications of variables: nominal, ordinal, interval, and ratio. There are a few different types of studies one can run to test a hypothesis, as well as different types of sampling. Sampling is extremely important because without a random sample, you cannot be sure that the study’s inferences apply to the entire population. Additionally, the data in scientific studies quite often contain randomness, called sample variation, which can create uncertainty about trends or about the hypothesis itself.
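To make the idea of sample variation more concrete for myself, here is a minimal Python sketch. It is not from the book: the population (daily minutes on Twitter for 10,000 people), the function name sample_means, and every number in it are made up for illustration. It draws many random samples from the same population and shows that their means differ from one another, even though nothing about the population has changed.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples simple random samples (without replacement)
    and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population: daily minutes on Twitter for 10,000 people.
# The distribution (mean 60, std dev 25, floored at 0) is an assumption.
pop_rng = random.Random(42)
population = [max(0.0, pop_rng.gauss(60, 25)) for _ in range(10_000)]

# 200 different random samples of 50 people each.
means = sample_means(population, sample_size=50, n_samples=200)

print(f"population mean:       {statistics.mean(population):.1f}")
print(f"sample means range:    {min(means):.1f} to {max(means):.1f}")
print(f"spread of sample means: {statistics.stdev(means):.1f}")
```

Rerunning this with a larger sample_size should shrink the spread of the sample means, which is one reason sample size matters alongside random selection.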
I really enjoyed reading this chapter. I come from a more scientific and less journalistic background, and I’m currently in school for a master’s degree in environmental science communication and media. I found this discussion of the scientific process and data quite engaging, and I look forward to incorporating these methods into my future studies and associated data visualizations. However, I am really interested to read and hear more about the sample variation and uncertainty that appear so frequently in scientific studies and tests. Science by nature is open to, and invites, criticism and uncertainty. That’s how scientific hypotheses get tested over and over again until they eventually become theories. This is especially relevant to my work because climate change science and predictions contain a great deal of uncertainty, which gives climate change skeptics a platform for twisting the data and its uncertainty into narratives that support their views. Even the word ‘prediction’ itself implies uncertainty. This is hugely problematic; it means that a display of data that predicts escalating climate change sometimes still does not reach the audiences that it needs to reach. This is something that I hope to tackle in my studies and career, so I am looking forward to reading Chapter 11.
Chapter 5: Basic Principles of Visualization
In this chapter, Cairo details a few ways to choose graphic forms depending on the data and the relationships that you want to show. While no single answer or system can define which graphic is best, choosing a good visualization is important because envisioning evidence is often a way to help us understand something better. To choose the best method for visually encoding data, you need to consider the properties of the data, the relationships you want to highlight, and the visual properties of your chosen graphic. Cairo lists four suggestions for choosing the best graphic form:
1. Think about the tasks you want to enable or the messages and relationships you want to convey.
2. Try different graphic forms and test things out.
3. Arrange the components of the graphic in order to make it easy to extract meaning.
4. Test the outcomes and see how people react to and understand your graphic.
Cairo mentions a few different resources that display categories of graphics visually: the Data Visualization Catalogue, the Essentials Website, and Cleveland and McGill’s hierarchy of elementary perceptual tasks. I found the latter resource quite helpful and will use it in my future exercises and infographics. Cleveland and McGill’s scale, however, may not be useful when dealing with maps and global data patterns. Furthermore, multiple graphic forms may be necessary when you are trying to show both a general, large-scale pattern and the smaller details within that pattern.*
Other things to pay attention to when choosing a graphic and deciding how to represent data:
· plotting data directly
· plotting differences when that relationship matters more than the individual variables
· when to use indexed numbers
· using logical and meaningful baselines
· what to reveal in the data and how to do so
· plotting data on several charts with dissimilar scales*
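Two of the items in the list above, indexed numbers and plotting differences, are really just small calculations done before charting. Here is a minimal Python sketch of how I understand them. It is not from the book: the function names index_series and differences and the exports/imports figures are made up for illustration. Indexing rescales each series so that its base period equals 100, which makes series of very different sizes comparable; the difference series holds the gap between two variables at each point.

```python
def index_series(values, base_position=0):
    """Rescale a series so the value at base_position becomes 100."""
    base = values[base_position]
    if base == 0:
        raise ValueError("base value must be nonzero")
    return [100 * v / base for v in values]


def differences(a, b):
    """Return the element-wise gap a - b between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return [x - y for x, y in zip(a, b)]


# Made-up yearly figures, purely for illustration.
exports = [200, 220, 260, 250]
imports = [50, 60, 75, 80]

print(index_series(exports))   # growth relative to the first year
print(index_series(imports))
print(differences(exports, imports))  # the gap itself, year by year
```

In this made-up case the indexed values show that imports grew faster than exports relative to where each started, something that is hard to see when the raw numbers sit on the same axis.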
*While this chapter was extremely helpful, what I found most interesting was the note about creating multiple charts with the same data showing different scales and patterns. I think this is important to remember when creating an infographic and attempting to show relationships and patterns in any dataset. Sometimes I find certain graphics slightly confusing, or they take a little too long to figure out and ‘read’ effectively. The examples I’ve included below are Figure 5.7 from the textbook and Figure A from the New York Times opinion article How Every Member Got to Congress. I find both of these graphics very informative, and the data patterns they show are fascinating. But overall I found both of them a little confusing, and it took me a few minutes to figure out the large-scale patterns as well as the detailed relationships in the data. I am not a data visualization expert, but I wonder if there are better ways to represent these data sets, perhaps with more graphic forms to explain the different scales of data patterns.