There has been some recent media attention on a particular issue: Americans shooting guns. The Guardian warns that US toddlers kill more people than terrorists do. The Washington Post found that toddlers are injuring or killing others at least once a week. I listened to a Planet Money podcast about smart guns and why they are not the gold standard for guns and gun safety. And late last year, Obama encouraged the media to compare terrorist deaths with gun violence. This resulted in an outpouring of the same set of statistics from many different media outlets [CNN, Forbes, Washington Post, Vox]. The content of these articles is very relevant and important. I don’t want to take away from the real pain and disappointing reality of these statistics. In fact, as a data communicator, I want the visuals on stories like this one to tell the story as well as possible. We know that the general populace is more convinced of a fact when it is accompanied by a graphic. So graphics can be very helpful when one is trying to tell a story that involves data. But the power of a visualization is directly proportional to its ability to communicate information. I want to take a look at the visualizations used in these reports and talk about why some of them are much better than others.
The study of data visualization is relatively recent. The most popular visualization advocate is Edward Tufte. His books on information design and visual literacy are referenced by many of us as we try to make sure our visuals convey the right information. The folks over at Data Stories have an entire podcast dedicated to telling stories with data visualizations. All this research is necessary because some visuals are very pretty but not very readable. I believe there are three things a graphic must do:
- Convey numerical data in an easy-to-read format
- Convey the high-level take-away & emotional content at first glance
- Have up-to-date and accurate information
You can see the cute toddler iconography, so they get points for style. However, the graphic isn’t actually helpful. There is no easy way to see that there are 10 toddlers per line. If I want to know how many brown toddlers there are (13), I have to count each and every one. The pink toddlers are even worse because they are spread over three different lines. In fact, when I first looked at this figure, I just read the information from the top paragraph. There are 13 brown, 18 pink, 10 yellow and 2 olive toddlers. And because most of the categories are roughly the same size, I don’t learn anything visually from the image. The graphic fails the first two goals of a visualization (I’m going to assume the data is accurate for now). Does that mean that this type of graphic is always going to fail?
Using colored icons to represent individuals isn’t always a poor choice. Consider: if I wanted to convey that there were 30 of one color and 1 of another, the visual image would be compelling. I can read a description of 30 to 1 and not get the full impact of that relationship. But I get emotional content from this type of visualization. I particularly like this type of colored iconography for climate change data. I can write down that 97 out of 100 scientists believe in climate change. But the reader will gather new emotional information from the image that the numbers don’t give. Even if you don’t have a lot of quantitative literacy, you can look at the visual and say, “wow, that’s a lot!”
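The icon approach described above can be sketched in a few lines of Python. This is only an illustration, not how any published graphic was built: the function name `icon_array`, the `#` and `.` symbols, and the choice of 10 icons per row are all assumptions made for the example. Only the 97-out-of-100 figure comes from the text.

```python
def icon_array(highlighted, total, per_row=10, on="#", off="."):
    """Return a text grid with `highlighted` marked icons out of `total`.

    Hypothetical helper for illustration; symbols and row width are arbitrary.
    """
    if not 0 <= highlighted <= total:
        raise ValueError("highlighted must be between 0 and total")
    # Group all highlighted icons first so the two categories stay contiguous.
    icons = on * highlighted + off * (total - highlighted)
    # Break the flat string into fixed-width rows so whole rows can be
    # counted at a glance instead of icon by icon.
    rows = [icons[i:i + per_row] for i in range(0, total, per_row)]
    return "\n".join(rows)


# 97 of 100 scientists, as in the example from the text.
grid = icon_array(97, 100)
print(grid)
```

Keeping each category contiguous and the rows a fixed width is what lets a reader take in "nearly everything is filled" without counting.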
Another way to tell the same type of story (that two numbers are very different in size) is to use a bar chart. This is fundamentally a good idea because humans are good at distinguishing lengths. It is usually a good way to convey both quantitative and emotional content. However, we have to be careful not to muddle the quantitative information. This next graph is trying to convey information about terrorist killings versus gun-related deaths. And the author does this weird thing where several years of data are combined in one bar. This would be okay if the number of years were equal for each bar. However, that is not the case.
In a verbal discussion the comparison presented above would be acceptable. We might say, “More people died from gun violence in the last year than died from terrorism over the last 45 years combined!” But, as a graph, one has to read the fine print to understand the relationship. And it’s a hard relationship to understand. The author is changing two variables at the same time: the span of years and the type of negative event. Numerically I’m confused and emotionally I’m uncertain. What happened to terrorism over the last year? Are you purposely excluding 2015-present for terrorism because it got worse? Why wouldn’t you consider gun violence over the same time frame? Surely it would be more compelling to show all the gun-violence deaths from 1970-2014, because that number is surely greater than the 9,948 people shown in this graphic. All in all, this graphic is my least favorite. It feels like the reporter gave up before they could actually find comparable data, thus failing the third objective of a visualization. The weird part is that the rest of the article is quite nuanced. I think it would have been much better to have an oversized quote rather than a graphic.
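One way to make bars with different time spans comparable is to put both on a per-year basis before plotting. The sketch below shows that arithmetic. The helper names are hypothetical, and the multi-year total is a placeholder chosen only to make the division easy to follow; it is not a real statistic. The 1970-2014 window is the one mentioned above.

```python
def years_in_window(start_year, end_year):
    """Number of calendar years in an inclusive range, e.g. 1970-2014."""
    if end_year < start_year:
        raise ValueError("end_year must not be before start_year")
    return end_year - start_year + 1


def deaths_per_year(total_deaths, start_year, end_year):
    """Average deaths per year over an inclusive range of years."""
    return total_deaths / years_in_window(start_year, end_year)


# The window discussed in the text covers 45 calendar years.
window = years_in_window(1970, 2014)

# PLACEHOLDER, not a real statistic: stands in for whatever multi-year
# total a chart reports for that window.
placeholder_total = 4_500
rate = deaths_per_year(placeholder_total, 1970, 2014)

print(window, rate)
```

Once every bar is an average over its own window, only one variable (the type of event) changes from bar to bar.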
Sometimes we data visualization folks get too clever for our own good. This happens pretty much every time anyone uses volume, rather than length, as a measure of quantity. Humans are very bad at guesstimating volume. E.g.: Does the taller, thinner bottle hold more water than the short, fat one? This next graphic uses triangles to show the size difference in gun violence versus terrorism. Take a look.
From an emotional perspective the visualization is good. The little triangles are much smaller than the big triangles. The triangle is attempting to be approachable, but what is the triangle representing? Is the count of events equivalent to the height of the triangle? The volume? Thus when I start considering numerical content, I still have to read every number on the graphic. From that perspective, it’s no better than a table of information. Worse yet, when I start looking at the numbers (because they practically forced me to), I notice that the data only goes until 2011. This was published in 2015. What happened to the last three to four years of data? This graphic does a good job with emotional content, but the numerical content is hard to read and the data is not up to date.
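The height-or-size ambiguity matters numerically, not just aesthetically. For a flat shape like a triangle the "size" in question is area, and if a shape is scaled up without changing its proportions, the area grows with the square of the height. The sketch below works through that arithmetic. The function names are hypothetical and the 100-to-1 ratio is a made-up example, not a number from the graphic.

```python
import math


def area_ratio_if_height_encodes(value_ratio):
    """Apparent area ratio when height is drawn proportional to the value.

    Assumes the shape keeps its proportions, so area scales with height squared.
    """
    return value_ratio ** 2


def height_ratio_if_area_encodes(value_ratio):
    """Apparent height ratio when area is drawn proportional to the value."""
    return math.sqrt(value_ratio)


# Hypothetical example: one quantity is 100 times the other.
ratio = 100
print(area_ratio_if_height_encodes(ratio))  # 10000
print(height_ratio_if_area_encodes(ratio))  # 10.0
```

A reader who judges by area sees a 10,000-to-1 gap in the first case, while a reader who judges by height sees only 10-to-1 in the second. Neither matches the true ratio unless the reader guesses the encoding correctly.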
To finish off this article, I wanted to show you my favorite visualization for this type of content. It compares terrorism and gun violence against the same y-axis.
In the graphic above, the designer uses a predictable y-axis, starting from 0, to give a clear sense of the difference in scale. Further, the designer uses regular intervals along the x-axis to show how these things have changed over time. A potential negative aspect of this graph is that one still needs to read the blue numbers to see that the numbers do change year by year. However, this doesn’t bother me too much, as the important take-away from this graphic is the relative size of the two metrics under consideration. They are very different!
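To see why starting the y-axis at 0 matters, compare the ratio of drawn bar heights with and without a truncated baseline. This is a minimal sketch: `drawn_height_ratio` is a hypothetical helper, and the values 100, 110 and 90 are invented for illustration, not taken from any of the graphics.

```python
def drawn_height_ratio(small, large, baseline=0):
    """Ratio of the drawn bar heights when the y-axis starts at `baseline`."""
    if baseline >= small:
        raise ValueError("baseline must be below the smaller value")
    return (large - baseline) / (small - baseline)


# Hypothetical values: the larger quantity is only 10% bigger.
honest = drawn_height_ratio(100, 110)          # axis starts at 0
truncated = drawn_height_ratio(100, 110, 90)   # axis starts at 90

print(honest, truncated)
```

With the axis at 0 the bars differ by the true 10%. With the axis starting at 90, the second bar is drawn twice as tall as the first, so the picture overstates the difference.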
While you may think this graphic isn’t as flashy as the others, I think it’s doing a great job at conveying the appropriate information. When I look at this graph I can easily determine that the damage done by terrorists targeting Americans is consistently and dramatically lower than the damage done by gun violence among Americans. I can easily interpret the numerical data and the emotional data. My only wish is for the graphic to include 2014 data, at least. This graph teaches me something without me needing to read every number. It presents the numerical information quickly, which allows the author to better communicate their point. This graphic is better than a verbal description of the same information.
What other causes are not getting appropriate attention because we can’t convey the data in a compelling way? Is there a way to make this graphic more visually engaging without losing its benefits?