Data visualization in a time of pandemic - #3: Mapping the virus

This multi-chapter post is a work in progress. The first five chapters are currently finished; the final chapter will be added as soon as it is available.




Chapter 3: Mapping the virus

A pandemic has a strong geographical factor attached to it, so obviously we are drawn to using maps to visualize how the virus is spreading. Both data visualizers and their audience simply love maps, and I personally do too. As a child, my (geographical, historical, biological, even biblical) atlases were my favourite books and I could browse through them for days. However, pretty as they may be, maps have their own pitfalls and caveats. So be prepared!


Beauty in times of despair

Let’s start with some of the best-designed maps I have encountered during my research for this chapter. The absolute winners, in my opinion, are these clean but very effective maps by the Washington Post:


Map showing the global spread of the coronavirus on March 27, 2020 (Washington Post).

To further clarify things, these maps are complemented by a simple table detailing the exact number of confirmed infections or deaths. This gives the reader the choice to look at the broader picture, dive into the detailed numbers, or both.


Table showing the global spread of the coronavirus on March 27, 2020 (Washington Post).

It should be noted that the BBC uses very similar, equally beautiful maps. These are examples of proportional symbol maps, or what most normal people simply call bubble maps. But why exactly do these bubble maps work so well?

Mapping issues

One of the most common issues encountered when creating data maps is the impact of population and population density. If we simply color a map according to the presence of a certain parameter, we can easily mask the fact that we are actually looking at a map of a different underlying parameter, such as population density. This may sound a bit abstract, but the excellent xkcd made an (as always) amazing cartoon (chartoon?) about this, which explains it much better than I can:
xkcd cartoon 1138
Now while this may sound funny, it is something that unfortunately happens quite regularly in real life. Election maps in particular are very vulnerable to this kind of problem. For example, this is a (rather famous) map showing the results (per county) of the US Presidential Election in 2016:
While this map has been used several times to claim a ‘landslide’ victory for the Republican candidate, it is actually rather useless, as it completely ignores the fact that the overwhelming majority of Americans live in the cities near the Upper East Coast (New York, Washington, Boston,…), near the Great Lakes (Chicago, Detroit), on the West Coast (Los Angeles, San Francisco, Seattle), or in Florida. In short: land area does not equal population. While a map like this one is not strictly lying, it is (intentionally or not) hiding the fact that the split between Republican and Democratic votes was nearly 50–50.
Many of the maps published during this coronavirus crisis suffer from a similar problem. Take for example this map by ABC News showing the countries where COVID-19 cases have been confirmed:

Countries and territories with confirmed cases on March 10, 2020 (ABC News).

A step up from this is the choropleth map. In such a map, regions are again colored, but the lightness or darkness of the color depends on the underlying parameter, for example the number of infections in a country. In its most basic form it looks like this example by CNBC:


Choropleth map of reported coronavirus cases worldwide as of March 17, 2020 (CNBC).

Choropleth maps can be particularly helpful in comparing different counties or regions within a country, such as this map by CNN:

Choropleth map of the spread of the coronavirus in China as of January 26, 2020 (CNN).

However, choropleth maps have their own unfortunate downsides and pitfalls. I will not go into much detail here, as everything was already written down excellently by ‘cartonerd’ Kenneth Field. Let me just summarize:

  • choose your colors or color scheme responsibly,
  • choose your categories responsibly, and
  • use relative numbers to avoid population density distortion.
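To make that last point a bit more concrete, here is a minimal sketch of what "relative numbers" can mean in practice: cases per 100,000 inhabitants. The regions and figures below are invented purely for illustration (they are not real COVID-19 data), and `cases_per_100k` is just a helper name I made up.

```python
# Hypothetical regions with invented numbers (NOT real COVID-19 data).
regions = {
    "Region A": {"cases": 5000, "population": 10_000_000},
    "Region B": {"cases": 1200, "population": 600_000},
    "Region C": {"cases": 300, "population": 3_000_000},
}


def cases_per_100k(cases, population):
    """Return the number of cases per 100,000 inhabitants."""
    return cases * 100_000 / population


# Relative numbers: one rate per region.
rates = {
    name: cases_per_100k(r["cases"], r["population"])
    for name, r in regions.items()
}

# Rank the regions twice: by absolute count and by rate.
by_absolute = sorted(regions, key=lambda n: regions[n]["cases"], reverse=True)
by_relative = sorted(rates, key=rates.get, reverse=True)

print(by_absolute)  # ranking by raw case count
print(by_relative)  # ranking by cases per 100,000 inhabitants
```

In this made-up example the largest region tops the absolute ranking, while the small region comes out on top once population is taken into account. Which of the two you color your choropleth by changes the story the map tells.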

Or, just maybe, a bar chart might be a better choice:

coronavirus china bar chart

🎵 The map isn’t the best way to show your data, so the bar chart is where I go. (Source: Kenneth Field)

Another limitation of choropleth maps is that small nations or regions are nearly impossible to see. Again, area size is messing up our ability to interpret the map correctly. For example, try to find the number of cases in Singapore, Luxembourg or Barbados on the maps of Our World in Data:

Choropleth map of the total confirmed COVID-19 cases as of March 26, 2020 (Our World in Data)

Bubble maps, such as the ones by the Washington Post shown above, avoid this trap because each nation gets its own bubble, independent of area, population, or population density. This is what makes this kind of chart so successful at mapping a wide range of values across a wide range of countries around the globe.
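A side note on sizing the bubbles: I do not know how the Washington Post sizes theirs, but a common convention for proportional symbol maps is to make the area of the circle proportional to the value, which means the radius scales with the square root. A minimal sketch, where `bubble_radius` and the `max_radius` of 40 pixels are my own assumptions:

```python
import math


def bubble_radius(value, max_value, max_radius=40.0):
    """Return a radius such that the circle AREA is proportional to value.

    max_radius is the radius given to the largest value (an arbitrary choice).
    """
    if value <= 0 or max_value <= 0:
        return 0.0
    return max_radius * math.sqrt(value / max_value)


# Hypothetical values: the second one is a quarter of the first.
r_big = bubble_radius(10_000, 10_000)
r_small = bubble_radius(2_500, 10_000)

# A quarter of the value gives half the radius, so a quarter of the area.
area_ratio = (r_small / r_big) ** 2
print(r_big, r_small, area_ratio)
```

Scaling the radius linearly instead would make large values look far bigger than they are, because our eyes compare areas rather than radii.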

There is only one minor downside: bubbles can start overlapping each other when two neighbouring regions have very large values (or when one of them has a large value while the other has only a small one). Then your bubble chart might start looking like this:

Not particularly helpful, I’m afraid… There is no perfect solution to avoid this, but play around with the opacity of your bubbles, be clever with bubble outlines, and the result might be both beautiful and effective:

Bubble chart showing confirmed coronavirus cases throughout Europe as of March 27, 2020 (Washington Post)
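For those who want to experiment with this themselves, here is a rough sketch of the two tricks mentioned above (semi-transparent fills and a contrasting outline), written as plain SVG generated from Python. I added a third trick that is my own assumption rather than something taken from the maps shown here: drawing the largest bubbles first, so that small ones end up on top instead of hidden underneath. The function name, colors and coordinates are all made up.

```python
def svg_bubbles(bubbles, opacity=0.5, stroke="white"):
    """Return SVG <circle> elements for a list of (x, y, radius) tuples.

    The largest bubbles are emitted first, so smaller ones are drawn on top.
    """
    ordered = sorted(bubbles, key=lambda b: b[2], reverse=True)
    return [
        f'<circle cx="{x}" cy="{y}" r="{r}" fill="crimson" '
        f'fill-opacity="{opacity}" stroke="{stroke}" stroke-width="1"/>'
        for x, y, r in ordered
    ]


# Three hypothetical, overlapping bubbles (pixel coordinates and radii).
circles = svg_bubbles([(100, 100, 10), (110, 105, 40), (90, 95, 25)])

svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200">'
    + "".join(circles)
    + "</svg>"
)
print(svg)
```

Save the printed output as an `.svg` file and open it in a browser, then play with `opacity` and `stroke` until the overlapping bubbles are still readable.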

The return of the table

I already hinted earlier that in some cases, a simple bar chart might be a better option than a complicated map. As Leonardo da Vinci said: “Simplicity is the ultimate sophistication” (except that he never said that). Another simple but effective alternative might just be… a table.

Many great examples can be found, including the Washington Post ones at the beginning of this chapter, but I was particularly charmed by the Datawrapper ones by Lisa Charlotte Rost, with a clever use of color to bring a touch of optimism to this heavy subject matter:


Datawrapper tables by Lisa Charlotte Rost (screenshot: March 18, 2020).

Okay okay, not everything is perfect when using tables. For example, they can also be used to misinform, or at least to distort information or present data in a way that suits you best. For a while, the following table was popular on social media in the Netherlands, showing how the country was following the exact same pace as Italy, with only a few weeks of delay. Panic!

Table comparing the number of infections and deaths between the Netherlands and Italy.

However, RTL Nieuws cleverly showed (in Dutch) that the situation looks completely different when you choose different dates to start the comparison, such as the date of the first death:
Table coronavirus netherlands

Same data (well, more or less), different story.
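To show how much the choice of starting point matters, here is a small sketch that re-aligns two series so that both start on the day of the first death instead of on a calendar date. This is only my illustration of the idea, not necessarily how RTL Nieuws did it, and the numbers are invented, not the real Dutch or Italian figures.

```python
def align_from_threshold(series, threshold=1):
    """Drop the leading days before the value first reaches `threshold`.

    The result is indexed by 'days since threshold' instead of calendar day.
    """
    for i, value in enumerate(series):
        if value >= threshold:
            return series[i:]
    return []


# Hypothetical cumulative death counts per calendar day (invented numbers).
country_a = [0, 0, 1, 2, 5, 9, 15]
country_b = [0, 0, 0, 0, 0, 1, 3]

a = align_from_threshold(country_a)
b = align_from_threshold(country_b)

# Compare only the days that both countries have already reached.
comparison = list(zip(a, b))
print(comparison)
```

Change `threshold` to, say, the tenth death, or align on the first confirmed infection instead, and the two countries can suddenly look much more (or much less) alike.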

Also, differences in age distribution among the population have an impact on the death rate, so it’s rarely a good idea to blindly start comparing different columns or rows with each other, without thinking things through. Remember: if creating panic is your goal, you will always find some data somewhere presented in such a way that you can do so.

There are many, mány more amazing things you can do with tables, also in coronavirus times, but that will be something for another chapter!

Nightmare maps

To end this chapter on a lighter note, let’s just have a look at some garbage maps from around the web. Starting with the Daily Mail committing some serious data visualization crimes, showcasing what Andy Kirk aptly calls ‘the Staircase from Hell’:
staircase from hell
Metro is in the same boat, but at least they blow up the map of Europe to ensure that poor San Marino isn’t left out:
another horrible coronavirus map

The BBC, by the way, has written an interesting story on an old map of air travel routes that went viral (pun not intended) and caused panic because of poor journalism, such as this badly chosen tweet by The Sun:

the sun misleading tweet

Journalism: you’re doing it wrong.

Finally, if you would ever think about using a pie chart as an alternative to a map… just don’t:

horrible coronavirus pie chart

I like pies — Pecan pie! Frangipane pie! Key lime pie! — but not this kind, thanks. (Source: European Scientist)


This is a multi-chapter blog post!


There is one upcoming chapter for this blog post:

  • Coronavirus storytelling and scrollytelling

For all your comments, suggestions, errors, links and additional information, you can contact me at koen@baryon.be or via Twitter at @koen_vde.


Disclaimer: I am not a medical doctor or a virologist. I am a physicist running my own business (Baryon) focused on information design.
