Data visualization in a time of pandemic - #1: Finding reliable data

Oh no! Not another coronavirus post! Yes, I know, we are bombarded by pandemic content these days. My apologies for creating even more. However, it is not my purpose to bore you with more of the same, or to confuse you with pointless details. As a passionate information designer, I decided to have a look at good and bad practices in COVID-19-related content from a data visualization point of view. I hope this will be a useful and inspiring overview.



We are living in remarkable times. The novel coronavirus is causing an epidemic that is spreading at a speed we have never experienced before. Busy long-distance air and rail traffic has made it impossible to contain the virus after its first outbreak in China. For the first time, our modern world is confronted with a pandemic of this scale and magnitude, and our healthcare systems are being put to the test.

But in fighting these challenges, the world has never been as united as today. Research teams across the globe are working together to develop cures, social media are used extensively to keep everyone informed, and innovative companies are coming up with solutions to keep people at home and the virus at bay. Technology plays a crucial role in this fight.

As an information designer, I am specifically fascinated by the efforts of the data science and visualization communities. The newest developments in these fields are put to use to turn a complex and rapidly changing topic into easy-to-communicate visuals. In only a matter of days, nearly everyone has become familiar with the ‘flatten the curve’ visuals, or the Washington Post’s animations on the impact of social distancing.

In this post, we will explore some of the marvelous ways people around the world are using data visualization in the fight against the novel coronavirus.

Chapter 1: Finding reliable data

As noted by Edward Tufte, excellent graphics consist of complex ideas communicated with clarity, precision, and efficiency. At the core of a good data visual, therefore, lies accurate data. So before we start diving into coronavirus graphs, we will first take a brief stop at trustworthy data sources.

Sources of reliable data

There are currently four important places where one can obtain reliable and relatively complete aggregate data about the coronavirus epidemic:

  • World Health Organization
    The World Health Organization publishes daily Situation reports detailing the number of confirmed cases and deaths per country. They also provide a Situation dashboard which is updated three times per day.

WHO Novel Coronavirus Situation Dashboard

  • Johns Hopkins University
    Researchers at Johns Hopkins University also maintain a dashboard providing an overview of the current number of cases, deaths and recoveries on a per-country basis. The underlying data is made freely available through GitHub.

Johns Hopkins University Coronavirus Dashboard

  • European Centre for Disease Prevention and Control
    The ECDC publishes daily statistics on the pandemic for the entire world (despite its name!). Data is published daily at 1 p.m. CET and is presented on a situation update page.

  • Our World in Data
    The team of Max Roser collects and combines all available data and information about the epidemic on a single page. This excellent summary provides interactive charts on many different topics ranging from the number of cases to symptoms, incubation period and fatality rate. Each chart comes with a downloadable data set.

Accuracy of data

Collecting and aggregating global data in a rapidly changing environment, such as during a pandemic, is obviously very tricky. None of the above datasets should therefore be considered an ‘absolute truth’, as errors are bound to happen. Such errors can be related to reporting difficulties, contradictory sources, or differences and shifts in methodology, but can also be simple slips such as typos.

As an example, let us compare three of the datasets above (WHO, Johns Hopkins University and Our World in Data) for the total number of confirmed cases in Belgium (between March 1 and March 19) with the official numbers communicated by the Belgian government (which can be found here).

Table: Comparison between different data sources of the reported total number of confirmed COVID-19 cases in Belgium between March 1 and March 19, 2020.

Immediately we can note some discrepancies. The Johns Hopkins University data follows the government data most closely, with an exception on March 12 where for some reason the number was not updated.

The two other datasets (WHO and Our World in Data) appear to lag behind by one day up until March 16, possibly because WHO Situation reports are published at fixed times that do not line up with government reporting times. These datasets also miss the same update as the Johns Hopkins numbers (from 314 to 399 cases), were not updated on March 17, and appear to contain a typing error (1,085 cases on March 16, while the official government number was 1,058).

Finally, Our World in Data temporarily stopped updating beyond March 17 because WHO shifted their reporting window: up until Situation report 57 the observed 24-hour time window ended at 10 a.m. CET, but since then it ends at midnight. This causes a small overlap, which makes it difficult to accurately compare data and analyze trends.

  • Update March 23: Note that Our World in Data stopped relying on WHO data as they found too many errors in the daily Situation reports. Instead, they switched to data provided by the ECDC.

In summary, Johns Hopkins University data most closely matches official government numbers (for Belgium).

Graph: Total number of confirmed COVID-19 cases in Belgium in March 2020, comparison between different sources.
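The two kinds of discrepancy discussed in this section, a source lagging behind by a day and a suspected typing error, could also be checked with a few lines of code. The following is only a minimal sketch in Python, assuming each source has already been loaded into a dictionary that maps dates to cumulative case counts. The helper names `best_lag` and `is_digit_transposition` are hypothetical and not part of any of the datasets above. The counts in the lag example are made up for illustration and are not the real Belgian figures; only the 1,085 versus 1,058 pair comes from the comparison above.

```python
from datetime import date, timedelta


def is_digit_transposition(a: int, b: int) -> bool:
    """Return True if a and b differ but consist of the same digits.

    This is a cheap hint that one of the two values may be a typing error.
    """
    return a != b and sorted(str(a)) == sorted(str(b))


def best_lag(reference: dict, other: dict, max_lag: int = 3) -> int:
    """Return the shift in days (0..max_lag) at which `other` agrees with
    `reference` on the largest number of dates.

    A lag of 1 means that the value `other` reports for a given day equals
    the value `reference` reported one day earlier. Ties go to the smaller lag.
    """
    def matches(lag: int) -> int:
        return sum(
            1
            for day, value in other.items()
            if reference.get(day - timedelta(days=lag)) == value
        )

    return max(range(max_lag + 1), key=matches)


# Made-up cumulative counts, for illustration only (NOT real data).
official = {
    date(2020, 3, 1): 10,
    date(2020, 3, 2): 20,
    date(2020, 3, 3): 35,
    date(2020, 3, 4): 50,
}
lagging = {
    date(2020, 3, 2): 10,
    date(2020, 3, 3): 20,
    date(2020, 3, 4): 35,
}

print(best_lag(official, lagging))         # prints 1
print(is_digit_transposition(1085, 1058))  # prints True
```

A check like this only flags candidates for a closer look. A matching set of digits does not prove a typo, and a stable one-day lag says nothing about which source is right.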

Finding more data sources

If you are looking for alternative data sources, direct reports by governments, or data on specific regions or cities, I highly recommend the data section of the Coronavirus Tech Handbook, a crowdsourced document bringing together all the tools, datasets and visualizations on this topic.

The sheer amount of available data can be a bit overwhelming, especially taking into account that new numbers are being announced almost constantly. When in doubt, I would advise sticking to the four most complete data sources listed above.


For all your comments, suggestions, errors, links and additional information, you can contact me at koen@baryon.be or via Twitter at @koen_vde.


Disclaimer: I am not a medical doctor or a virologist. I am a physicist running my own business (Baryon) focused on information design.
