owid / covid-19-data
Data on COVID-19 (coronavirus) cases, deaths, hospitalizations, tests • All countries • Updated daily by Our World in Data
Home Page: https://ourworldindata.org/coronavirus
It would be useful to have a graph of people who are currently infected. There are stats on deaths and cases but, unfortunately, no data on those who recovered.
I was under the impression from your web site that the dataset included a Test Count column by date for each location. I can't find test counts anywhere in the downloaded data. Can you direct me to it please?
There are numerous stats on cases and deaths, but why not active cases? Active-case data would help show which countries are improving or have passed their peak.
Hi,
Can you add data for Singapore? Testing data can be found here at their ministry of health
Hi, It would be super useful to have historical data over time in addition to the current cumulative number plus daily change. Not sure if you have that data or not but it would be very powerful. Thanks!
My colleague Riccardo and I have created this github page to answer each question in your checklist for COVID-19 testing data with all the documentation at our disposal.
We believe it might be useful.
Thank you so much for your phenomenal work!
Twitter:
It's reading numbers like "1,042", "0,234" and "17,806", does anyone know what's going on and how to fix it?
The data for the (excellent) Number of COVID-19 tests per confirmed case is averaged since the "beginning of records" (#45).
As time goes on, if a country has an outbreak in cases, this average will take a long time to move (#46). Whilst this data is valuable, another version of the data that shows a daily or 7-day moving average over the data has the opportunity to inform debate and represent the situation in a timely and responsive manner.
Would it be possible to have a graph showing this daily or 7-day average? Or indeed extend the existing graph with a second slider to select the max days to average over?
I appreciate this is the data repo so not the right place for this suggestion. If you could direct me to the correct place for these suggestions I'd be grateful. Thank you very much for your hard work, excellent resource and your consideration of this suggestion.
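To illustrate the difference between the two kinds of average, here is a minimal Python sketch. The daily figures are made up for illustration and the function names are hypothetical; this is not how the Our World in Data chart is computed, only a comparison of an average taken from the beginning of records with a trailing 7-day one.

```python
def cumulative_ratio(tests, cases):
    """Tests per confirmed case, averaged from the first record up to each day."""
    ratios = []
    total_tests = total_cases = 0
    for t, c in zip(tests, cases):
        total_tests += t
        total_cases += c
        ratios.append(total_tests / total_cases if total_cases else None)
    return ratios


def rolling_ratio(tests, cases, window=7):
    """Tests per confirmed case over a trailing window of `window` days.

    The first days use a shorter window because fewer records exist yet.
    """
    ratios = []
    for i in range(len(tests)):
        start = max(0, i - window + 1)
        window_tests = sum(tests[start:i + 1])
        window_cases = sum(cases[start:i + 1])
        ratios.append(window_tests / window_cases if window_cases else None)
    return ratios


# Made-up daily figures: constant testing, cases jump tenfold on day 15.
daily_tests = [1000] * 21
daily_cases = [10] * 14 + [100] * 7

since_start = cumulative_ratio(daily_tests, daily_cases)
last_week = rolling_ratio(daily_tests, daily_cases, window=7)

print(since_start[-1])  # 25.0, still dominated by the first two weeks
print(last_week[-1])    # 10.0, reflects only the most recent week
```

With these numbers the average from the beginning of records is still 25 tests per case a week after the outbreak, while the 7-day figure has already fallen to 10.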
Hello,
Where can I get CFR figures per geographic region?
Many thanks,
Dan.
There is no Spain in main CSV
Today we are not able to read the data. Is there any change or issue in the file?
https://github.com/owid/covid-19-data/blob/master/public/data/ecdc/full_data.csv
tks
USA deaths are showing up as 4928 on 4/16/2020, but in other sources the figure is around 2500. Can you please check the data?
Hello,
Is it possible to add tests performed for Russia as well into https://github.com/owid/covid-19-data/tree/master/public/data/testing ?
There have been official daily reports since March 25: https://xn--80aesfpebagmfblc0a.xn--p1ai/ofdoc/#reports
Lebanon data is offset by one day in Owid. The first case was reported on Feb 21, and not Feb 22. This is shifting the plots and is specifically impacting the current day, which shows the data from the previous day...
Is this related to https://www.bizpoint.com/l/en-covid ?
I see no direct link to their site in the visualization so I thought that it was suspicious.
The full covid-19 dataset is aggregated at the country level. I would like to access the data at the state level in the U.S. as well. Can these data please be made available for download either as additional cases in the full dataset, or as a separate file?
The link to "Data on COVID-19 maintained by Our World in Data" in the title seems to be broken.
Hi,
Can you add monthly average temperature and humidity for each country for the given month?
Many thanks,
Not sure why this value is blank in the data. Gugu suggests the value of 67,052 from the CIA world factbook.
I found a value of 97,857 for Jersey.
It looks like the Spain data for 4/30/2020 is missing. Is it available?
Can you modify the script to report the cases in columns instead of rows for historical data? It makes it easier to work with for data extraction. Transposing them in Excel or other software is extremely time-consuming for data analytics and this will save a considerable amount of time when working with the dataset with other tools.
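Until such a layout is published, the reshaping can be done in a few lines without Excel. Below is a minimal sketch using only Python's standard library. It assumes a long layout with `date`, `location` and `total_cases` columns; the sample rows and the function name `long_to_wide` are illustrative, not part of the repository.

```python
import csv
import io
from collections import defaultdict

# Illustrative rows in the long layout (one row per date and location).
LONG_CSV = """date,location,total_cases
2020-04-13,A,10
2020-04-13,B,5
2020-04-14,A,12
2020-04-14,B,9
"""


def long_to_wide(rows, value_column):
    """Return (header, table) with one row per location and one column per date."""
    dates = sorted({row["date"] for row in rows})
    by_location = defaultdict(dict)
    for row in rows:
        by_location[row["location"]][row["date"]] = row[value_column]
    header = ["location"] + dates
    table = [
        # Missing date/location combinations become empty cells.
        [location] + [values.get(date, "") for date in dates]
        for location, values in sorted(by_location.items())
    ]
    return header, table


rows = list(csv.DictReader(io.StringIO(LONG_CSV)))
header, table = long_to_wide(rows, "total_cases")

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(header)
writer.writerows(table)
print(out.getvalue())
```

Swapping the roles of `date` and `location` in the function gives the other orientation (one row per date, one column per location).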
Hi,
I'm trying to import this Excel file from its raw URL (https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/testing/covid-testing.xlsx) into Power BI Desktop, and it throws a file-format error.
If I import the file locally I get the same error, but if I use "Save as" to write the same data to a new file, it works.
It seems that the file has an invalid format. Can you please save the file again and re-upload it?
Serbia data mismatch on 4/13/2020
I am using your data on https://mda-covid-19.appspot.com/ and I spotted a mismatch from 4/13/2020 onwards in your full_data.csv file. According to your file:
date | location | new_cases | new_deaths | total_cases | total_deaths |
---|---|---|---|---|---|
2020-04-13 | Serbia | 250 | 6 | 3630 | 80 |
2020-04-14 | Serbia | 0 | 0 | 3630 | 80 |
2020-04-15 | Serbia | 424 | 4 | 4054 | 84 |
2020-04-16 | Serbia | 819 | 15 | 4873 | 99 |
According to the Serbian Government, the data should be:
date | location | new_cases | new_deaths | total_cases | total_deaths |
---|---|---|---|---|---|
2020-04-13 | Serbia | 250 | 6 | 3630 | 80 |
2020-04-14 | Serbia | 424 | 5 | 4054 | 85 |
2020-04-15 | Serbia | 411 | 9 | 4465 | 94 |
2020-04-16 | Serbia | 408 | 5 | 4873 | 99 |
Please fix.
Thank you very much
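A quick check of the two tables above can help narrow down where the mismatch comes from. The sketch below (Python, standard library only; the function names are hypothetical) tests whether each source is internally consistent, meaning that each day's new counts equal the change in the running totals, and lists the dates on which the sources disagree. The rows are copied from the tables in this report.

```python
# Rows copied from the two tables above:
# (date, new_cases, new_deaths, total_cases, total_deaths)
FILE_ROWS = [
    ("2020-04-13", 250, 6, 3630, 80),
    ("2020-04-14", 0, 0, 3630, 80),
    ("2020-04-15", 424, 4, 4054, 84),
    ("2020-04-16", 819, 15, 4873, 99),
]
GOVERNMENT_ROWS = [
    ("2020-04-13", 250, 6, 3630, 80),
    ("2020-04-14", 424, 5, 4054, 85),
    ("2020-04-15", 411, 9, 4465, 94),
    ("2020-04-16", 408, 5, 4873, 99),
]


def is_internally_consistent(rows):
    """True if each day's new counts equal the change in the running totals."""
    for previous, current in zip(rows, rows[1:]):
        _, new_cases, new_deaths, total_cases, total_deaths = current
        if total_cases - previous[3] != new_cases:
            return False
        if total_deaths - previous[4] != new_deaths:
            return False
    return True


def differing_dates(rows_a, rows_b):
    """Dates on which the two sources disagree in any column."""
    return [a[0] for a, b in zip(rows_a, rows_b) if a != b]


print(is_internally_consistent(FILE_ROWS))          # True
print(is_internally_consistent(GOVERNMENT_ROWS))    # True
print(differing_dates(FILE_ROWS, GOVERNMENT_ROWS))  # ['2020-04-14', '2020-04-15', '2020-04-16']
```

Both series are internally consistent and reach the same totals on 2020-04-16 (4873 cases, 99 deaths), so the disagreement is in how the increase is distributed over 14 to 16 April rather than in the overall count. Whether that reflects a reporting delay upstream is an assumption this check cannot confirm.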
I'm not sure how this site gathers its data, but the numbers for Argentina's daily new cases are completely off. You can see in this local newspaper the actual data:
To compare, here are the values from this repo (taken from https://ddi.sutd.edu.sg/)
In particular, that ~350 point is ridiculously large.
Hi @edomt ,
I am from Portugal and I am building a Power BI dashboard about the pandemic. I am interested in showing a chart of total cases vs. total tests in Portugal, but I couldn't find that information in the GitHub source I am using (DSSG-PT). I found that your CSV "covid-testing-latest-data-source-details.csv" contains tests for Portugal and cites as its source the same repository I am retrieving the Portugal data from ("Portugal - cases tested 2020-04-21 https://github.com/dssg-pt/covid19pt-data/blob/master/data.csv"), but I couldn't find that information in that CSV. Could you please help me?
Best regards,
Fábio Barnabé
Regarding #45 I think it would reduce confusion to add a foot note saying that data is "always averaged from the beginning of records" or some wording like that. When you take a smaller range of time the title says for example:
Number of COVID-19 tests per confirmed case, May 6, 2020 to May 7, 2020
But the data is actually an average from a larger range.
Thank you again for this great resource and for your consideration of this proposed change.
P.S. I just realised this is the data repo, so not the right place for this suggestion. If you could direct me to the correct place for these suggestions I'd be grateful. Thank you.
On 12 April, the website on coronavirus by Vietnamese MoH was updated, making the number of tests no longer appear when first loading.
Actually they are still providing this number, just not very obvious. We'll have to click the "Infographic Việt Nam" (red button) lower in the page to see the testing number. As of morning 15 April, this has increased to 132,771 people tested.
The source of the image is here (it's also by the MoH, not sure why they post it on a second website)
The UK daily testing numbers cannot simply be derived from the cumulative totals (by finding the difference from the previous day's total), since the total incorporates revisions that may not just apply to the previous day.
For example, on 25 April the notes say:
The difference between the cumulative numbers from today and yesterday for people tested is 50,499 higher than the daily increase figure. Cumulative testing figures include 50,499 retrospective reports of people who tested negative between 31 January and 24 April.
The page from which the testing numbers are collected (https://www.gov.uk/guidance/coronavirus-covid-19-information-for-the-public#number-of-cases-and-deaths), includes daily totals, so they can be directly added to the data in this repository.
I have been maintaining a repository of the raw HTML for the source page, collected every day (see coronavirus-covid-19-number-of-cases-in-uk-*.html in https://github.com/tomwhite/covid-19-uk-data/tree/master/data/raw), which can be used to get the corrected numbers for daily people tested.
I pulled out the correct figures for the dates that need fixing here:
Date | DailyPeopleTested |
---|---|
2020-04-08 | 12959 |
2020-04-10 | 13543 |
2020-04-13 | 10745 |
2020-04-20 | 14106 |
2020-04-25 | 23115 |
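The size of the distortion can be worked out from the figures quoted above for 25 April. The following Python sketch is only an illustration: it assumes that 23,115 is the published daily figure for that date, and the helper `apply_corrections` is hypothetical, not part of this repository's scripts.

```python
# Figures quoted above for 25 April 2020.
published_daily = 23_115        # daily people tested, as published on the page
retrospective_reports = 50_499  # negatives from 31 Jan to 24 Apr added to the total

# The cumulative total rose by the daily figure plus the retrospective reports,
# so differencing consecutive totals yields this instead of the daily figure.
naive_daily = published_daily + retrospective_reports
print(naive_daily)  # 73614

# Corrected daily figures from the table above.
CORRECTIONS = {
    "2020-04-08": 12959,
    "2020-04-10": 13543,
    "2020-04-13": 10745,
    "2020-04-20": 14106,
    "2020-04-25": 23115,
}


def apply_corrections(daily_by_date, corrections):
    """Prefer the published daily figure wherever one is available."""
    return {date: corrections.get(date, value) for date, value in daily_by_date.items()}


derived = {"2020-04-25": naive_daily}
print(apply_corrections(derived, CORRECTIONS))  # {'2020-04-25': 23115}
```

Differencing the totals would therefore report more than three times the published number of people tested on that day.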
Examples:
From deaths.csv: 1 new death
Entry in full_data.csv:
date | location | new_cases | new_deaths | total_cases | total_deaths |
---|---|---|---|---|---|
2020-02-02 | Philippines | 1 | 2 | 1 |
From cases.csv: 1 new case
Entry in full_data.csv:
date | location | new_cases | new_deaths | total_cases | total_deaths |
---|---|---|---|---|---|
2020-01-24 | Singapore | 1 |
From cases.csv: 2 new cases
Entry in full_data.csv:
date | location | new_cases | new_deaths | total_cases | total_deaths |
---|---|---|---|---|---|
2020-01-24 | Vietnam | 2 |
I pulled the history of the updates to the data file public/data/owid-covid-data.csv and compiled them all into one large file so I can compare changes in counts for a given date as newer reports come in (like I did for the NYC data - https://www.linkedin.com/feed/update/urn:li:activity:6658517451480805376/). The results are in the file public/data/history.csv in my fork of your repository (https://github.com/hjstein/covid-19-data).
In looking at the United States data, I noticed that the reports all have the same number for each date. For example, the data shows 30,613 new_cases for 4/8 in every update from 4/16 through the present. I'm surprised by this because when I did the same for the NYC data, I found the reported new cases for 4/8 kept increasing as newer reports came in (data and analysis available at https://github.com/hjstein/coronavirus-data). So, I would have thought if the counts for NYC for 4/8 aren't fully known until about 3 weeks later, then the totals here for the USA should be getting revised in later reports as well.
So, my question is, how is the new_cases count being calculated, and where is the data coming from?
Thanks.
Hi,
I noticed that, in the testing charts, a country for which you only have data up to a certain date (for example Singapore, 7 April 2020) is only shown when the end date is adjusted to on or before that date. If the end date is left at the current date (14 April as of today), Singapore does not appear on the chart. I noticed the same for a few other countries but do not recall which ones now.
Excellent website; however, one limitation is that ECDC data is not entirely reliable. For example, the number of deaths on April 23 in Switzerland is off by 25% (ECDC reports 1216 deaths when it is actually 25% more, with 1526 deaths: https://interactif.tdg.ch/2020/covid-19-carte-suisse/ )
I might be completely missing something here, but if Google Translate is correct, these are the daily deaths in Sweden at https://experience.arcgis.com/experience/09f821667ce64bf7be6f9f87457ed9aa
And these are the daily deaths in Sweden on https://www.worldometers.info/coronavirus/country/sweden/
Why such a difference?
72,000 tests now completed in Ireland as of 13/04/2020
Source:
https://www.rte.ie/news/coronavirus/2020/0413/1130184-coronavirus-ireland/
Spain data is missing 1 day vs. all other countries. The issue started about 3 days ago, was briefly fixed, and now is back. Currently for example, all countries have data up until April 30, while Spain only has data up until April 29. This is an issue for dashboards that rely on owid data
Number in total_cases.csv (DO94) is 2626. Actual value should be 2766.
This has been carried over to new_cases.csv DO94, where value is 0 (zero), but should be 140.
Total deaths DO94 is 37. Actual value 43.
New deaths DO94 is 0. Actual value 6.
Source: Malaysian Ministry of Health daily report 2020-03-31: https://www.facebook.com/kementeriankesihatanmalaysia/photos/a.390879946236/10156888001176237/?type=3&theater
In the public data, there is no data of Spain for May 7th.
in https://github.com/owid/covid-19-data/blob/master/public/data/ecdc/locations.csv#L50
Curacao appears under 3 names:
and so does Czech Republic (with 2 occurrences) https://github.com/owid/covid-19-data/blob/master/public/data/ecdc/locations.csv#L54
Suggest to implement the following:
Could be storing states on cookies, or url redirects/parameters
Hi
Thanks for the work. Could you help by defining your keys?
An example:
"iso_code": "ITA",
"location": "Italy",
"date": "2020-01-05",
"total_cases": 0,
"new_cases": 0,
"total_deaths": 0,
"new_deaths": 0,
"total_cases_per_million": 0,
"new_cases_per_million": 0,
"total_deaths_per_million": 0,
"new_deaths_per_million": 0,
"total_tests": null,
"new_tests": null,
"total_tests_per_thousand": null,
"new_tests_per_thousand": null,
"tests_units": "",
"population": 60461828,
"population_density": 205.859,
"median_age": 47.9,
"aged_65_older": 23.021,
"aged_70_older": 16.24,
"gdp_per_capita": 35220.084,
"extreme_poverty": 2,
"cvd_death_rate": 113.15100000000001,
"diabetes_prevalence": 4.78,
"female_smokers": 19.8,
"male_smokers": 27.8,
"handwashing_facilities": null,
"hospital_beds_per_100k": 3.18
What do the following mean?
tests_units
"extreme_poverty": 2, (what is 2?)
"handwashing_facilities": null, What is it? What do you mean?
"hospital_beds_per_100k": 3.18 This can be dangerous data if we don't say what beds for what hospital division
"gdp_per_capita": 35220.084, what is the value, million, thousand?
"cvd_death_rate": 113.15100000000001, what value is this rate?
Basically we'd need some definition on all the keys to better understand and work with this data
Thanks a lot
As per this graph, the United Kingdom's ratio is shown as 5.7 as of May 7th
However the data source (UK Department of Health and Social Care and Public Health England) shows the ratio at ~12. With a 14 day moving average it's still about 10.
Link to spreadsheet.
I was wondering where the value of 5.7 comes from? Thank you.
Hello,
Can you please return # of tests data for Russia?
It was present in 11-April file but not later.
If needed, # of tests data is available in wiki page:
https://en.wikipedia.org/wiki/2020_coronavirus_pandemic_in_Russia
Thank you
There used to be data for Zimbabwe from March 15 until May 4.
Now it's just May 6, 7 and 8.
Was this a mistake?
Dear team,
As far as I can see in https://github.com/owid/covid-19-data/tree/master/public/data/testing there is no testing data source for Spain. Can you give me some more info about it? Is there a reason for this lack of data?
Thanks in advance,
Germán
Hi, thanks so much for your work, it's an excellent comprehensive resource.
The testing data (csv) only lists countries by name - is it possible for you to add ISO3 codes to this file? It would make joining with other data much simpler for users!
Thanks :-)
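As a stopgap, users can attach codes themselves with a small lookup. The Python sketch below is illustrative only: the lookup holds a hand-written subset of ISO 3166-1 alpha-3 codes, the rows are hypothetical, and the `location` key is an assumption about how the country name is labelled in the testing file.

```python
# Hand-written subset of ISO 3166-1 alpha-3 codes, for illustration only.
ISO3_BY_NAME = {"Germany": "DEU", "Italy": "ITA", "Spain": "ESP"}


def add_iso_codes(rows, name_key="location"):
    """Return copies of `rows` with an `iso_code` field, plus the unmatched names."""
    enriched, unmatched = [], []
    for row in rows:
        code = ISO3_BY_NAME.get(row[name_key])
        if code is None:
            # Keep the row but record the name so spelling variants can be reviewed.
            unmatched.append(row[name_key])
        enriched.append({**row, "iso_code": code})
    return enriched, unmatched


# Hypothetical testing rows keyed by country name only.
testing_rows = [
    {"location": "Italy", "total_tests": 100},
    {"location": "Czech Republic", "total_tests": 50},
]

enriched, unmatched = add_iso_codes(testing_rows)
print(enriched[0]["iso_code"])  # ITA
print(unmatched)                # ['Czech Republic']
```

Reporting unmatched names explicitly matters because name-based joins fail silently when the same country is spelled differently across files, which is the problem a code column in the source file would avoid.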
Hi,
first of all, thanks for your work and making everything available to the public!
For 2020-04-03, ECDC reports new cases and new deaths for Germany as 0,
on https://opendata.ecdc.europa.eu/covid19/casedistribution/csv/. This may
or may not be a problem. At least for the following day (2020-04-04), they report
such a delta that it's 79696 total cases, 1017 total deaths on OWID and, currently,
on their website.
This corresponds to WHO Situation Report 74, which is for 2020-04-03, the day before!
For 2020-04-04, the WHO has Situation Report 75,
which has 85778 total cases, 1158 total deaths, which is exactly
what the German Robert Koch Institut (RKI) is (currently) reporting
(on their website).
For other locations, ECDC and WHO SR 75 are exactly in line
(Italy, Spain, Switzerland, Turkey, Belgium, Netherlands, possibly more..)
or are largely similar (UK off by 4 total cases, total deaths exact in line);
ok, some seem to differ (France, China, ...). But Germany has numbers
exactly as in SR 74. So, effectively a delay of 1 day was introduced.
This should not be due to RKI data being updated too late.
(Website linked above says data from 2020-04-04 00:00 (CEST),
updated on the web page at 10:10 (CEST, so 9:10 CET).
WHO and ECDC both say their data is from 10:00 CET,
so this should be in time.)
But somehow the official RKI data about Germany get into WHO reports,
but not into ECDC (timely), it seems.
Is this something that ought to be fixed? (I haven't tried to contact ECDC.)
Or is this something expected / everything ok?
Thanks,
Fabian
I would like to know what the meaning is for negative figures in the new cases column. May it be a mistake or does it mean something else?
As an example, the new cases figure for Spain on April 19th is -1430.