midas-network / COVID-19
2019 novel coronavirus repository
Regular/Daily Briefings:
22 Jan 2020 - Wuhan local health commission announced that all briefings would go through Hubei Province rather than the local commission.
Periodic:
Hey folks,
I'm collecting the official news provided by the Singapore government. I did some text analysis and generated a cluster view of the Singapore cases. I'd like to contribute, learn more, and help generate graph-structured data during COVID-19. Please find the repo here:
https://github.com/lushl9301/Statistics-of-Singapore-COVID-19-Cases
I'm willing to adjust my data format in order to contribute to the data pool.
Thanks and Regards
Thank you very much
As of 2020/03/25 17:30, commas are appearing in the status table on the Ontario Ministry of Health's page, which breaks the CSV format (e.g. Cases_in_Ontario_2020-03-26.csv).
String quoting of the affected CSV fields could be added to address this.
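As an illustration of that fix, here is a minimal Python sketch using the standard library's csv module; the row values are made up for illustration, and the real table's columns differ:

```python
import csv
import io

# Hypothetical row whose last field contains a comma, mimicking the
# Ontario status table (these values are invented for illustration).
row = ["2020-03-26", "Ontario", "Resolved: 8,352"]

buf = io.StringIO()
# QUOTE_MINIMAL wraps only fields that contain the delimiter, so the
# comma inside the field no longer splits it into two columns.
writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL)
writer.writerow(row)

line = buf.getvalue().strip()
print(line)  # 2020-03-26,Ontario,"Resolved: 8,352"

# Round-trip check: the quoted field parses back as a single column.
parsed = next(csv.reader(io.StringIO(line)))
print(parsed)
```

Writers that always quote (csv.QUOTE_ALL) work too; minimal quoting keeps the files closest to their current form.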
Most of the content of parameter_estimates/2019_novel_coronavirus is from 2020-02-1X, i.e. mid-February. Has the effort moved elsewhere? @LucieContamin
I'm running an hourly-updated system that produces future-prediction coefficients; the output is in JSON format.
An example showing it in action is here: https://cryptinc.com/covid19/covid19_predictor.html
It would benefit from someone with JavaScript and charting/mapping skills turning that data into an interactive tool to help people see what is in their immediate future. So far it has proven accurate to within 3% when looking forward a few days, with good accuracy on mid-term predictions as well.
A dozen curated global sources feed the back ends.
Would be good!
Cheers!
I noticed some strange values of the "death" variable in the line listing data. For example, the 27th case in Japan died on 02/13; the death variable is 1 in 2020_02_18_1800EST_linelist_NIHFogarty.csv and 2020_02_19_1800EST_linelist_NIHFogarty.csv, but becomes 1581552000 in all subsequent files from 2/20 to 3/16. There are roughly 50 cases with similar issues, each with a different strange number, though from my partial observation the number is consistent for any given case.
Does this strange number indicate a death? My guess is that the death date was recorded and then accidentally converted to an integer.
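That guess checks out: 1581552000 looks like a Unix epoch timestamp, and a quick check with Python's standard library decodes it to exactly the reported death date:

```python
from datetime import datetime, timezone

# The anomalous value from the line list files. If it is a Unix epoch
# timestamp (seconds since 1970-01-01 UTC), it should decode to the
# death date reported for this case.
raw = 1581552000
death_date = datetime.fromtimestamp(raw, tz=timezone.utc)
print(death_date.date())  # 2020-02-13
```

So the likely culprit is a date column being serialized as its integer epoch value somewhere in the export pipeline.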
On https://github.com/midas-network/COVID-19/blob/master/software_tools/software_catalog.csv,
maybe the column "URL_original" should be moved to be the 3rd column. Just my 2 cents; feel free to close this without any change.
Why isn't Hong Kong listed in the section tied to China?
The following papers contain parameter estimates for days from symptom onset to hospitalization that could be added to the dataset here:
I'm new to GitHub, so apologies in advance if this is not the right channel or format to raise this.
I've been doing my best tracking online modeling efforts here:
https://github.com/vejmelkam/covid19-models
Maybe it would be useful as a reference, or you could copy it here if interested. So far I have unfortunately been unsuccessful in getting help mapping modeling efforts in languages other than English, Czech, and Slovak.
The first numbers in the estimates readme.md are 'Cumulative case counts'. My understanding is that these are total infections (including undetected ones).
Firstly, these have an 'expiration date' and a date when the estimate was made. Also, what is being estimated varies widely across the papers. So it would be better to have something like 'reporting rate' or 'detection rate', which has better utility for model builders.
covid141,NA,medRxiv,ascertainment rate,NA,proportion,Japan,Unspecified,country,Unspecified,2020-28-02,Unspecified,0.44,confidence level
The date is formatted as YDM rather than YMD like all the other dates (2020-28-02 instead of 2020-02-28).
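Rows like this can be repaired by reparsing the value as year-day-month and reformatting it to ISO year-month-day; a minimal Python sketch:

```python
from datetime import datetime

# The malformed value from the covid141 row: year-day-month order.
bad = "2020-28-02"

# Parse as %Y-%d-%m, then emit the ISO %Y-%m-%d form used by the
# other rows in the dataset.
fixed = datetime.strptime(bad, "%Y-%d-%m").strftime("%Y-%m-%d")
print(fixed)  # 2020-02-28
```

Strict parsing also flags genuinely ambiguous dates (day &lt;= 12), which would need manual review rather than automatic swapping.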
There is a collection of sources for data on testing at the following page:
https://ourworldindata.org/coronavirus-testing-source-data
If this sort of data could be included, that would be useful for understanding the observation process.
The dispersion of the incubation period is listed under the "Dispersion" category, which I believe is meant to capture dispersion in transmission.
As it stands, it's not clear whether the values of the estimates (I was looking in particular at the incubation period) are means or medians of distributions, or what exactly each value represents.
Amazing work aggregating the parameter estimates, but for the casual viewer it would be nice to be able to /see/ these values. Perhaps something like this figure could be added to the README.
The code for this figure should be easy to adapt to the other parameters as well if you are interested.
library(dplyr)
library(ggplot2)

## Read the estimates, keep only the columns needed for the plot, and
## restrict to R0 rows with usable numeric values.
x <- read.csv("estimates.csv",
              stringsAsFactors = FALSE,
              header = TRUE) %>%
  select(id, peer_review, name, abbreviation, units, country,
         value, lower_bound, upper_bound, title_publication) %>%
  filter(abbreviation == "R0") %>%
  filter(lower_bound != "Unspecified",
         value != "Unspecified",
         country != "Unspecified") %>%
  mutate(value = as.numeric(value),
         lower_bound = as.numeric(lower_bound),
         upper_bound = as.numeric(upper_bound))

## Order the estimate identifiers by point estimate so the plot is sorted.
id_order <- x$id[order(x$value)]

nice_theme <- theme(
  panel.background = element_blank(),
  panel.grid.minor.y = element_blank(),
  axis.line = element_line(colour = "black"),
  axis.title = element_text(size = 22),
  axis.text = element_text(size = 16),
  plot.title = element_text(size = 32),
  plot.subtitle = element_text(size = 22),
  legend.background = element_rect(colour = "black"),
  legend.title = element_text(size = 22),
  legend.text = element_text(size = 16),
  legend.key = element_rect(fill = "white")
)

plot_df <- x
plot_df$plot_id <- factor(plot_df$id, levels = id_order)
plot_df$plot_peer_review <- is.na(plot_df$peer_review)

## Point estimates with interval bars, one row per estimate, coloured by
## country and shaped by peer-review status; the dashed line marks R0 = 1.
ggplot(plot_df,
       aes(x = plot_id,
           y = value,
           ymin = lower_bound,
           ymax = upper_bound,
           colour = country,
           shape = plot_peer_review)) +
  geom_pointrange() +
  geom_hline(yintercept = 1, linetype = "dashed") +
  labs(x = "Estimate Identifier",
       y = "Estimate",
       title = "R-naught",
       subtitle = "Basic Reproduction Number",
       colour = "Country",
       shape = "Peer Review\nStatus") +
  coord_flip() +
  nice_theme

## scale_factor <- 2
## ggsave("demo.png",
##        height = scale_factor * 14.8,
##        width = scale_factor * 10.5,
##        units = "cm")
Source : https://en.wikipedia.org/wiki/Template:Notable_flu_pandemics
For R0 : https://en.wikipedia.org/wiki/Template:Notable_flu_pandemics#cite_note-4
See also the corresponding section here.
On https://github.com/midas-network/COVID-19/blob/master/software_tools/software_catalog.csv
could you update creator_contact for both of the NSSAC dashboards to [email protected]?
If you don't mind, please update the version for our surveillance dashboard to 0.8.6.
General question:
Are these parameters being sourced via pull requests, or is someone actively reading through the literature to collect the estimates?
Over the past few days I've seen a few China CDC reports that are very comprehensive; it would be useful to include them if they're not already there. For some of these estimates, having an 'N' (number of patients) will be useful, and somewhat essential when we try to pick the most reliable one.
Finally, I'd be more comfortable seeing 'XYZ et al.' as the citation style rather than the university name (since much of the research is collaborative).
nCoV-2019 Situation Reports from Johns Hopkins University Center for Health Security (metadata)
The metadata link leads to a 404 error.