drkostas / covid19-vaccinations-predict

Simultaneous Time Series Forecasting on the global COVID-19 Daily Vaccinations

License: MIT License

Simultaneous Time Series Forecasting on the World's COVID-19 Daily Vaccinations

About

Dataset: COVID-19 World Vaccination Progress
This is my project for the Data Mining course (COSC-526). The main code is in the project.ipynb Jupyter notebook.

Code Locations

  • The dataset is in the datasets/covid-world-vaccinations-progress directory
  • The metadata dataset is in the datasets/countries-of-the-world directory
  • The Jupyter notebook is project.ipynb
  • Some custom packages used in the notebook are located in the data_mining directory:
    • Project Utils:
      • NullsFixer: for inferring the nulls in the COVID-19 vaccination dataset
      • Preprocess: the preprocessing code of the dataset before training
      • BuildModel: contains all the functions related to the building of the TF model
      • Visualizer: the implementations of all the visualizations
    • Configuration: it handles the yml configuration
    • ColorizedLogger: code for formatted logging that saves output in log files
    • timeit: ContextManager+Decorator for timing functions and code blocks
  • The project was generated from my Cookiecutter template project: https://github.com/drkostas/starter
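
As an illustration of the "ContextManager+Decorator" idea behind the timeit helper listed above, here is a minimal sketch built on the standard library. The class name `Timer`, its `label` argument, and its `elapsed` attribute are assumptions made for this example; the repo's own timeit may be structured differently.

```python
import time
from contextlib import ContextDecorator


class Timer(ContextDecorator):
    """Hypothetical stand-in for a timing helper; not the repo's timeit."""

    def __init__(self, label="block"):
        self.label = label
        self.elapsed = None  # seconds, set when the timed block exits

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed = time.perf_counter() - self._start
        print(f"{self.label} took {self.elapsed:.4f} s")
        return False  # never swallow exceptions raised inside the block


# Used as a context manager around a code block
with Timer("sum of squares") as t:
    total = sum(i * i for i in range(100_000))


# Used as a decorator around a function
@Timer("load step")
def load():
    return list(range(10))


values = load()
```

Inheriting from `contextlib.ContextDecorator` is what lets one class serve both roles, so the timing logic is written only once.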

Document Locations

The extended abstract and the poster are both located in the Documents folder.

Information About The Dataset

The COVID-19 Vaccination Progress Dataset contains information about the daily and total vaccinations of 193 different countries over 135 different dates. The data are collected almost daily; as of this writing (4/29), the dataset has 14230 rows and 15 different features.

The features of the dataset are the following:

  • Country: this is the country for which the vaccination information is provided
  • Country ISO Code: ISO code for the country
  • Date: date for the data entry; for some dates we have only the daily vaccinations, for others, only the (cumulative) total
  • Total number of vaccinations: this is the absolute number of total immunizations in the country
  • Total number of people vaccinated: a person, depending on the immunization scheme, will receive one or more (typically 2) vaccine doses; at any given moment, the number of vaccinations might therefore be larger than the number of people vaccinated
  • Total number of people fully vaccinated: this is the number of people who received the entire set of immunizations according to the immunization scheme (typically 2 doses); at any given moment, some people will have received one dose while a smaller number will have received all doses in the scheme
  • Daily vaccinations (raw): for a certain data entry, the number of vaccinations for that date/country
  • Daily vaccinations: for a certain data entry, the number of vaccinations for that date/country
  • Total vaccinations per hundred: ratio (in percent) between vaccination number and total population up to the date in the country
  • Total number of people vaccinated per hundred: ratio (in percent) between population immunized and total population up to the date in the country
  • Total number of people fully vaccinated per hundred: ratio (in percent) between population fully immunized and total population up to the date in the country
  • Number of vaccinations per day: number of daily vaccinations for that day and country
  • Daily vaccinations per million: ratio (in ppm) between vaccination number and total population for the current date in the country
  • Vaccines used in the country: total number of vaccines used in the country (up to date)
  • Source name: source of the information (national authority, international organization, local organization etc.)
  • Source website: website of the source of information
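
Because some dates carry only the daily count and others only the cumulative total, one of the two can often be inferred from the other. The sketch below shows one way to do this for a single country. It assumes the rows are sorted by consecutive dates and that the daily value is the plain day-over-day change. The function `fill_gaps`, the dictionary keys, and the sample numbers are made up for illustration and are not the repo's NullsFixer.

```python
def fill_gaps(rows):
    """Fill missing 'daily' or 'total' values for one country's rows.

    rows: list of dicts sorted by consecutive date, each with the keys
    'total' and 'daily' (None when the value is missing).
    Returns new dicts; the input is left unchanged.
    """
    filled = []
    prev_total = None  # total of the previous day, if known
    for row in rows:
        total, daily = row["total"], row["daily"]
        if total is None and daily is not None and prev_total is not None:
            total = prev_total + daily   # cumulative = yesterday + today
        elif daily is None and total is not None and prev_total is not None:
            daily = total - prev_total   # today = difference of cumulatives
        filled.append({**row, "total": total, "daily": daily})
        # Only chain across consecutive days with a known total
        prev_total = total
    return filled


# Made-up sample values, for illustration only
rows = [
    {"date": "2021-04-01", "total": 100, "daily": None},
    {"date": "2021-04-02", "total": None, "daily": 20},
    {"date": "2021-04-03", "total": 150, "daily": None},
]
result = fill_gaps(rows)
```

Here the second row's total becomes 120 and the third row's daily value becomes 30, while the first row's daily value stays missing because no earlier total is available.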

To recalculate the per-hundred values, we used another dataset that contains metadata about the countries of the world, including their population.
Metadata Dataset: DataBank - World Development Indicators
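
The per-hundred and per-million features are simple ratios of a count to the country's population, so they can be recomputed once the population is known. A minimal sketch follows; the function names and the example numbers are assumptions for illustration, not code from the notebook.

```python
def per_hundred(count, population):
    """Return count per 100 inhabitants, or None when inputs are unusable."""
    if count is None or not population:
        return None
    return 100.0 * count / population


def per_million(count, population):
    """Return count per 1,000,000 inhabitants, or None when inputs are unusable."""
    if count is None or not population:
        return None
    return 1_000_000.0 * count / population


# Made-up example: 250,000 people vaccinated in a country of 1,000,000
people_vaccinated_per_hundred = per_hundred(250_000, 1_000_000)
# Made-up example: 5,000 daily vaccinations in a country of 2,000,000
daily_vaccinations_per_million = per_million(5_000, 2_000_000)
```

Returning None for a missing count or a missing or zero population keeps such gaps visible instead of raising a division error.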

Getting Started

These instructions will get you a copy of the project up and running on your machine.

Prerequisites

You need a machine with Python >= 3.6 and a Bash-compatible shell (e.g. zsh) installed.

$ python3.6 -V
Python 3.6.13

$ echo $SHELL
/usr/bin/zsh

Setting Up

All installation steps are handled by the Makefile. The server=local flag specifies that you want to use conda instead of venv; this can be changed in lines 25-28 of the Makefile. local is also the default value, so you can omit the flag.

$ make install server=local

To update the COVID-19 vaccination dataset with the latest information, run:

$ make download_dataset server=local

Running the code

To run the code, modify the yml file if needed and start a Jupyter server.

Modifying the Configuration

There is an already configured yml file under confs/covid.yml with the following structure:

tag: project
covid-progress:
  - properties:
      data_path: datasets/covid-world-vaccination-progress/country_vaccinations.csv
      data_extra_path: datasets/world-bank/data.csv
      log_path: logs/covid_progress.log
    type: csv
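
To make the nesting of this file explicit, the sketch below writes out the dictionary that a YAML loader (for example PyYAML's `yaml.safe_load`) would return for it and reads the properties back. The helper `get_properties` is hypothetical and is not the repo's Configuration class.

```python
# Mirrors the parsed structure of confs/covid.yml (values copied from the file)
config = {
    "tag": "project",
    "covid-progress": [
        {
            "properties": {
                "data_path": "datasets/covid-world-vaccination-progress/country_vaccinations.csv",
                "data_extra_path": "datasets/world-bank/data.csv",
                "log_path": "logs/covid_progress.log",
            },
            "type": "csv",
        }
    ],
}


def get_properties(config, section, expected_type="csv"):
    """Hypothetical helper: return the properties of a section's first entry."""
    entry = config[section][0]  # each section is a list of entries
    if entry["type"] != expected_type:
        raise ValueError(f"expected type {expected_type!r}, got {entry['type']!r}")
    return entry["properties"]


props = get_properties(config, "covid-progress")
print(props["data_path"])
```

Note that covid-progress is a list, so the properties sit under its first element rather than directly under the key.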

Running Jupyter

After loading the conda environment with the command conda activate data_mining, run jupyter notebook and open the project.ipynb file.

TODO

Read the TODO to see the current task list.

Built With

  • Jupyter - An interactive computing framework
  • Tensorflow - A deep learning framework

License

This project is licensed under the MIT License - see the LICENSE file for details.
