Drug consumption prediction for NHS

Done as a project at AIHack 2020 at Imperial College London.

Motivation

As life expectancy increases, it is crucial to improve the quality of life among the elderly. As people age, they become more prone to health problems such as cancer, dementia, and viral and bacterial infections. Osteoporosis and arthritis (affecting the bones and joints), neurodegenerative diseases such as Alzheimer's and Parkinson's, cancer, and diabetes are among the most common degenerative diseases.

Our project aims to measure correlations between different diseases across regions of the UK. We built a time-series dataset of the number of prescriptions per GP in the UK, and we attempt to predict the monthly consumption of a specific medicine per GP.

The main contributions of this project are:

Dataset
Method
Visualisation

To run our project, proceed to Running. The contributions are all summarised in train.ipynb in the root of the project.

Dataset

The initial dataset consisted of a metadata file mapping medical procedure codes to descriptions, accompanied by one CSV per medical procedure containing roughly 5 years of prescription counts for every GP in the UK that prescribed that procedure at least once. Our contribution lies in processing this data and extrapolating missing values to produce a time-series dataset of prescription data from more than 6000 GPs in the UK for the thirty most commonly prescribed medical procedures over a period of roughly 5 years.
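The reshaping described above can be illustrated with a minimal sketch. It assumes the raw data has already been reduced to (gp_id, month_index, count) records, and it fills missing months by linear interpolation, holding the nearest observed value at the edges. Both the record layout and the fill rule are assumptions made for illustration; the function name build_series is hypothetical and the project's actual procedure may differ.

```python
import numpy as np

def build_series(records, n_months):
    """Turn sparse (gp_id, month_index, count) records into a dense array.

    Returns (gp_ids, series), where series has shape (n_gps, n_months).
    Months without a record are filled by linear interpolation; np.interp
    holds the first/last observed value constant outside the observed range.
    """
    gp_ids = sorted({gp for gp, _, _ in records})
    row = {gp: i for i, gp in enumerate(gp_ids)}

    # Start with NaN everywhere so missing months are easy to detect.
    series = np.full((len(gp_ids), n_months), np.nan)
    for gp, month, count in records:
        series[row[gp], month] = count

    months = np.arange(n_months)
    for r in series:  # each r is a view, so edits modify series in place
        known = ~np.isnan(r)
        r[~known] = np.interp(months[~known], months[known], r[known])
    return gp_ids, series
```

For example, build_series([("A", 0, 10), ("A", 2, 30), ("B", 1, 5)], n_months=4) gives one row per GP with every month populated.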

Method

Our method is a multi-layer recurrent neural network trained with backpropagation. We estimate the uncertainty of our predicted procedure counts through approximate Bayesian inference using MC Dropout.
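As a rough illustration of the idea, MC Dropout keeps dropout switched on at prediction time and treats the spread of repeated forward passes as an uncertainty estimate. The sketch below is not the project's model: the GRU cell, the layer sizes, the dropout rate, and the names DropoutRNN and mc_dropout_predict are all assumptions chosen for the example.

```python
import torch
from torch import nn

class DropoutRNN(nn.Module):
    """Hypothetical multi-layer GRU regressor with dropout."""

    def __init__(self, hidden_size=32, num_layers=2, p=0.2):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden_size,
                          num_layers=num_layers, dropout=p, batch_first=True)
        self.drop = nn.Dropout(p)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        # x: (batch, months, 1) -> next-month prediction: (batch, 1)
        out, _ = self.rnn(x)
        return self.head(self.drop(out[:, -1]))


@torch.no_grad()
def mc_dropout_predict(model, x, n_samples=50):
    """Mean and standard deviation over stochastic forward passes."""
    model.train()  # keep dropout active while predicting
    samples = torch.stack([model(x) for _ in range(n_samples)])
    model.eval()   # restore deterministic behaviour afterwards
    return samples.mean(dim=0), samples.std(dim=0)
```

Calling mc_dropout_predict(model, x) on a batch of shape (batch, months, 1) returns a mean prediction and a per-GP standard deviation, each of shape (batch, 1). A larger n_samples gives a smoother estimate at the cost of more forward passes.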

Visualisation

Visualisation examined the spatiotemporal variation of the drug quantities along with their correlation. Time density plots and stationarity analysis, along with a VaR model, were used for preliminary analysis. A heatmap showing variation by region was created using folium. The collected results and insights can be seen in Stats analysis of dementia drugs by region.docx.
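One simple way to look at correlation between regional (or per-drug) series is sketched below. It correlates month-on-month changes rather than raw counts, so that two series do not appear related merely because both trend upwards. This is an illustrative assumption, not necessarily the analysis in the project; the function name change_correlation is hypothetical.

```python
import numpy as np

def change_correlation(series):
    """Pearson correlation between the month-on-month changes of each row.

    series: array of shape (n_series, n_months), one row per region or drug.
    Returns an (n_series, n_series) correlation matrix.
    """
    # First differences turn a linear trend into a constant offset,
    # which the correlation coefficient then centres out.
    diffs = np.diff(series, axis=1)
    return np.corrcoef(diffs)
```

The resulting matrix can be passed directly to seaborn's heatmap function for a quick visual check.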

All our conclusions can be found under presenation.pdf.

Running

Requirements

The requirements to run the code are in requirements.txt. The required libraries are:

numpy
ipykernel
scipy
sklearn
seaborn
pytorch

To install them, create a virtual environment for Python>=3.6 with:

virtualenv -p python3 venv
source venv/bin/activate

And install the dependencies with:

pip3 install -r requirements.txt

Running

Running is just as easy. Make sure that you have installed all the libraries; the tutorial is in tutorial.ipynb. You can run it in a Jupyter Notebook environment with:

jupyter notebook
