miriamkw / glupredkit

GluPredKit aims to make blood glucose model training and prediction more accessible.

License: MIT License

Languages: Python 99.47%, Shell 0.53%
Topics: glucose-prediction, deep-learning, machine-learning, modelling-biological-systems

glupredkit's People

Contributors: miriamkw, porkshoulderholder, voljumet

glupredkit's Issues

Add plots

  • Prediction trajectories for test period, with hovering to see one trajectory
  • SEG
    • Optional labels for workouts
  • Plot one prediction, and test editing the inputs with sliders

Add parsers and parameters

Add nightscout parser:

  • Specify start and end dates
  • Verify that the results are equal to those from the tidepool parser

Tidepool and nightscout:

  • Add workouts as output

Other

  • Add apple health parser
  • Include extra parameters like steps etc.
  • Add oura ring parser

Generate config command

Enhance the generate_config command in the CLI:

  • Now, if there is an input error in the prompts, it is not identified until the user has answered all of the questions, which can be quite frustrating
  • Add input validation
  • Also, if there is an error, the user has to start all over again. Fix this so that the user gets a second chance to enter the value instead.
  • Option to use either prompts or command line interface
  • Update README accordingly

The solution could be to give an explanatory error message immediately after an input error and to reprompt the input field, letting users retry from where they left off (see the sketch below).
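A minimal sketch of that reprompting loop, assuming the prompts are built with click (the helper and the validator are hypothetical):

```python
import click

def prompt_until_valid(text, validate, error_msg):
    """Re-prompt until `validate` accepts the value, instead of failing
    after all questions have been answered."""
    while True:
        value = click.prompt(text)
        if validate(value):
            return value
        click.echo(f"Error: {error_msg} Please try again.")

# Hypothetical usage inside generate_config:
# ph = prompt_until_valid(
#     "Prediction horizon in minutes",
#     validate=lambda v: v.isdigit() and int(v) > 0,
#     error_msg="The horizon must be a positive integer.",
# )
```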

Blood Glucose Prediction - Research Platform

Tasks remaining to finish the tool:

We will build heavily on the benchmark paper for this tool, to test it on real-life scenarios.

Preparations

  • Create a research branch to use as the base branch for this work
  • Remove the irrelevant code (models, metrics, documentation, example files)

Parsers
Summary: All the parsers should do the same thing: return a "raw dataframe" with all of the datatypes on the same time grid, but without any fuss (see the sketch after this list).

  • Rewrite tidepool parser to do this
  • Rewrite nightscout parser to do this
  • Create an apple health parser since it has heartrate as input
  • Create an Oura Ring parser?
  • On call, save the dataframe in the data/raw directory (title metadata: start and end date, parser used, username?)
  • Start making the CLI: Add an option to parse data
  • Document: Nightscout cannot provide workouts, tidepool can; document which datatypes come from the apple parser
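A minimal sketch of that shared contract, assuming each datatype arrives as a pandas series with a DatetimeIndex (column names and resampling rules are illustrative):

```python
import pandas as pd

def to_raw_dataframe(cgm, bolus, basal, carbs, freq="5min"):
    """Merge per-datatype series onto one uniform time grid and return
    the 'raw dataframe' that every parser is expected to produce."""
    return pd.DataFrame({
        "CGM": cgm.resample(freq).mean(),      # average glucose per interval
        "bolus": bolus.resample(freq).sum(),   # total insulin delivered
        "basal": basal.resample(freq).mean(),  # mean basal rate
        "carbs": carbs.resample(freq).sum(),   # total carbohydrates
    })
```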

Preprocessors

  • Define all prediction horizons --> add target columns (see the sketch after this list)
  • Define how much history to use
  • ALWAYS IN MG/DL!
  • Save the df with a very descriptive name
  • Test, train and val split
  • Test: Does it work with both NS and tidepool?
  • Update documentation (README, CLI comments)
  • Add preprocessors for more (all?) models in benchmarking paper
  • Add what-if events in the preprocessor(s)?
  • Right now, preprocessors assume the numerical features to be cgm, insulin and carbs. This should be dynamic from user input (like in model training).
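A minimal sketch of the target-column step, assuming 5-minute CGM intervals and the "target_{prediction_horizon}" naming convention proposed elsewhere in this tracker (the helper is hypothetical):

```python
def add_targets(df, prediction_horizons, freq_minutes=5):
    """Add one shifted target column per prediction horizon (in minutes).

    `df` is assumed to be a pandas dataframe on a uniform time grid,
    with glucose already converted to mg/dL.
    """
    for ph in prediction_horizons:
        df[f"target_{ph}"] = df["CGM"].shift(-(ph // freq_minutes))
    return df
```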

Models

  • Implement some sample prediction models. Base the implementation on the benchmark paper
  • Add as input the numerical and the categorical features
  • Make sure that all the examples are working
  • Add CLI and have proper commenting
  • Add models from benchmarking paper:
    • scikit models
      • ARX (LinearRegressor)
      • Elastic net
      • Huber
      • Lasso
      • Random forest
      • Ridge
      • SVR, linear kernel
      • SVR, radial basis kernel
    • XGBoost
      • Gradient boosting trees
    • Keras
      • LSTM
      • TCN
  • Add documentation in readme

Evaluation

  • Make sure the conventional ones are all implemented.
    • ESOD
    • TG
  • Handle comparisons and single ones. Store results in files!
  • Plots: Handle comparisons

Real-time plots:

  • Add plots that predict one specific trajectory, either in real time or for a given date (with "true" measurements alongside).
  • Add an interactive plot?

Settings

  • Use global settings for unit conversion
  • Implement this as a boolean input in all relevant files, fetching it from the config in the main script.

Command line interface

  • Main menu: Settings or model
  • Choose either preprocessing or existing dataset
  • Choose either model training or pretrained model
  • Choose evaluation metrics

Documentation

  • Make it really clear in the beginning what this repository is, and shortly explain how it works
  • Make it clear how to start the program
  • Make it clear how to get the raw data: either by bringing your own dataset in a specific format, or by using one of the parsers
  • Document datatypes
  • Document the flow of the predictions
  • Document the CLI
  • Document file structure and base models. Make clear how people can add their:
    • models
    • evaluation metrics
    • input values without using parsers
    • pretrained models
  • Document evaluation metrics
  • Document plots
  • Document assumptions:
    • 5-minute intervals of CGM inputs

Code improvements (feedback from chatgpt):

  • Add default values to config-file
  • Dynamic choices for click-options (listing alternatives from modules)
  • Check for potential exceptions (input verification), and write clear error messages
  • Output verbose information: adding --verbose flag to describe cli commands
  • Add helper methods in a separate file to keep the CLI file clean
  • Add unit testing
  • Consistency: When using models/metrics etc, make sure you use the same type of CLI-argument
  • Duplicate code: Find a way to fix that. It is in processors, model training ...

Optimize performance (add vectorization instead of loops)

The scripts run very slowly, and some optimisation should be performed in:

  • penalty_math.py
  • loop_model_scoring/tidepool_parser.py
  • Might be necessary to also do some adjustments in the pyloopkit fork

Before we start on this we should measure which processes are the most time-consuming.

Loop Algorithm

Keep working on the loop algorithm branch:

  • Add tests to make sure the predictions are correct
  • What is the RMSE? How do the trajectories and the scatter plots look? And other metrics?

Model training with trajectories

Update the repo to use whole trajectories in model training.

How it works now:
Data goes through a preprocessor that adds one target column (given by the prediction horizon). Then, for each prediction horizon, one model is trained. Models assume that there is a target column named "target", and use this naming convention to "find" the target.

My thoughts about the refactor:

  • In the preprocessor, all the target columns defined in the configuration should be added
  • I suggest that target columns are named for example "target_{prediction_horizon}", and that the naming convention can be used to "find" the prediction horizons when necessary
  • Models must be refactored so that each model includes all the prediction horizons, instead of how it is now, where each prediction horizon has its separate model. I imagine the solution (if the model does not inherently support multiple outputs) is that the model class stores a dictionary with the prediction horizon as key and the model as value (see the sketch after this list)
  • When using the "predict" method, we need to decide on a standardized format of the output - either an array/dict of the predictions for each prediction horizon, or using the prediction horizon as an input in the predict method, or one function for each of the options.
  • The CLI for calculate_metrics and plots will need some adjustments to work with the refactor
  • Update the README as needed
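A minimal sketch of the dictionary-based wrapper described above, using the "target_{prediction_horizon}" convention; `make_estimator` is a hypothetical factory, and the dict-of-predictions output is one of the standardization options mentioned:

```python
class MultiHorizonModel:
    """One model object covering all prediction horizons, keeping one
    fitted estimator per horizon internally."""

    def __init__(self, prediction_horizons, make_estimator):
        self.models = {ph: make_estimator() for ph in prediction_horizons}

    def fit(self, x_train, df_targets):
        # Targets are located via the "target_{ph}" naming convention.
        for ph, model in self.models.items():
            model.fit(x_train, df_targets[f"target_{ph}"])
        return self

    def predict(self, x):
        # Standardized output: a dict of predictions keyed by horizon.
        return {ph: model.predict(x) for ph, model in self.models.items()}
```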

Need install script

The install steps are outlined in clear enough terms in the readme - no reason the user should have to do each thing independently. Let's make a manual install.sh script (unsure if this is the best name, but it will be a bash script) that runs everything at once and/or in chunks as needed by setup.py. This will also be useful for organizational purposes when we make the whole thing installable via pip, which it looks like you've started on (since you have setup.py).

Model Evaluation

GluPredKit refactor for enhanced and strict model evaluation:

Refactor:

  • All models should be multiple output
    • Create multioutput models
      • Scikit models
      • Neural networks
    • Create multioutput preprocessor
    • Update all CLI commands to support multiple outputs
    • Delete outdated files
  • What-if events should be default
  • Fix plug-and-play error: ohio should also be usable with the normal preprocessor
  • Update README to reflect the refactoring

Configuration:

  • Number of time lagged events should be specified for each feature (list instead of integer)
  • Configuration has one prediction horizon
  • Test size should be moved from the configuration to the parsers, as a CLI command option instead. That way, we can make preprocessors properly plug-and-play
  • Maybe the type of numerical transformation should be an input to the configuration instead of creating a new preprocessor every time you want to use something different

Generate pdf reports for model evaluation:

  • Create CLI command to generate a report
  • Add a text at top or bottom that refers to GluPredKit ("This report is generated using GluPredKit...")
  • Add dataset information
  • Add configuration information
  • Add traditional evaluation metrics
  • Add run time evaluation
  • Insulin and carbohydrate effect evaluation: Use PCP or the other number (S..). Idea for a Partial Dependence Plot: the x-axis is the prediction horizon, the y-axis is the predicted blood glucose, and then we can have many lines with labels like (UI = 1, UI = 2...).
  • Generate a report that compares models - keep in mind that datasets will usually contain many subjects.

Models:

  • Delete models that don't fit our requirements? Or define the included models based on an inclusion criterion - maybe models from the literature that have included their source code, for example? Use Doyoung's work as help?
  • UVA/Padova model
  • T1D loop model
  • Tidepool Loop model

Parsers:

  • Parsers should process data with the new set of features that we agree upon with Sam
  • Add OpenAPS as well
  • Should assume that all the dataset subjects are in the dataset initially
  • Parsers should already have separated train and test data to avoid data leakage
    • Remove test data fraction from configuration
    • Add it to parsing CLI command instead?
  • Make sure that all parsers use the same unit for basal rates

Metrics:

  • Add MARD number

Testing:

  • Make sure that all components have tests and that running the tests is working

General improvements:

  • Be clearer about the expected input and output formats for each module
  • Add citation information in README

Checklist before merge

  • Clean up the code properly
  • Make sure that all the CLI commands are working on both Win and Mac
  • Run tests
  • Update README to reflect all of the changes

Loop parameter optimization

Jupyter notebook (?):

  • Plot in 3d penalty for carb ratios and linear combination of basal and ISF
  • Given an optimal carb ratio from the last plot, plot in 3D penalty for basal and ISF
  • Use test and training data

Further work:

  • Explore for time of day

Rename repository

The repository should have a new name - more descriptive of its purpose, which is to evaluate any kind of blood glucose prediction based on any kind of penalty function. Suggestions:

  • bg-forecast-evaluator
  • GlucoPredEvalKit

lstm pytorch output

The output of the forward method in the PyTorch LSTM should not be squeezed; squeezing leads to this error: UserWarning: Using a target size (torch.Size([1])) that is different to the input size (torch.Size([])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size. See the sketch below.
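A minimal sketch of the fix, assuming a standard nn.LSTM followed by a linear head (layer names and sizes are illustrative); the point is to return the (batch, 1) output without squeezing it:

```python
import torch.nn as nn

class LSTMRegressor(nn.Module):
    def __init__(self, input_size, hidden_size=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)         # (batch, seq_len, hidden)
        y = self.head(out[:, -1, :])  # (batch, 1)
        # Do NOT return y.squeeze(): with batch size 1 it collapses the
        # tensor to shape [] while the target has shape [1], which triggers
        # the broadcasting warning quoted above.
        return y
```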

Interactive plots

Add a possibility in the scatter plot to allow/disallow certain models so you can interactively choose which models to compare.

Documentation: Defining prerequisites and Video Demonstration

To enhance usability even further, do the following:

  • Mention Python (with versions) and pip as prerequisites in the documentation
  • Post a video where you demonstrate the usage of the system
    • This video should have a clear division between the installation, setup, and features in the system
      • Parsers, try each one!
    • List the times for each section so viewers can easily choose which part to watch

Fix loop prediction error

The first step to making sure that we are using pyloopkit correctly and fetching correct predictions is to run example_nightscout.py and verify against our Loop app in real time. Currently the predictions are slightly off.

Possible solutions:

  • When fetching data from nightscout, scheduled basal rate deliveries are not written as a treatment, so we have to manually add those samples in the dataframe. This is a TO DO!
  • There might be a time offset problem, because Python does in some situations convert automatically to your local timezone
  • I see in the plot that the trajectories seem to start from slightly different times. Maybe some are using glucose as the reference, while others use "now-time"
  • Basal rates in U and not U/hr? I am confident that the units written in the repo are correct, but maybe pyloopkit is expecting U.
  • Are basal rates given as strings instead of numbers to pyloopkit?
  • Have I included enough hours of data for retrospective correction?
  • The newest basal rates are not registered in the data, because new basal rates are only written when changed (after checking, this did not make a difference on predictions and hence is not a problem (when adding scheduled). However, if the current one is a temp basal, it might make a difference.)
  • When we compare data between nightscout and tidepool we see that the basal data is slightly different: tidepool has derived the basal rate from the delivered units and the duration of the sample, while nightscout has directly used the "programmed basal" feature, which is not perfectly correlated with delivered basal in units, probably because of the basal delivery algorithm of the hardware. Which one does pyloopkit use for predictions? Programmed basal or delivered basal?

Trajectories Plot Enhancements

Currently, the trajectories plot cuts off all predictions that don't have corresponding measurements along the whole trajectory. Should this be changed?

Choose between Prompts and cli commands "generate_config"

Currently in the generate_config command, it is only possible to use the prompts to generate configurations. There should also be an option to generate configs programmatically.

  • Add options to generate configs programmatically
  • Update README accordingly
  • How to handle default values in that case? Is it not possible?

Refactoring - Adding configuration to the model training stage

Currently there is a configuration for the data processing. This configuration may be used with any model.

However, models might also have many different configurations, like hyperparameters or therapy settings. These can be cumbersome to define in a CLI, and should probably be added as configuration files as well.

This means a refactor to:

  • Update the CLI generate_folders to generate the necessary folders
  • Update the CLI to generate model configurations (?)
  • Update the CLI for model training to use the optional, defined configuration
  • Update figures and documentation to explain it well

Small fixes

Handle units:

  • Parsers should always return data in mg/dL
  • Plots, metrics etc. can have a flag for mmol/L (see the sketch below)
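A minimal sketch of that unit handling, assuming the standard glucose conversion factor (the helper name is hypothetical):

```python
MGDL_PER_MMOLL = 18.018  # 1 mmol/L of glucose is about 18.018 mg/dL

def maybe_to_mmoll(values_mgdl, use_mmol_l=False):
    """Parsers always return mg/dL; plots and metrics convert on demand."""
    if use_mmol_l:
        return [v / MGDL_PER_MMOLL for v in values_mgdl]
    return values_mgdl
```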

y_pred, y_true consistency in models, metrics etc.:

  • Fix the order and naming convention of y_pred and y_true so they are consistent

rename repository:

  • Imagine something you would like to type: without dashes, with associations to bg prediction/forecasting/evaluation, cute

Do we need all these inference dependencies?

  • I see that the install script downloads tensorflow, keras, xgboost, and probably more.
  • I imagine these are probably used in some of the examples, but are they strictly necessary? If the idea is for our library to be agnostic to what tools the user prefers, it should probably be more lightweight and not require you to download 250 MB of code/data you do not need.
  • Real-life use-case example: I like PyTorch for DNNs, not tensorflow. If I were to deploy this, the build script is going to be slowed down significantly by the tensorflow requirement even though I'm not going to use it.

"What-if" events

Adjusting GluPredKit to pave the way for making "what-if"-events the common choice in model training:

  • Adjust the preprocessors with the option to include the what-if-events
  • Adjust the generate_configuration in CLI:
    • Add a "use_what_if" flag (or something like that)
  • Update the README where relevant
  • Test that it works correctly

Add a preprocessor

Create a helper class for preprocessing:

  • Merge data into one dataframe in the same time grid
  • Lookback/time lagged features
  • Handle missing data
  • Imputation
  • Feature addition (see the sketch after this list)
    • Delta glucose
    • Time of day
    • Time of week
    • Time of month
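A minimal sketch of the lookback and feature-addition steps, assuming a pandas dataframe with a DatetimeIndex (column names and the number of lags are illustrative):

```python
import numpy as np

def add_features(df, num_lagged=12):
    """Add time-lagged CGM features, delta glucose, and cyclical
    time-of-day encodings to a uniformly gridded dataframe."""
    for i in range(1, num_lagged + 1):
        df[f"CGM_lag_{i}"] = df["CGM"].shift(i)
    df["delta_CGM"] = df["CGM"].diff()
    hours = df.index.hour + df.index.minute / 60
    df["hour_sin"] = np.sin(2 * np.pi * hours / 24)
    df["hour_cos"] = np.cos(2 * np.pi * hours / 24)
    df["weekday"] = df.index.dayofweek
    return df
```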

Need to finish making the whole thing installable from scratch with pip

Will make adoption WAY easier if in the readme we can just say

installation

pip install glupredkit or pip install git+git://github.com/miriamkw/GluPredKit

This involves a few steps

  • Make sure setup.py includes all aspects of installation and that it works
  • create a pyproject.toml as per https://packaging.python.org/tutorials/packaging-projects/ that includes authors and admin stuff
  • (optional) create an account on https://pypi.org, give access to whoever is an admin for project
  • (optional) upload the code to pypi.org (this can be done in an automated way via GitHub actions, or manually from the command line)

A basic tutorial covering these steps: https://packaging.python.org/tutorials/packaging-projects/

We should make sure it works via pip install glupredkit as well as pip install git+git://github.com/miriamkw/GluPredKit, probably with initial emphasis on the second one, since that doesn't require any uploads to pypi. People will use this both to gain access to the framework within python and from the command line. So python -m glupredkit <command> will become glupredkit <command> (see the setup.py sketch below).
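A minimal setup.py sketch for that console entry point, assuming the click group is named `cli` and lives in glupredkit/cli.py (the version and dependency list are illustrative):

```python
from setuptools import setup, find_packages

setup(
    name="glupredkit",
    version="0.1.0",
    packages=find_packages(),
    install_requires=["click", "pandas"],  # illustrative, not the full list
    entry_points={
        # Makes `glupredkit <command>` equivalent to
        # `python -m glupredkit.cli <command>`.
        "console_scripts": ["glupredkit=glupredkit.cli:cli"],
    },
)
```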

Ohio T1DM Parser

Ohio T1DM fixes:

  • The parser currently does not work with the 2020 version (because of empty heart rate dataframes)
  • Add a CLI alternative to parse all the datasets instead of one by one. There should also be a more intuitive default output name.
  • Add more datatypes to the final processed dataset
  • Update README accordingly
  • Clean up/polish the code

Add "helpers"

Helper methods/classes to process data:

  • Data wrangler
    • Merging data into the same time grid
    • Removing duplicates
    • Handling missing data (NOT REMOVING NANS! We have ALL the time grids, so after adding lookback etc. we can remove NaNs)
  • Data augmenter
    • CGM delta
    • Basal, bolus and both combined
    • Adding lookbacks
    • Adding future (known) inputs
    • Adding targets
    • Deleting nans
  • Scikit and pytorch dataset generator
    • Scaling: categorical, numerical, controls, not controls
    • Which inputs to use
    • Hyperparameter tuning
    • scaling
    • pipeline
    • ..?

Refactoring

Evaluator:

  • Create a base class with the mandatory functions and properties
  • Implement a class for each "penalty function" or evaluator
  • Be thoughtful about naming conventions
  • Go through the literature and look for different metrics
  • Document the different metrics with explanations of what they capture and references to literature if relevant. You could also mention some weaknesses to the specific metric.
  • Support y_pred and y_target being matrices with different prediction horizons, since some metrics depend on that (see the sketch after this list).
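A minimal sketch of what the evaluator base class could look like, with RMSE as an example implementation (names and interface are illustrative):

```python
from abc import ABC, abstractmethod
import numpy as np

class BaseMetric(ABC):
    """Base class for evaluators; subclasses implement __call__."""

    name = "base"

    @abstractmethod
    def __call__(self, y_true, y_pred):
        """y_true and y_pred may be matrices of shape (n_samples, n_horizons)."""

class RMSE(BaseMetric):
    name = "rmse"

    def __call__(self, y_true, y_pred):
        diff = np.asarray(y_true) - np.asarray(y_pred)
        return float(np.sqrt(np.mean(diff ** 2)))
```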

Prediction models:

  • Create a base class with the mandatory functions and properties
  • Implement a class for each prediction approach.
    • Highest priority is the loop algorithm.
    • Linear regression
  • Be thoughtful about naming conventions
  • Should have a fit() method and a predict() method. The output format should be defined somehow. Take inspiration from scikit-learn (see the sketch after this list).
  • Document the different examples and give instructions for how people can implement their own algorithms. Write down some disclaimers about what the user is expected to handle (like information leakage between train and test data).
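A minimal sketch of the model base class under those conventions (hedged; the exact interface is up for discussion):

```python
from abc import ABC, abstractmethod

class BaseModel(ABC):
    """Base class that every prediction approach subclasses."""

    @abstractmethod
    def fit(self, x_train, y_train):
        """Train the model and return self so calls can be chained."""

    @abstractmethod
    def predict(self, x_test):
        """Return predictions in the standardized output format."""
```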

tidepool_parser.py:

  • Ensure tidepool_parser.py and other code have no unused code, and that the code is written efficiently
  • Move it into a "parsers" folder, as we might later add a nightscout_parser or tandem_parser etc.
  • Any unused code that can be removed?
  • This file should not handle prediction using pyloopkit. It should solely process tidepool data into the format that we feed to the model base class
  • Think thoroughly about what the data input format should be. JSON? Dataframes? Which metadata will be included?
  • Run a profiler, improve efficiency of the file
  • Document the format of the output data from the tidepool_parser (or any other potential data source), which will be the input data to each model.

Example scripts:

  • Make example scripts work with the refactored code/create new examples

Tests:

  • Write tests for all metrics
  • Write tests for all prediction models
  • Add tests to the test_all.py for all implemented metrics and models
  • Write in the documentation how to run the tests, and why they are a good way for people to check whether they implemented things correctly

Cleanup:

  • Delete outdated files and folders

Plots (low priority):

  • Make a module for plotting results, for example into a SEG

y_pred y_true consistency

In models, metrics etc:

  • Fix the order and naming convention of y_pred and y_true so they are consistent
  • All models should have the same output: a list of trajectories
  • Metrics expect the list of trajectories, but output_offset can be defined

Bugfix - LSTM shape error

When I run calculate_metrics on the config 1 LSTM models, I get a shape error (Input 0 of layer "model" is incompatible with the layer: expected shape=(None, 24, 27), found shape=(None, 12, 1)).

Add nightscout_parser

Add a nightscout parser so that we can check predictions against Loop in real time.

  • Specify start and end dates
  • Verify that the results are equal to those from the tidepool parser

Milestones Week 43

Refactoring:

  • Move the hour feature from the preprocessors to the parsers
  • Move library-specific preprocessing from the preprocessors to model training (create helpers), and refactor the preprocessors accordingly
  • Add config files defining one preprocessor, prediction horizons (list), num lagged features, num features, cat features, test size
  • Add the training flow using config file
  • Add the evaluation metrics using config file
  • Move error grids to evaluation metrics, but one metric can be a list
  • Add plots using config file
  • Use model names with config-file name and prediction horizon
  • Add a config_manager for the config files so that you don't need to use ['name'] lookups, which are prone to errors (see the sketch after this list)
  • Add a helper cli in helpers, and add methods like "getModule" for models etc. to reduce cli.py
  • Update all data-folders to reflect the new functionality (remove "processed", add "configurations")
  • Remove all outdated code
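A minimal sketch of the config_manager idea (class and attribute names are illustrative):

```python
class ConfigManager:
    """Expose config entries as attributes so call sites avoid
    error-prone string keys like config['num_lagged_features']."""

    def __init__(self, config: dict):
        self._config = config

    def __getattr__(self, key):
        try:
            return self._config[key]
        except KeyError as e:
            raise AttributeError(f"Missing config entry: {key}") from e

# Hypothetical usage: config.num_lagged_features
# instead of config['num_lagged_features']
```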

Add real-time predictions (after config, so you dont need to refactor):

  • Simply add a new plot that plots trajectories. This will be in real time if the input data is up to date
  • Plot trajectory(/ies)
  • Define either the date "now" or a single point in time
  • Use names

Update CLI:

  • Create config
  • Use config instead of preprocessor to train models (and remove preprocess)
  • Evaluate models should use config as well
  • Separate commands: "draw_plots", "calculate_metrics"
  • Real-time predictions

Documentation:

  • Explain the difference between "glupredkit" and "python -m glupredkit.cli"
  • Contributing with code:
    • Define output format of parser
    • The process of time-lagged features, because it is often model/library-specific.
  • All updates to CLI must be documented
  • Go specifically into real-time plots description
  • Create and add figures:
    • Class diagram for "Contributing with code"-section
    • Process diagram -- what happens inside each command?
    • Update the diagram that is on the front page to reflect changes

Finish:

  • After rigorous testing of all functionality and a usability test
  • Upload again to PyPI
  • Merge into main

Create data directory dynamically

As discussed, create the data directory as needed in the directory where the CLI is being installed/used; see the sketch after the list below.

  1. Only create it when it is required and an existing one does not exist.
  2. Depending on how often the user manipulates the contents directly, consider making it a dot directory so it is hidden and doesn't clutter the user's file system (like the similar .git directory)
  3. Print out what is going on as it happens (optionally using input() to confirm). The user needs to know that this folder is being created when it happens, and exactly where they can find it.
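A minimal sketch of points 1 and 3, using os.makedirs (the path is illustrative):

```python
import os

def ensure_data_dir(path=".glupredkit/data"):
    """Create the data directory on first use, telling the user where it lives."""
    if not os.path.exists(path):
        print(f"Creating data directory at {os.path.abspath(path)}")
        os.makedirs(path, exist_ok=True)
    return path
```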

Penalty between two dates

Finish current script:

  • Calculate penalty for each prediction pair in trajectory
  • Plot results for one prediction pair
  • Plot results for trajectory

Create a script that:

  • Calculates the penalty between two dates given an algorithm
  • Implement more algorithms, including RMSE, MAE and ME
  • Plot the penalty for each measurement

Before merge:

  • Update documentation (explain example scripts)
  • Go over code and remove printing and outdated comments
