fhir-sandbox

This repository contains tools for building a prepopulated FHIR R4 server, intended for the construction of sandbox environments for Electronic Health Records (EHR). fhir-sandbox creates a FHIR server inside a Docker container. The FHIR server contains Patients and Observations from a user-defined data source. Currently, a handful of observation types are supported, such as heart rate and some mechanical ventilation parameters. The full list of supported observation types can be found in observation_types.json.

Initial Setup

  1. Ensure that Docker and Git are installed.
  2. Clone this project.
    git clone git@github.com:KitwareMedical/fhir-sandbox.git
  3. Install the required packages (Python version >= 3.9 required):
    pip install -r requirements.txt
  4. Download the empty HAPI R4 container image published by the SMART on FHIR team.
    docker pull smartonfhir/hapi-5:r4-empty
  5. Run the docker container.
    docker run -dp 3000:8080 smartonfhir/hapi-5:r4-empty
    The port "3000" may be replaced by your choice of port; just replace appearances of "3000" by your choice in the rest of the instructions.
  6. Verify that the server is working by visiting http://localhost:3000/hapi-fhir-jpaserver/fhir/Patient. This should display some json.
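The check in step 6 can also be scripted. The following is a minimal sketch, assuming the server answers a Patient search with a FHIR Bundle in JSON; the function names are hypothetical and not part of fhir-sandbox.

```python
import json
import urllib.request


def patient_search_url(base_url: str) -> str:
    """Build the Patient search URL from a FHIR base URL."""
    return base_url.rstrip("/") + "/Patient"


def summarize_bundle(bundle: dict) -> tuple:
    """Return (resourceType, number of entries on this page) for a search response."""
    return bundle.get("resourceType", ""), len(bundle.get("entry", []))


def check_server(base_url: str, timeout: float = 10.0) -> tuple:
    """Query the Patient endpoint and summarize the JSON it returns."""
    request = urllib.request.Request(
        patient_search_url(base_url),
        headers={"Accept": "application/fhir+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return summarize_bundle(json.load(response))
```

Calling check_server("http://localhost:3000/hapi-fhir-jpaserver/fhir/") against a running container should return a tuple whose first element is "Bundle". On a freshly started empty server the entry count is expected to be 0.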

Using fhir-sandbox

With the initial setup finished, there are two ways to use fhir-sandbox:

  1. Use one of the existing data sources, such as the random data generator used for testing, or the data source that takes NICU data from a downloaded MIMIC-III dataset.
  2. Create your own PatientDataSource to populate custom patient and observation data into a FHIR server.

Random data

This approach can be used to test basic functionality by populating a FHIR server with random data:

python populate_fhir_server.py --json_file ./data_sources/random.json \
    --fhir_server http://localhost:3000/hapi-fhir-jpaserver/fhir/

The amount of data to generate can be configured in data_sources/random.json.

MIMIC-III

This approach uses a downloaded MIMIC-III dataset to populate a FHIR server with NICU patients and ventilator observations. Note that the MIMIC-III dataset requires credentialed access on PhysioNet.

  1. Get access to and download MIMIC-III.
  2. Configure data_sources/mimic3.json to point to the location of your downloaded data and the MIMIC-III schema files:
    "args":
    {
        "mimic3_data_dir": "path/to/mimic3/data/dir",
        "mimic3_schemas_dir": "path/to/mimic3/schema/dir"
    }
    The schema files are provided in this repository here.
  3. Run populate_fhir_server.py:
    python populate_fhir_server.py --json_file ./data_sources/mimic3.json \
        --fhir_server http://localhost:3000/hapi-fhir-jpaserver/fhir/
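Before running step 3, it can help to confirm that both configured directories exist. This is a small sketch, assuming "args" is a top-level key of data_sources/mimic3.json; the helper names are hypothetical and the rest of the file's layout is not assumed.

```python
import json
from pathlib import Path

# The two keys shown in the configuration snippet above.
REQUIRED_DIR_KEYS = ("mimic3_data_dir", "mimic3_schemas_dir")


def missing_mimic3_dirs(config: dict) -> list:
    """Return the "args" keys that are absent or do not point to an existing directory."""
    args = config.get("args", {})  # assumption: "args" sits at the top level
    problems = []
    for key in REQUIRED_DIR_KEYS:
        value = args.get(key)
        if not value or not Path(value).is_dir():
            problems.append(key)
    return problems


def check_config_file(path: str) -> list:
    """Load a JSON config file and report problematic directory settings."""
    with open(path, encoding="utf-8") as f:
        return missing_mimic3_dirs(json.load(f))
```

An empty list from check_config_file("./data_sources/mimic3.json") means both directories were found; otherwise the list names the settings to fix.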

Custom data source

To populate the FHIR server with custom data, a bit of Python is needed. The procedure is to subclass the PatientDataSource, Patient, and Observation classes in data_sources/patient_data_source.py, specifying how the patient and observation data should be created.

A more in-depth explanation with examples can be found here.
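To give a feel for the subclassing pattern, here is a self-contained sketch. The three base classes below are hypothetical stand-ins defined only for this example; the real classes in data_sources/patient_data_source.py may use different method names and signatures, so treat every method name here as an assumption. The observation type name "HEART_RATE" is likewise illustrative.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List, Tuple


# Hypothetical stand-ins, NOT the real fhir-sandbox base classes.
class Observation(ABC):
    @abstractmethod
    def get_observation_type(self) -> str: ...

    @abstractmethod
    def get_value(self) -> float: ...

    @abstractmethod
    def get_time(self) -> str: ...


class Patient(ABC):
    @abstractmethod
    def get_identifier(self) -> str: ...


class PatientDataSource(ABC):
    @abstractmethod
    def get_all_patients(self) -> Iterable[Patient]: ...

    @abstractmethod
    def get_patient_observations(self, patient: Patient) -> Iterable[Observation]: ...


# A custom data source backed by in-memory rows of
# (patient id, observation type, value, ISO 8601 timestamp).
Row = Tuple[str, str, float, str]


class RowObservation(Observation):
    def __init__(self, observation_type: str, value: float, time: str):
        self._type, self._value, self._time = observation_type, value, time

    def get_observation_type(self) -> str:
        return self._type

    def get_value(self) -> float:
        return self._value

    def get_time(self) -> str:
        return self._time


class RowPatient(Patient):
    def __init__(self, identifier: str):
        self._identifier = identifier

    def get_identifier(self) -> str:
        return self._identifier


class InMemoryDataSource(PatientDataSource):
    def __init__(self, rows: Iterable[Row]):
        self._rows: List[Row] = list(rows)

    def get_all_patients(self) -> List[RowPatient]:
        # dict.fromkeys removes duplicate ids while keeping first-seen order.
        ids = dict.fromkeys(row[0] for row in self._rows)
        return [RowPatient(pid) for pid in ids]

    def get_patient_observations(self, patient: Patient) -> List[RowObservation]:
        pid = patient.get_identifier()
        return [RowObservation(t, v, when) for p, t, v, when in self._rows if p == pid]
```

The design point is that the data source decides how patients and observations are produced, while the code that uploads them only needs the abstract interface.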

Adding Custom Observation Types

In cases where the custom data source has additional observation types not supported in observation_types.json, it is possible to add new observation types by simply adding an entry to observation_types.json:

"ObservationTypeName":
{
    "display_string": "...",
    "unit_code": "...",
    "loinc_code": "..."
}
  • The display_string is a human-readable description of the observation type, e.g. "heart rate".
  • The unit_code describes the observation's unit of measure in the format of the UCUM system.
  • The loinc_code is a LOINC code identifying the observation type. You can search for LOINC codes here.
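The three fields correspond naturally to parts of a FHIR R4 Observation resource. The sketch below shows one plausible mapping; it is an illustration under that assumption and not necessarily how fhir-sandbox builds its resources. The function name and the example patient id are hypothetical.

```python
REQUIRED_KEYS = ("display_string", "unit_code", "loinc_code")


def build_observation(entry: dict, value: float, patient_id: str, effective: str) -> dict:
    """Build a FHIR R4 Observation (as a dict) from an observation type entry."""
    missing = [key for key in REQUIRED_KEYS if key not in entry]
    if missing:
        raise ValueError(f"observation type entry is missing keys: {missing}")
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            # loinc_code identifies what was measured.
            "coding": [{
                "system": "http://loinc.org",
                "code": entry["loinc_code"],
                "display": entry["display_string"],
            }],
            "text": entry["display_string"],
        },
        "subject": {"reference": f"Patient/{patient_id}"},
        "effectiveDateTime": effective,
        "valueQuantity": {
            "value": value,
            "unit": entry["unit_code"],
            # unit_code is a UCUM code, so the UCUM system URI applies.
            "system": "http://unitsofmeasure.org",
            "code": entry["unit_code"],
        },
    }
```

For example, a heart rate entry with "unit_code": "/min" and "loinc_code": "8867-4" yields an Observation whose code carries the LOINC coding and whose valueQuantity carries the UCUM unit.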

Limitations

This tool works with only two of the many types of FHIR resources: Patients and Observations. Extending it to other types of FHIR resources requires a more involved development effort, as well as getting friendly with the FHIR documentation.

References

Johnson, A., Pollard, T., & Mark, R. (2016). MIMIC-III Clinical Database (version 1.4). PhysioNet. https://doi.org/10.13026/C2XW26.

Acknowledgements

This work was supported by the National Institutes of Health under Award Number R42HL145669. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

fhir-sandbox's Issues

streamline adding new observation types

Need to think about how to approach this.

One thing I am noticing is the way information is grouped in UNIT_CODES, LOINC_CODES, and DISPLAY_STRINGS, for example. Right now if I wanted to add a new observation type I'd have to go touch each of those and add my entry. It would be better if things were grouped such that the changes I need to make are more localized.

Even better if we can come up with a way to add custom observation types without touching the main code.
Similarly to #6, it would be ideal if users can use our tool directly rather than having to fork it if they want to add their own observation type. We need to think about how we can design it this way.

add synthetic data source options

from @andinet :

Synthea is a synthetic EHR data generator

https://github.com/synthetichealth/synthea

It can export data in HL7 FHIR format.

Here is the original paper that talks about the simulator
https://academic.oup.com/jamia/article/25/3/230/4098271


more comments by me:

Synthea does not create a FHIR server for you, but rather can write JSON to your disk that looks exactly like what you'd get if you looked up patient data that was served via the FHIR protocol (see here). It's as though you had a FHIR server with the data in it, queried the server to get some JSON representations of the FHIR resources, and then saved that JSON text to disk. So the tool we are creating still has value added beyond what Synthea does, because our tool sets up a containerized server for you. However, our tool is less ambitious than Synthea in that we are (currently) only supporting Patients and Observations (compare to the giant list of FHIR resources seen here again).

So in conclusion I do believe it will be valuable, perhaps later, to incorporate Synthea as one form of data generation that our tool can use.

rename project

Rename this project and this repo to something not LungAIR specific.

Purpose of this project is to populate sandbox fhir servers with custom data. Nothing to do with LungAIR.
What is a good name that captures this objective?

add mimic-iv on fhir data source

https://physionet.org/content/mimic-iv-fhir-demo/2.0/

Contains the data from 100 patients taken from MIMIC-IV, but organized in a FHIR-like format.
It's FHIR-like data exported to json files that you can download -- then it'd be up to us to serve the json as if one is reading a real FHIR server.
Would definitely be a good data source option to add to our fhir-sandbox software if we had time/interest later.

improve random data generation

The basic random data option could be made a lot more useful with a pretty small amount of effort:

  • generating data in a sensible way for each observation, sampling normal distributions with sensible means, etc
  • having a random patient class with some randomizable parameters that determine the distribution that will be sampled to generate observations for that patient.
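A rough sketch of the two ideas above, using only the standard library. The class and function names are hypothetical, and the means and spreads are illustrative numbers, not clinical reference ranges.

```python
import random
from dataclasses import dataclass


@dataclass
class RandomPatientProfile:
    """Per-patient parameters that shape the distribution of that patient's observations."""
    heart_rate_mean: float
    heart_rate_sd: float


def random_profile(rng: random.Random) -> RandomPatientProfile:
    # Each patient gets their own baseline and variability (illustrative values).
    return RandomPatientProfile(
        heart_rate_mean=rng.gauss(140.0, 10.0),
        heart_rate_sd=rng.uniform(3.0, 8.0),
    )


def sample_heart_rates(profile: RandomPatientProfile, n: int, rng: random.Random) -> list:
    # Sample a normal distribution around the patient's baseline, clamped at zero.
    return [
        max(0.0, rng.gauss(profile.heart_rate_mean, profile.heart_rate_sd))
        for _ in range(n)
    ]
```

Passing an explicitly seeded random.Random keeps the generated data reproducible, which is useful when the random data source is used for testing.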

do not use dynamic importing for customization

The way we currently import custom data sources is not great.

We should really make fhir-sandbox a proper python package that provides populate_fhir_server as a script, and we should approach the problem of adding a custom data source in a more proper way, treating fhir-sandbox as an extensible API.

One example of an issue this method of customization causes is that the module containing the custom data source cannot do relative imports. There is a fix suggested here but there are so many other issues that crop up that we should really rethink the current design.

In another project we addressed packaging and customization properly, so we can follow that as a model.

improve readme

  • intro blurb of what this is
  • getting, configuring, and running the docker image
  • basic usage (use our premade data sources)
  • custom data source
    • what abstract classes to provide implementation for
    • minimal example
  • custom observation type
    • point to the right place
  • what if user wants other FHIR resources besides patient, observation? just point out that this is a limitation and point to fhir docs

streamline adding new data sources

Messing with our argparse should not be necessary for a user of our API who wants to add a custom data source.
In fact it's best if they don't have to touch populate_fhir_server.py at all.

Options:

  • make it possible to somehow "register" the new data source once it's created, so that it becomes an option when running populate_fhir_server.py.
  • make populate_fhir_server.py take an argument that somehow points to the custom classes

For a custom data source it would be nice to be able to tell users "create your data source definition like this and then feed it into our tool", rather than saying "fork this project and make these changes."
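The "register" option could look something like the following sketch. None of these names exist in fhir-sandbox today; the registry, the decorator, and the example class are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a data source name to its class.
DATA_SOURCES: Dict[str, type] = {}


def register_data_source(name: str) -> Callable[[type], type]:
    """Class decorator that makes a data source selectable by name."""
    def decorator(cls: type) -> type:
        if name in DATA_SOURCES:
            raise ValueError(f"data source {name!r} is already registered")
        DATA_SOURCES[name] = cls
        return cls
    return decorator


def create_data_source(name: str, **kwargs):
    """Instantiate a registered data source, passing configuration as keyword arguments."""
    try:
        cls = DATA_SOURCES[name]
    except KeyError:
        raise ValueError(
            f"unknown data source {name!r}; choices: {sorted(DATA_SOURCES)}"
        ) from None
    return cls(**kwargs)


# A user-defined data source registers itself without touching the main script.
@register_data_source("constant")
class ConstantDataSource:
    def __init__(self, value: float = 1.0):
        self.value = value
```

With a registry like this, a command-line script could offer sorted(DATA_SOURCES) as the valid choices, so adding a data source would not require editing the argument parser.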
