Giter Site home page Giter Site logo

hst953_iomed's Introduction

Workshop - I/O and Medication

Wei-Hung Weng (ckbjimmy [at] mit [dot] edu)
12 October 2018

Objective

  • Know where to find the existing codes for similar problems
  • Understand basic SQL syntax
  • Extract fluid/medication information from both MIMIC/eICU databases

Resource

MIMIC-III

eICU

Basic SQL Syntax

SELECT [columns]
FROM   [table]
WHERE  [condition1 AND (condition2 OR condition3)]
GROUP BY [columns]
ORDER BY [columns] [ASC/DESC]

JOIN

  • LEFT JOIN
  • INNER JOIN
  • RIGHT JOIN
SELECT [columns]
FROM   [table1] t1
LEFT JOIN [table2] t2
ON t1.[column] = t2.[column]
WHERE t1.[condition1]

WITH

The syntax WITH can create a temporary table to break a large SQL query to smaller pieces. Here is a simple case to demonstrate how to use it. You can also create materialized views in your database, which you can literally use those views as if they are real tables.

WITH dates AS (
  SELECT subject_id, admittime, deathtime, deathtime-admittime AS los
  FROM admissions
  WHERE deathtime IS NOT NULL
)
SELECT subject_id, los
FROM dates

Exercise

Preparation

In the workshop, we use the MIMIC and eICU databases located on Google Cloud Platform, and connect to them through the R package bigrquery. For demonstration, we just extract the data from demo dataset (mimiciii_demo and eicu_crd_demo). Please modify the path of dataset in R and SQL scripts into the full dataset (mimiciii_clinical and eicu_crd) for your projects.

# Establishing GCP connection
library(bigrquery)

project_id <- "hst-953-2018"
options(httr_oauth_cache=FALSE)

run_query <- function(query){
  data <- query_exec(query, project=project_id, use_legacy_sql=FALSE)
  return(data)
}

# test and get auth code
run_query(paste0("SELECT count(1) FROM `physionet-data.mimiciii_demo`.`patients`"))

To make our files neat and organized, we put all SQL scripts under the folder ./sql/. We will use the following get_sql function to read the SQL code in the file, and pass it to the execution function run_query later.

get_sql <- function(f){
  con = file(f, "r")
  sql.string <- ""
  while (TRUE) {
    line <- readLines(con, n=1)
    if (length(line) == 0) {
      break
    }
    line <- gsub("\\t", " ", line)
    if (grepl("--", line) == TRUE) {
      line <- paste(sub("--", "/*", line), "*/")
    }
    sql.string <- paste(sql.string, line)
  }
  close(con)
  return(sql.string)
}

If you plan to use Python instead of R, the following chunk can help you do the same task as the above R code. You can also try colab version.

import os
import pandas as pd

from google.colab import auth
from google.cloud import bigquery

auth.authenticate_user()

project_id = 'hst-953-2018'
os.environ["GOOGLE_CLOUD_PROJECT"] = project_id

def run_query(query):
  return pd.io.gbq.read_gbq(query, 
                            project_id=project_id, 
                            verbose=False, 
                            configuration={'query':{'useLegacySql': False}})

def get_sql(f):
  file = open(f, 'r')
  s = " ".join(file.readlines())
  return s

df = run_query("SELECT * FROM `physionet-data.mimiciii_demo`.`patients` limit 10")
df

q = get_sql('./sql/bq-eicu-med.sql')
df = run_query(q)
df

In the following sections, we provide some MIMIC and eICU SQL codes related to our topics, intake/output and medication, which are adopted from MIT-LCP repository but modified for the databases on Google Cloud. For people who want to run the codes on local postgreSQL server, please go back to the GitHub repo and use the original SQL. (This is because that the Google BigQuery and PostgreSQL still have some differences in syntax.)

We thank Dr. Alistair Johnson, Dr. Tom Pollar, and other MIT LCP members for their effort constructing all these code repositories and sharing their queries to us.

Intake/Output

MIMIC-III

  1. Using outputevents table. The first example is to sum up the urine output using the outputevents table. There are two scripts doing the same task with slightly different code writing styles.
q <- get_sql("sql/bq-mimic-uo.sql")
res <- run_query(q)
head(res)

q <- get_sql("sql/bq-mimic-pivoted-uo.sql")
res2 <- run_query(q)
head(res2)
  1. Using chartevents table. The second example for MIMIC-III is to identify the duration of continuous renal replacement therapy (CRRT).
  • concepts/durations/crrt-durations.sql
  • You can use the itemid here to identify the items for hemodialysis. It summarizes all the item ids in chartevents, CareVue and MetaVision versions of coding.
q <- get_sql("sql/bq-mimic-crrt-durations.sql")
res <- run_query(q)
head(res)
  1. Using inputevents_mv and inputevents_cv tables. This example is to identify how many crystalloid bolus was given for each patient.
  • You can use the itemid here to identify the items for crystalloid fluid. Crystalloid fluid is the main component of measuring fluid input.
q <- get_sql("sql/bq-mimic-crystalloid-bolus.sql")
res <- run_query(q)
head(res)

eICU

  1. Using intakeoutput table. The goal of this query is to identify the daily fluid balance (output - input) for each ICU patient.
  • In intakeoutput table, we can simply identify intake/output items using string matching in cellpath column.
  • We thank Dr. Matthieu Komorowski providing the script
q <- get_sql("sql/bq-eicu-io.sql")
res <- run_query(q)
head(res)

Medication

MIMIC-III

  1. Using prescriptions table. In this example, we try to identify whether the antibiotics is used on each patient during their ICU stay (binary). We need to use string matching to identify if your target medications show up in the drug_name_generic column of prescriptions table.
  • You can use the medication list here for antibiotics use
  • You need to have clinicians (physicians or pharmacists) to help you ensure what medications are needed for your study. e.g. %lol% for beta-blockers
q <- get_sql("sql/bq-mimic-med.sql")
res <- run_query(q)
head(res)
  1. Using inputevents_mv and inputevents_cv tables. Please check the concepts/durations/ sql codes for computing dose and duration of infusion drugs. The way to modify the code is similar to the CRRT example above.

eICU

  1. Using medication table. This is almost the same as the query for MIMIC prescriptions table, since they both need to do string matching for identifying medications.
q <- get_sql("sql/bq-eicu-med.sql")
res <- run_query(q)
head(res)
  1. Using medication table but also using HICL codes. drughiclseqno column is considered in this case. In this example, we want to see whether the patient has received some specific infusion medications. It is still similar to the first case, but in slightly different coding style.
q <- get_sql("sql/bq-eicu-pivoted-med.sql")
res <- run_query(q)
head(res)
  1. Using infusiondrug table. This example is to identify the usage of infusion drugs (dopamine, dobutamine, norepinephrine, phenylephrine, vasopressin, milrinone, heparin) for each patient.
  • concepts/pivoted/pivoted-infusion.sql
  • You can use the identified strings of seven infusion drugs in the script as your infusion drug list.
  • You may compare the result of this table and medication table. For the formal analysis, you need to consider the union of the results from two tables.
q <- get_sql("sql/bq-eicu-pivoted-infusion.sql")
res <- run_query(q)
head(res)

hst953_iomed's People

Contributors

ckbjimmy avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.