
rehr's Introduction

rPublicHealth

A collaborative project to build R tools facilitating access to public healthcare data

rehr's People

Contributors

daspringate, evank23, rosap

rehr's Issues

Number of years of history in CPRD

Count the number of years (or length of time) a patient has been in CPRD before entering the cohort (e.g. at the index date or study start);

  • (the length of time should include only the period when the practice is up to standard)

Note: This information is sometimes needed to make sure that incident rather than prevalent patients are captured, e.g. for psoriasis or diabetes.
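
One way this could look in base R is sketched below. It assumes a patient-level data frame with the CPRD-style columns crd (registration date) and uts (practice up-to-standard date); the index_date column, the helper name years_of_history and the toy values are assumptions made for illustration, not part of the package.

```r
# Hypothetical helper: years of up-to-standard history before the index date.
# Assumes `crd`, `uts` and `index_date` are Date columns; history starts at
# the later of registration and the practice up-to-standard date.
years_of_history <- function(patients) {
  start <- pmax(patients$crd, patients$uts)
  days <- as.numeric(difftime(patients$index_date, start, units = "days"))
  pmax(days, 0) / 365.25  # an index date before the start counts as 0 years
}

# Toy data, invented for illustration
patients <- data.frame(
  patid = c(1, 2),
  crd = as.Date(c("2000-01-01", "2005-06-01")),
  uts = as.Date(c("2002-01-01", "2001-01-01")),
  index_date = as.Date(c("2010-01-01", "2005-01-01"))
)
patients$history_years <- years_of_history(patients)
```

A minimum history requirement (say, one year) could then be applied by filtering on history_years.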

Select patients with a specific diagnosis and specific drug/s

  1. Identify patients with a specific medical code and a specific drug (e.g. patients with psoriasis and treatments for psoriasis);
  2. This might include counting patients who start treatment as mono-therapy, dual-therapy, etc.
    • (e.g. patients with diabetes who start treatment as mono-therapy (metformin) or dual-therapy (metformin + sulphonylurea))
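
A rough base R sketch of the idea, working on data frames that have already been extracted. The column drug, the code value 100 and all the toy rows are made up for illustration; a real version would presumably match on prodcode lists instead.

```r
# Toy data; medcode values and drug names are invented for illustration
clinical <- data.frame(patid = c(1, 2, 3), medcode = c(100, 100, 200))
therapy <- data.frame(
  patid = c(1, 1, 2, 3),
  eventdate = as.Date(c("2010-01-05", "2010-01-05", "2011-03-01", "2012-01-01")),
  drug = c("metformin", "sulphonylurea", "metformin", "metformin"),
  stringsAsFactors = FALSE
)
diagnosis_codes <- 100

# Patients with the diagnosis who also have at least one prescription
diagnosed <- unique(clinical$patid[clinical$medcode %in% diagnosis_codes])
treated <- therapy[therapy$patid %in% diagnosed, ]

# Keep each patient's prescriptions on their first treatment date, then
# count distinct drugs on that date to classify the starting regimen
first_date <- ave(as.numeric(treated$eventdate), treated$patid, FUN = min)
at_start <- treated[as.numeric(treated$eventdate) == first_date, ]
n_drugs <- tapply(at_start$drug, at_start$patid, function(d) length(unique(d)))
regimen <- setNames(
  ifelse(as.vector(n_drugs) == 1, "mono-therapy", "dual-therapy or more"),
  names(n_drugs)
)
```

Treating "same first date" as the definition of the starting regimen is a simplification; a grace period after the first prescription may be more realistic.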

Operation: Select patients in window with condition

  1. Select events matching clinical codes
  2. Find first diagnosis for a patient
  3. Restrict to time window
  4. Exclusions:
    • diagnosis made within x months of registration
    • patientids from exclusion criteria
    • sanity criteria: patients whose first diagnosis is after deathdate
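
The steps above can be sketched in base R on in-memory data frames. The column names follow the CPRD conventions used elsewhere in this tracker (patid, eventdate, medcode, crd, deathdate); the toy values, the 6-month rule and the variable names are assumptions for illustration only.

```r
# Toy inputs, invented for illustration
clinical <- data.frame(
  patid = c(1, 1, 2, 3, 4),
  eventdate = as.Date(c("2005-03-01", "2006-01-01", "2004-02-01",
                        "2005-07-01", "2005-05-01")),
  medcode = c(10, 10, 10, 10, 99)
)
patient <- data.frame(
  patid = c(1, 2, 3, 4),
  crd = as.Date(c("2000-01-01", "2000-01-01", "2005-06-01", "2000-01-01")),
  deathdate = as.Date(rep(NA, 4))
)
codes <- 10
window <- as.Date(c("2005-01-01", "2005-12-31"))
excluded_ids <- numeric(0)  # patids from other exclusion criteria
min_months <- 6             # hypothetical "x months since registration"

# 1. events matching the clinical codes
events <- clinical[clinical$medcode %in% codes, ]
# 2. first diagnosis per patient
events <- events[order(events$patid, events$eventdate), ]
first_dx <- events[!duplicated(events$patid), ]
# 3. restrict to the time window
in_window <- first_dx$eventdate >= window[1] & first_dx$eventdate <= window[2]
cohort <- merge(first_dx[in_window, ], patient, by = "patid")
# 4. exclusions
months_registered <- as.numeric(
  difftime(cohort$eventdate, cohort$crd, units = "days")) / 30.44
keep <- months_registered >= min_months &
  !(cohort$patid %in% excluded_ids) &
  (is.na(cohort$deathdate) | cohort$eventdate <= cohort$deathdate)
cohort <- cohort[keep, ]
```

Here patient 2 drops out because the first diagnosis precedes the window, and patient 3 because the diagnosis falls within 6 months of registration.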

Error in translate SQL

Hello,
I am trying to use the package to assign a dummy index date in UK PCD.

When I use the flat files function:
flat_files(db, out_dir = consultation_dir, file_type = "csv")
I get the following error:
Error in translate_sql_(expand_string(where)) :
could not find function "translate_sql_"

Could you please help me solve it?

Is this package still supported?

I'm unable to install it unfortunately. I'm using R 3.1.2, the errors I'm getting are printed below. I wasn't able to find out what the exact problem is, I also shouldn't be having any trouble with rJava package (see one of the errors below) as it works fine for me in other places.
Is this package still supported at all?
Any help would be much appreciated, Thanks!!

> install_github("rOpenHealth/rEHR")
Downloading GitHub repo rOpenHealth/rEHR@master
from URL https://api.github.com/repos/rOpenHealth/rEHR/zipball/master
Installing rEHR
"C:/PROGRA~1/R/R-31~1.2/bin/x64/R" --no-site-file --no-environ --no-save --no-restore --quiet CMD INSTALL
"C:/Users/efratm/AppData/Local/Temp/RtmpUTTolg/devtools1c084fb675b6/rOpenHealth-rEHR-360872d" --library="C:/Users/efratm/Documents/R/win-library/3.1"
--install-tests

* installing *source* package 'rEHR' ...

It is recommended to use 'given' instead of 'middle'.

** R
** data
*** moving datasets to lazyload DB
** inst
** tests
** preparing package for lazy loading

Warning: package 'DBI' was built under R version 3.1.3
Warning: package 'dplyr' was built under R version 3.1.3
Warning: replacing previous import by 'stringr::%>%' when loading 'rEHR'

** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
*** arch - i386
Warning: package 'DBI' was built under R version 3.1.3
Warning: package 'dplyr' was built under R version 3.1.3
Warning: replacing previous import by 'stringr::%>%' when loading 'rEHR'
Error : .onLoad failed in loadNamespace() for 'rJava', details:
call: fun(libname, pkgname)
error: No CurrentVersion entry in Software/JavaSoft registry! Try re-installing Java and make sure R and Java have matching architectures.
Error: loading failed
Execution halted
*** arch - x64
Warning: package 'DBI' was built under R version 3.1.3
Warning: package 'dplyr' was built under R version 3.1.3
Warning: replacing previous import by 'stringr::%>%' when loading 'rEHR'
Using CPRD settings
ERROR: loading failed for 'i386'
* removing 'C:/Users/efratm/Documents/R/win-library/3.1/rEHR'

Error: Command failed (1)

Select_by_year throws error with empty categories

patpract_q <- wrap_sql_query("SELECT * FROM Patient JOIN Practice
ON Patient.practid = Practice.pracid")

temp_table(db, tab_name = "patpract", select_query = patpract_q)

denom_q <- "crd < STARTDATE & (is.na(tod) | tod > ENDDATE) & uts < STARTDATE & lcd > ENDDATE"

denoms <- select_by_year(db = db,
    tables = "patpract",
    columns = c("patid", "practid", "gender", "yob", "crd", "tod", "lcd", "uts"),
    where = denom_q,
    year_range = c(1987:2002),
    year_fn = standard_years,
    as_list = FALSE,
    selector_fn = select_events,
    cores = 1)

Using open database connection
Error in `$<-.data.frame`(`*tmp*`, "year", value = 1987L) :
replacement has 1 row, data has 0
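
This is the message base R raises when a single value is assigned to a new column of a data frame with zero rows. Whether that is exactly what happens inside select_by_year for a year with no matching patients is an assumption; the sketch below only reproduces the base R behaviour and shows one possible guard. The helper name add_year is hypothetical.

```r
# Reproduce the failure on a zero-row data frame
empty <- data.frame(patid = integer(0))
msg <- tryCatch({
  empty$year <- 1987L
  "no error"
}, error = function(e) conditionMessage(e))

# Hypothetical guard: recycle the year to the number of rows, so a zero-row
# result gets a zero-length column instead of raising an error
add_year <- function(df, year) {
  df$year <- rep(year, nrow(df))
  df
}
ok <- add_year(empty, 1987L)
filled <- add_year(data.frame(patid = 1:3), 1987L)
```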

Make select_by_year more generic

... so it can be used to have, say, 3-monthly windows.

This may just be an argument naming issue, and it may be better to change the data builder functions to generate more complex ranges of dates.
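
As a sketch of what a more general date builder might return, the base R function below generates consecutive windows of any length that seq() understands. The name date_windows and its arguments are hypothetical and not part of the package.

```r
# Hypothetical helper: consecutive windows of length `by` covering [from, to)
date_windows <- function(from, to, by = "3 months") {
  from <- as.Date(from)
  to <- as.Date(to)
  starts <- seq(from, to, by = by)
  starts <- starts[starts < to]
  # each window ends the day before the next one starts
  ends <- c(starts[-1], to) - 1
  data.frame(startdate = starts, enddate = ends)
}

windows <- date_windows("2006-01-01", "2007-01-01")
```

Each row could then stand in for the STARTDATE and ENDDATE values that are currently filled in per year.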

Linking to Clinicalcodes.org

How cool would it be if you were looking for, say, 'diabetes' and the software could look up the code lists available on the website and return, say, their ids along with a short description? The user could then input the code list id to define the cohort on the fly using the info on the website.

Error with select_events()

Error in !strict : invalid argument type 
5 sql_env(x, variant, con, window = window) 
4 FUN(X[[1L]], ...) 
3 lapply(expr, function(x) {
    if (is.atomic(x)) 
        return(escape(x, con = con))
    env <- sql_env(x, variant, con, window = window)
    eval(x, envir = env)
}) 
2 translate_sql_q(parse(text = paste(lapply(e, function(x) {
    if (str_detect(x, "^\\.")) {
        eval(parse(text = str_match(x, "\\.(.+)")[2]))
    }
    else x
}), collapse = " "))) at select_events.R#32
1 select_events(db, tab = "Referral", columns = c("patid", "eventdate", 
    "medcode"), where = "medcode %in% .(a$medcode) & eventdate < '2000-01-01'") 

Power calculations for cohort studies

I know that this is usually done prior to the study, and I know Stata and (probably) R have other commands for power calculation. However, wouldn't it be great to have it all in one package?
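
For reference, base R already covers the simplest case through power.prop.test in the stats package. The sketch below asks for the sample size per group needed to compare two proportions; the proportions, power and significance level are illustrative values only.

```r
# Sample size per group to detect 10% vs 15% outcome risk
# (illustrative inputs, not taken from any study)
res <- power.prop.test(p1 = 0.10, p2 = 0.15, power = 0.80, sig.level = 0.05)
n_per_group <- ceiling(res$n)

# The same function can return the power for a fixed cohort size
res_power <- power.prop.test(n = 500, p1 = 0.10, p2 = 0.15, sig.level = 0.05)
achieved_power <- res_power$power
```

A package wrapper would mainly add convenience, for example taking the exposure groups directly from a cohort table.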

Replace sqldf with RSQLite

Seems to be more robust:

q <- first_events(tab = "Therapy", columns = c("patid", "eventdate", "prodcode"),
    where = "prodcode %in% .(aceI$prodcode)", sql_only = TRUE)
temp_table(db, tab_name = "aceI", select_query = q)

This doesn't always work:

ace_patients <- sqldf("SELECT * FROM aceI", connection=db)

Error in dbPreExists && !overwrite : invalid 'x' type in 'x && y'

But this does:

ace_patients <- dbGetQuery(db, "SELECT * FROM aceI")

Operation: Condition prevalence in cohort

Condition prevalence in cohort

  • chronic conditions (up to an end date)
  • conditions within a window (start and end dates)
  • conditions and drugs within a window
  • All the above with and without resolve codes
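
A minimal base R sketch of point prevalence at an end date, with and without resolve codes. The type column that separates diagnosis codes from resolve codes, and all the toy rows, are assumptions for illustration.

```r
# Toy data: `type` marks diagnosis vs resolve codes (hypothetical column)
events <- data.frame(
  patid = c(1, 1, 2, 3),
  eventdate = as.Date(c("2004-01-01", "2006-01-01", "2005-05-01", "2009-01-01")),
  type = c("diagnosis", "resolved", "diagnosis", "diagnosis"),
  stringsAsFactors = FALSE
)
cohort_ids <- 1:4
end_date <- as.Date("2008-12-31")

# Only events up to the end date count
past <- events[events$eventdate <= end_date, ]
past <- past[order(past$patid, past$eventdate), ]

# Without resolve codes: any diagnosis before the end date
ever <- unique(past$patid[past$type == "diagnosis"])
# With resolve codes: the latest event must still be a diagnosis
latest <- past[!duplicated(past$patid, fromLast = TRUE), ]
current <- latest$patid[latest$type == "diagnosis"]

prev_ignoring_resolve <- mean(cohort_ids %in% ever)
prev_with_resolve <- mean(cohort_ids %in% current)
```

The windowed variants would add a lower bound on eventdate to the first filter.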

Operation: Reasons for censoring

Calculate number/proportion of people who are censored because of:

  • death;
  • transfer out;
  • end of the study;
  • end of the up-to-standard period.
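
One way to sketch this in base R is to take, for each patient, the earliest of the candidate end dates and report which one it was. The columns deathdate and tod follow CPRD naming; uts_end (end of the up-to-standard period), the study end date and the toy rows are assumptions for illustration.

```r
# Toy data, invented for illustration
patients <- data.frame(
  patid = 1:4,
  deathdate = as.Date(c("2007-05-01", NA, NA, NA)),
  tod = as.Date(c(NA, "2006-03-01", NA, NA)),  # transfer-out date
  uts_end = as.Date(c("2012-01-01", "2012-01-01", "2008-06-30", "2012-01-01"))
)
study_end <- as.Date("2010-12-31")

candidates <- data.frame(
  death = patients$deathdate,
  transferred_out = patients$tod,
  end_of_study = study_end,
  end_of_uts = patients$uts_end
)

# The earliest non-missing date per patient decides the censoring reason;
# which.min ignores NA and takes the first column on ties
idx <- apply(sapply(candidates, as.numeric), 1, which.min)
patients$reason <- names(candidates)[idx]
proportions <- table(patients$reason) / nrow(patients)
```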

refactor select_events with ::expand_string

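library(stringr)

# Expand .(expr) tokens in a where-string by evaluating them in R,
# then parse the result into an R expression.
expand_string <- function(s){
    e <- strsplit(s, "[[:space:]]+")[[1]]
    parse(text = paste(lapply(e,
        function(x){
            # tokens starting with a literal dot are evaluated; the dot must be escaped
            if(str_detect(x, "^\\.")){
                eval(parse(text = str_match(x, "\\.(.+)")[2]))
            } else x
        }), collapse = " "))
}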

Package build warning

  • checking for missing documentation entries ... WARNING
    Undocumented code objects:
    'add_to_database' 'database' 'import_CPRD_data'
    All user-level objects in a package should have documentation entries.

Fix temp file memory management

Need a function to set where sqlite temp files write to:

  • tmp directory
  • memory
  • a given directory

To avoid getting "database or disk is full" in the first place, try this if you have lots of RAM:

pragma temp_store = 2;

That tells SQLite to put temp files in memory. (The "database or disk is full" message does not mean either that the database is full or that the disk is full! It means the temp directory is full.) I have 256G of RAM but only 2G of /tmp, so this works great for me. The more RAM you have, the bigger db files you can work with.

If you haven't got a lot of RAM, try this:

pragma temp_store = 1;
pragma temp_store_directory = '/directory/with/lots/of/space';

temp_store_directory is deprecated (which is silly, since temp_store is not deprecated and requires temp_store_directory), so be wary of using this in code.
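
A small sketch of how the requested function could choose between the three options. It only builds the PRAGMA statements as strings; the helper name temp_store_pragmas and its argument are hypothetical.

```r
# Hypothetical helper: build the PRAGMA statements for a temp file location.
# `where` is "memory", "default" (SQLite's compiled-in default, normally the
# tmp directory), or the path of a directory with enough space.
temp_store_pragmas <- function(where = "memory") {
  if (where == "memory") {
    "PRAGMA temp_store = 2;"
  } else if (where == "default") {
    "PRAGMA temp_store = 0;"
  } else {
    # temp_store_directory is deprecated in SQLite, so treat this as a fallback
    c("PRAGMA temp_store = 1;",
      sprintf("PRAGMA temp_store_directory = '%s';", where))
  }
}

in_memory <- temp_store_pragmas("memory")
on_disk <- temp_store_pragmas("/directory/with/lots/of/space")
```

Assuming db is an open DBI connection to the SQLite file, each statement could then be sent with DBI::dbExecute(db, statement).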

Allow SQL UNIONS in select_events

This will let you select say all events from the combination of clinical and referral tables from within SQL rather than having to drop into R to do this.

Example unions:

diag_q <- sprintf("SELECT patid, practid, eventdate, medcode FROM Clinical WHERE medcode IN (%s) UNION
SELECT patid, practid, eventdate, medcode FROM Referral WHERE medcode IN (%s)",
paste(adhd_diagnoses$medcode, collapse=","), paste(adhd_diagnoses$medcode, collapse=","))

temp_table(db, tab_name = "ADHD_diagnoses", select_query = diag_q)

Smoking status as time-varying covariate

  • Patients should be allowed to switch from one smoking class to another (e.g. current smoker to ex-smoker);
  • It should be possible to know the smoking status of a patient at any time during follow-up.
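
A minimal base R sketch of the lookup, assuming smoking records have already been reduced to one row per recorded status. The status column, the helper name smoking_status_at and the toy rows are assumptions for illustration; the rule used is "most recent record on or before the date".

```r
# Toy data, invented for illustration
smoking <- data.frame(
  patid = c(1, 1, 1),
  eventdate = as.Date(c("2001-01-01", "2004-06-01", "2009-01-01")),
  status = c("current smoker", "ex-smoker", "current smoker"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: most recent recorded status on or before `date`
smoking_status_at <- function(records, id, date) {
  r <- records[records$patid == id, ]
  r <- r[order(r$eventdate), ]
  # number of records dated on or before `date`; 0 means none yet
  i <- findInterval(as.numeric(as.Date(date)), as.numeric(r$eventdate))
  if (i == 0) NA_character_ else r$status[i]
}

status_2005 <- smoking_status_at(smoking, 1, "2005-01-01")
status_2000 <- smoking_status_at(smoking, 1, "2000-01-01")
```

For survival models the same records could instead be expanded into start/stop intervals, one per status period.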

Generalise table/field names

Build lookup environments for different EHRs, and generalise the functions so that they read the table and field names for the specific EHR from the lookup environment.
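
A sketch of the idea in base R. The CPRD names are taken from the queries in this tracker; the other_ehr entries and the helper names set_ehr and ehr are invented for illustration and are not the package's API.

```r
# Hypothetical lookup environments, one per EHR
ehr_lookups <- list(
  cprd = list2env(list(patient_table = "Patient",
                       patient_id = "patid",
                       event_date = "eventdate")),
  other_ehr = list2env(list(patient_table = "patients",
                            patient_id = "person_id",
                            event_date = "event_dt"))
)

# Holds the currently selected lookup environment
.active <- new.env()
set_ehr <- function(name) assign("lookup", ehr_lookups[[name]], envir = .active)
ehr <- function(key) get(key, envir = get("lookup", envir = .active))

# Functions ask the lookup for names instead of hard-coding them
set_ehr("cprd")
query <- sprintf("SELECT %s FROM %s", ehr("patient_id"), ehr("patient_table"))
```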

Average of GP visits per year

Some people want to know whether the patients included in the cohort are active or inactive. It is sometimes argued that patients included as controls (identified from primary care databases) visit their GPs less frequently than patients with a certain condition, e.g. psoriasis or diabetes.
The average number of GP visits per year could simply be calculated as:

  • the average number of clinical, referral, immunization, test or therapy codes recorded in one year
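
A base R sketch of that calculation on events that have already been stacked from the source tables. Counting distinct dates rather than raw codes, so that several codes on one day count as one visit, is a design choice made here and not something the issue specifies; the follow-up lengths and toy rows are invented.

```r
# Toy data: one row per recorded event from any of the source tables
events <- data.frame(
  patid = c(1, 1, 1, 2),
  eventdate = as.Date(c("2010-01-10", "2010-01-10", "2010-06-01", "2011-02-01"))
)
# Hypothetical follow-up length in years, named by patid
followup_years <- c("1" = 2, "2" = 2)

# Distinct event dates per patient
visits <- sapply(split(events$eventdate, events$patid),
                 function(d) length(unique(d)))
visits_per_year <- visits[names(followup_years)] / followup_years
```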

select_by_year hangs

Don't know why this hangs...

ulcer_codes <- read.csv("data/ulcer_codes.csv")

ulcers <- select_by_year(dbname="~/shared_data/Private//rCPRD//CPRD.sqlite", tables= "Clinical", cores = 12,
year_range = 2006:2011, selector_fn = select_events,
columns = c("patid", "eventdate"),
where = "eventdate >= STARTDATE & eventdate <= ENDDATE & medcode %in% .(ulcer_codes$medcode)")

output from top:

2428 xxx 20 0 4665m 4.0g 12m S 0.0 6.3 2:17.74 rsession
2706 xxx 20 0 4665m 4.0g 3204 S 0.0 6.3 0:00.07 rsession
2707 xxx 20 0 4665m 4.0g 3204 S 0.0 6.3 0:00.09 rsession
2708 xxx 20 0 4665m 4.0g 3204 S 0.0 6.3 0:00.12 rsession
2709 xxx 20 0 4665m 4.0g 3204 S 0.0 6.3 0:00.13 rsession
2710 xxx 20 0 4665m 4.0g 3204 S 0.0 6.3 0:00.12 rsession

Forked processes have zero processor use.
