polar-fhir / fhircrackr
A package for conveniently downloading FHIR resources in XML format and converting them to R data frames
Check whether chaining is quicker than creating variables in one step when manipulating a data.table in the cracking functions.
New elements:
--> also create corresponding arguments in fhir_crack()
Remove the fhir_style object: sep, brackets and rm_empty_cols go directly into fhir_table_description.
Backwards compatibility: When a fhir_style is present in a fhir_table_description, it is used as before. If the elements are also present in the fhir_table_description, they are used instead, but a deprecation warning is thrown.
--> the fhir_crack argument overwrites fhir_table_description, which overwrites fhir_style
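The intended precedence order could be resolved with a small helper. This is only a sketch of the rule described above; `resolve_sep()` and its default separator are hypothetical, not fhircrackr API:

```r
# Sketch of the intended precedence (hypothetical helper):
# fhir_crack argument > fhir_table_description element > deprecated fhir_style
resolve_sep <- function(crack_arg = NULL, table_desc = NULL, style = NULL) {
  if (!is.null(crack_arg))  return(crack_arg)   # explicit argument wins
  if (!is.null(table_desc)) return(table_desc)  # then the table description
  if (!is.null(style)) {                        # deprecated fallback
    warning("fhir_style is deprecated, please set sep/brackets/rm_empty_cols directly")
    return(style)
  }
  ":::"  # assumed package default
}

resolve_sep("A", "B", "C")   # "A": the fhir_crack argument overrules everything
resolve_sep(NULL, "B", "C")  # "B": the table description overrules the style
```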
Currently build_tree() always sets the attributes to value and ignores attributes given in the column names.
bundles <- fhir_unserialize(bundles = example_bundles2)
patients <- fhir_table_description(
resource = "Patient",
brackets = c("[", "]"),
sep = " | ",
format = "wide",
keep_attr = TRUE
)
table <- fhir_crack(bundles = bundles, design = patients, verbose = 0)
cat(tree2xml(rm_ids_from_tree(build_tree(table[1,]))))
In file download_resources.R, line 623, there is a fixed regex for matching file names:
files <- grep('^[0-9]+\\.xml$', dir(directory), value = T)
This forces all file names to start with a digit, which is too strict.
I would recommend adding a parameter with this regex as its default value, so that users can override the strict pattern, e.g.
fhir_load <- function(directory, indices = NULL, pattern = '^[0-9]+\\.xml$') {
if(!dir.exists(directory)) {
warning("Cannot find the specified directory.")
return(fhir_bundle_list(list()))
}
files <- grep(pattern, dir(directory), value = T)
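The difference between the current fixed pattern and a relaxed one a user might pass can be shown with plain base R (the file names here are made up for illustration):

```r
# example file names as dir(directory) might return them
files <- c("1.xml", "bundle_1.xml", "notes.txt")

# current fixed pattern: only purely numeric names match
grep('^[0-9]+\\.xml$', files, value = TRUE)
#> "1.xml"

# relaxed pattern a user could pass via the proposed argument
grep('.*\\.xml$', files, value = TRUE)
#> "1.xml" "bundle_1.xml"
```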
instead of converting a wide cracked table to a compact one
Many thanks to @palmjulia for very swiftly responding to our inquiry about adding cookies to fhir_search!
I have tinkered with this new implementation in order to query our FHIR APIs and found that it throws the following error every time:
> library(fhircrackr)
> request <- fhir_url(url = "https://kf-api-fhir-service.kidsfirstdrc.org", resource = "Patient")
> cookies <- c(Cookie = "MY COOKIE")
> patient_bundles <- fhir_search(request = request, max_bundles = 1, verbose = 2, cookies = cookies)
Starting download of 1 bundles of resource type Patient from FHIR base URL https://kf-api-fhir-service.kidsfirstdrc.org/.
bundle[1](1): https://kf-api-fhir-service.kidsfirstdrc.org/Patient
(2): https://kf-api-fhir-service.kidsfirstdrc.org/Patient
(3): https://kf-api-fhir-service.kidsfirstdrc.org/Patient
(4): https://kf-api-fhir-service.kidsfirstdrc.org/Patient
(5): https://kf-api-fhir-service.kidsfirstdrc.org/Patient
Download interrupted.
So I looked into your recent implementation and found that the passed-in cookies are treated as named query parameters (see here, here, and GET a url) while they need to be treated as headers (i.e. inside POST and GET). So currently, cookies are appended as named parameters to the URL, like https://kf-api-fhir-service.kidsfirstdrc.org/Patient?Cookie="MY COOKIE"
instead of being part of the headers, like:
{
"Accept": "application/fhir+xml",
"Cookie": "MY COOKIE"
}
Would you mind refactoring this (in both the GET and the POST call) like:
config = httr::add_headers(
Accept = "application/fhir+xml",
Authorization = auth$token,
Cookie = auth$cookies
)
Huge thanks in advance! CC'ing @rsvasangh
When testing v2 by cracking 2600 bundles with various numbers of cores, R took different amounts of RAM.
E.g. with ncores = 12, cracking went from 7.6 GB to 44 GB.
I don't know whether the number of cores is related to the amount of RAM taken,
but so far performance is great :)
Good job!
Christian
Instead of five waits of 10s, use waiting times of 0.1s, 1s, 5s, 20s and 60s between retries.
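The proposed schedule could look like this. A minimal sketch only; `retry_with_backoff` is a hypothetical helper, not fhircrackr API:

```r
# Sketch: retry a failing call with the proposed increasing delays
# instead of a fixed 10s wait between attempts
retry_with_backoff <- function(fun, delays = c(0.1, 1, 5, 20, 60)) {
  for (i in seq_along(delays)) {
    result <- try(fun(), silent = TRUE)
    if (!inherits(result, "try-error")) return(result)  # success: stop retrying
    if (i < length(delays)) Sys.sleep(delays[i])        # wait before next attempt
  }
  stop("All retries failed.")
}
```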
When an indexed table has NA in the column that is used for melting, the corresponding row gets lost. This is not the case with the current version on CRAN.
Example:
d <- data.table(id = c("[1]1", "[1]2"),
melt_me = c("[1.1]a || [2.1]b", NA))
fhir_melt(d, columns = "melt_me", all_columns = TRUE, brackets = c("[", "]"), sep = " || ")
produces
id melt_me resource_identifier
1: [1]1 [1]a 1
2: [1]1 [1]b 1
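Until this is fixed, one could melt only the rows that actually carry a value and reattach the NA rows afterwards. This is just a workaround sketch; the `fhir_melt()` call mirrors the example above, and the commented `rbind` step is hypothetical:

```r
# Workaround sketch: split off the NA rows before melting, then bind them back
d <- data.frame(id = c("[1]1", "[1]2"),
                melt_me = c("[1.1]a || [2.1]b", NA),
                stringsAsFactors = FALSE)

has_value <- !is.na(d$melt_me)   # TRUE for rows fhir_melt can process

# molten <- fhir_melt(d[has_value, ], columns = "melt_me", all_columns = TRUE,
#                     brackets = c("[", "]"), sep = " || ")
# result <- rbind(molten, cbind(d[!has_value, ], resource_identifier = NA))
```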
Check for code chunks that query a server and set eval=FALSE, then adapt the following chunks accordingly.
When the search query itself contains URLs (e.g. in CodeSystems), the regex that extracts the base URL for messages in fhir_search() doesn't work correctly.
Example:
b <- fhir_search("https://mii-agiop-3p.life.uni-leipzig.de/fhir/Observation?code=http://loinc.org|3142-7,http://loinc.org|3142-7", max_bundles = 1)
In some cases the payload of a server error doesn't come as XML but as HTML. In these cases the xml2::read_xml() in fhircrackr:::error_to_file() fails and causes fhir_search() to fail with an unhelpful error.
At the moment it looks like this only happens when the HTTP status code is 401, i.e. "Unauthorized".
The best solution might be not to try saving the error message as XML but as plain text.
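That fallback could be sketched like this. The `parse` argument is injectable purely for illustration; in fhircrackr:::error_to_file() it would be xml2::read_xml, and `save_error_payload` itself is a hypothetical name:

```r
# Sketch: store the error payload as XML only if it parses,
# otherwise (e.g. an HTML body on a 401) store it as plain text
save_error_payload <- function(payload, file, parse = xml2::read_xml) {
  parsed <- try(parse(payload), silent = TRUE)
  if (inherits(parsed, "try-error")) {
    writeLines(payload, con = file)        # not XML: keep the raw text
  } else {
    xml2::write_xml(parsed, file = file)   # valid XML: current behaviour
  }
  invisible(file)
}
```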
E.g. here:
bundles <- fhir_search("http://hapi.fhir.org/baseR4/Patient", max_bundles = 1)
fhir_crack(bundles, fhir_table_description("Patient", cols = "gender"))
E.g. fhir_url(url = "base", resource = "Observation", parameters = c("code" = "123:123=12343")) fails, although a "=" inside the value is needed for some SNOMED expressions. This is because of the way the constructor fhir_url() with the signature c(url = "character", resource = "character", parameters = "character") grepls for =.
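A fix could split each parameter string at the first "=" only, so that any further "=" stays inside the value. A base-R sketch (the helper name is made up):

```r
# Sketch: split "name=value" at the FIRST '=' only, so values such as
# "123:123=12343" (needed for some SNOMED expressions) stay intact
split_param <- function(x) {
  name  <- sub("=.*$", "", x)     # everything before the first '='
  value <- sub("^[^=]*=", "", x)  # everything after the first '='
  c(name = name, value = value)
}

split_param("code=123:123=12343")
#> name = "code", value = "123:123=12343"
```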
At the moment fhir_load()
loads all bundles from a given directory. The function should get an additional argument that allows the user to load only a subset of the bundles in the given directory.
wide = TRUE means cracking multiple entries into their own columns.
wide = FALSE should be the default.
Hi,
I am currently trying to run an R script against our Blaze FHIR server, but the script gets stuck when running the fhir_search function from the fhircrackr library.
I tested it with simple queries and it works when only one condition is applied (e.g. ICD code = E84.0), but with more complex queries it gets stuck (see screenshot below),
and if I cancel the run via Ctrl+C, the error below is shown:
I'd appreciate any help! Thank you
fhir_crack needs an argument "ncores": NULL = no parallel computing, else an integer.
Some examples querying a FHIR server will fail occasionally when the server is down.
Solutions: either wrap the examples in \dontrun{} or use try().
Must be resolved by 2022-11-26!
E.g.
Initializing search via POST.
Starting download of ALL! bundles of resource type http://hapi.fhir.org/baseR4/Patient/_search from FHIR base URL http://hapi.fhir.org/baseR4/Patient/_search.
There seems to be a bug when calling fhir_design() with fhir_table_description().
The error occurs with:
fhir_design(
fhir_table_description(resource = "Encounter",
cols = c(
class = "class/code",
patient = "subject/reference"
)),
fhir_table_description(resource = "Condition",
cols = c(
id = "id"))
)
But not with :
fhir_design(
fhir_table_description(resource = "Encounter",
cols = c(
class = "class/code"
)),
fhir_table_description(resource = "Condition",
cols = c(
id = "id"))
)
Use this IP for testing: http://10.255.255.1 (it always gives a timeout).
Check the argument url for being a real URL via regex!
In case of failure, print a warning!
I realised that the fhir_bundle_list() constructor is useful when you want to combine several bundle lists (e.g. from a loop of GET requests) into a single fhir_bundle_list before cracking. Currently it is not exported in the fhircrackr NAMESPACE, but I think we should make it available to the user.
The documentation should describe a fhir_table_description as a valid standalone object and not frame it predominantly as being part of a fhir_design.
To avoid saving FHIR XML text contents to temporary files and loading them back via fhir_load, I'm using the function fhir_load_text to load bundles directly from XML texts in a character vector:
#' Load bundles from xml texts.
#' @description Reads all bundles stored as xml texts in a character vector.
#'
#' @param xml.texts A character vector of length one or more containing xml text.
#' @param indices A numeric vector of integers indicating which bundles from the text vector should be loaded. Defaults to NULL, meaning all bundles from the vector are loaded.
#' @return A [fhir_bundle_list-class].
#' @export
fhir_load_text <-
function(xml.texts,
indices = NULL) {
checkArgString("xml.texts",
xml.texts,
pattern = "^[ \\t]*<",
vector = TRUE
)
if (is.null(indices)) {
indices <- seq_along(xml.texts)
}
if (any(indices > length(xml.texts))) {
stop("Indices are greater than number of texts available in the vector")
}
chosen.texts <- xml.texts[indices]
if (length(chosen.texts) < 1) {
warning("Cannot find any xml texts in the given character vector.", immediate. = TRUE)
return(fhircrackr::fhir_bundle_list(list()))
}
fhircrackr::fhir_bundle_list(bundles = lapply(
chosen.texts,
function(x) {
xml2::read_xml(x)
}
))
}
where checkArgString is just a helper function that checks whether the argument xml.texts is a character vector with each entry matching the pattern.
Message from Essen:
Our FHIR server can indeed answer POST searches.
However, the values in the next links have to be assembled into a separate POST search request of their own. Pagination does not work by simply defining a parameter "_shipPagination=eyJvZmZzZXQiOjEwMH0%3D" in the POST request. The complete path for the next link is "/app/FHIR/r4/Condition/_search?_count=100&_shipPagination=eyJvZmZzZXQiOjEwMH0%3D". This query has to be assembled into a POST request of its own.
The next page can be reached both via POST and via GET. The GET variant, however, is considerably simpler.
"link": [
{
"relation": "self",
"url":
"/app/FHIR/r4/Condition?_count=100&_shipPagination=eyJvZmZzZXQiOjB9"
},
{
"relation": "first",
"url":
"/app/FHIR/r4/Condition?_count=100&_shipPagination=eyJvZmZzZXQiOjB9"
},
{
"relation": "next",
"url":
"/app/FHIR/r4/Condition?_count=100&_shipPagination=eyJvZmZzZXQiOjEwMH0%3D"
}
],
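The assembly step Essen describes could be sketched in base R: split the relative next link into the endpoint path and the query, and send the query parameters as the body of a fresh POST. The httr call in the comment is an assumption about how fhircrackr might issue it, not existing package code:

```r
# Sketch: turn the relative "next" link into a separate POST _search request
next_url <- "/app/FHIR/r4/Condition/_search?_count=100&_shipPagination=eyJvZmZzZXQiOjEwMH0%3D"

path   <- sub("\\?.*$", "", next_url)              # endpoint path for the POST
query  <- sub("^[^?]*\\?", "", next_url)           # raw query string
params <- strsplit(query, "&", fixed = TRUE)[[1]]  # individual body parameters

path
#> "/app/FHIR/r4/Condition/_search"
params
#> "_count=100" "_shipPagination=eyJvZmZzZXQiOjEwMH0%3D"

# a POST request would then send the parameters as the body against base + path,
# e.g. httr::POST(paste0(base, path), body = paste(params, collapse = "&"),
#                 httr::content_type("application/x-www-form-urlencoded"))
```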
Hi, I need to access a FHIR server that supports cookie-based authentication. Does 'fhircrackr' support custom headers to define the key: value for the cookie? I would truly appreciate your timely response.
Thanks to your effort in responding to our previous inquiry, we developed a Shiny app using fhircrackr which is an essential part of our development (see https://github.com/kids-first/clovoc-ui-data-dashboard)!
While deploying our Shiny app on our RStudio Connect instance, we ended up having this error:
09/29 20:45:31.587 (GMT)
Error in value[[3L]](cond) : unused argument (add_headers = cookies)
I figured that, when installing directly from the CRAN repository, the available package is actually a previous version. In order to use the latest version on my local machine, I had to run remotes::install_github(repo = "https://github.com/POLAR-fhiR/fhircrackr"). Could you update the CRAN repository with the master branch so that we can install directly from CRAN?
Copy external changes from the WP template into the sampling functions.
When using fhircrackr as described in the Readme.md, the following error message is returned:
Fehler in curl::curl_fetch_memory(url, handle = handle) : URL using bad/illegal format or missing URL
We use our own in-house developed FHIR server and it seems that fhircrackr doesn't get along with our pagination method.
e.g. directory in fhir_search
Decide which functions should be exported and improve documentation
fhir_url("ENDPOINT", "Patient", parameter = "code=codesystem|code", url_enc = TRUE) ignores url_enc = TRUE, while fhir_url("ENDPOINT", "Patient", parameter = c("code" = "codesystem|code"), url_enc = TRUE) works.
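For reference, the encoding both call variants should produce can be shown with base R alone: the reserved "|" in "codesystem|code" must become %7C.

```r
# What URL encoding should do to the parameter value (base R, no fhircrackr needed):
utils::URLencode("codesystem|code", reserved = TRUE)
#> "codesystem%7Ccode"
```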
!is.null(password) || !is.null(username) is missing a !.
Passing token to add_headers() vs. config() throws errors: the former takes a character, the latter an httr token object.
When columns are given (even if they don't get custom names) and keep_attr = TRUE, the given columns appear twice in the table. Is this a bug or did we intend to restrict the use of keep_attr to all_columns scenarios?
Example:
b <- fhir_unserialize(patient_bundles)
d <- fhir_crack(b, fhir_table_description("Patient",cols = c("gender", "birthDate")), keep_attr = TRUE)
Write something about html stuff (fhir_rm_tag() etc.)
request <- "https://mii-agiop-3p.life.uni-leipzig.de/fhir/Encounter?date=ge2018-01-01&class=https://www.medizininformatik-initiative.de/fhir/core/modul-fall/CodeSystem/Versorgungsfallklasse%7Cvollstationaer"
# doesn't work
base <- stringr::str_match(request, ".*:\\/\\/.*?\\/")
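The pattern fails because the greedy ".*" can latch onto the second "://" inside the query. Anchoring the match at the start of the string avoids this; a base-R sketch of a corrected extraction:

```r
request <- "https://mii-agiop-3p.life.uni-leipzig.de/fhir/Encounter?date=ge2018-01-01&class=https://www.medizininformatik-initiative.de/fhir/core/modul-fall/CodeSystem/Versorgungsfallklasse%7Cvollstationaer"

# anchored sketch: match scheme and host at the START of the string only,
# so URLs inside the query can no longer confuse the extraction
base <- regmatches(request, regexpr("^https?://[^/]+/", request))
base
#> "https://mii-agiop-3p.life.uni-leipzig.de/"
```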