Giter Site home page Giter Site logo

ucum's Introduction

pyucum

Sam Tomioka

pyucum can be used to generate UCUM units from LBORRESU and LBSTRESU, verify UCUM units, and convert results in LBORRESU to results in LBSTRESU using UCUM API. It is updated to work with given LOINC value to identify proper molecular weight used in the result conversion on molar based units. This tool works well with CDISC SDTM.LB and ADaM.ADLB.

1. Background

The verification of scientific units and conversion from the reported units to standard units have been always challenging for Data Science due to several reasons:

  1. Need a lookup table that consists of all possible input and output units for measurements, name of the measurements (e.g. Glucose, Weight, ...), conversion factors, molar weights etc.
  2. The names of the measurement in the lookup table and incoming data must match
  3. The incoming units must be in the lookup table
  4. Maintenance of the lookup table must be synched with standard terminology update
  5. Require careful medical review in addition to laborsome Data Science review and more...

Despite the challenges, the lookup table approach is the norm for many companies for verification of the units and conversion. Consideration was given for more systematic approach that does not require to use the lab test names[1], but some units rely on molar weight and/or valence of ion of the specific lab tests, so this approach does not solve the problem. The regulatory agencies require the sponsor to use standardized units for reporting and analysis[2]. The PMDA requires SI units for all reporting and analysis[3,4]. The differences in requirement force us to maintain region specific conversion for some measurements which add additional complexity.

The approach Jozef Aerts discussed uses RestAPI available through Unified Code for Units of Measure (UCUM) Resources which is maintained by the US National Library of Medicine (NLM)[5]. The benefit is obvious that we can potentially eliminate the maintenance of the lab conversion lookup table. Here is what they say about themselves.

The Unified Code for Units of Measure (UCUM) is a code system intended to include all units of measures being contemporarily used in international science, engineering, and business. The purpose is to facilitate unambiguous electronic communication of quantities together with their units. The focus is on electronic communication, as opposed to communication between humans. A typical application of The Unified Code for Units of Measure are electronic data interchange (EDI) protocols, but there is nothing that prevents it from being used in other types of machine communication.

The UCUM is the ISO 11240 compliant standard and has been used in ICSR E2B submissions for regulators adopted ICH E2B(R3). FDA requires the UCUM codes for the eVAERS ICSR E2B (R3) submissions, dosage strength in both content of product labeling and Drug Establishment Registration and Drug Listing. UCUM codes have been adopted by HL7 FHIR.

2. Installation

To install,

pip install pyucum

Following packages are required

  • numpy>=1.15.0
  • pandas>=0.23.0
  • tqdm>=4.32.2
  • seaborn>=0.9.0
  • matplotlib>=3.1.1

3. Some useful utilities

3.1. Apply regular expression to LBORRESU and LBSTRESU

orresu2ucum(df_,patterns) This will take datafram containing LBORRESU and LBSTRESU, and regular expression you want to pass to make UCUM units. It returns the dataframe with converted units, and the list of converted units.

Although CDISC released a downloadable CDISC UNIT and UCUM mapping xlsx file, this tool will not use it since CDISC UNIT does not cover all reported units used by the clinical laboratory/bioanalytical/PK vendors. Regular expression along with UCUM unit validity service was used to convert and verify the units provided by the lab vendors before using conversion RestAPI.

Example:

from pyucum import *

patterns = [("%","%25"),
           ("\A[xX]?10[^E]", "10*"),
           ("IU", "%5BIU%5D"),
           ("\Anan", ""),
           ("\ANONE", ""),
           ("\A[rR][Aa][Tt][Ii][Oo]", ""),
           ("\ApH", ""),
           ("Eq[l]?","eq"),
           ("\ATI/L","TR/L"),#update T to TR
           ("\AGI/L","GA/L"),#update G to GA
           ("V/V","L/L"),
           ("[a-z]{0,4}/HPF","/%5BHPF%5D"),
           ("[a-z]{0,4}/LPF","/%5BLPF%5D"),
           ("fraction of 1","1"),
            ("sec","s"),
            ("1.73m2","%7B1.73_m2%7D"),
            ("\AG/L","GA/L")
           ]

dfconverted, ucumlist=orresu2ucum(df1,patterns)

3.2. Verify units using UCUM service

ucumVerify(ucumlist,url)

Example:

from pyucum import *

url='http://xml4pharmaserver.com:8080/UCUMService2/rest'
#url=https://ucum.nlm.nih.gov/ucum-service/v1/
ucumVerify(ucumlist, url)

Output:

['g/dL = true',
 'mg/dL = true',
 'ng/mL = true',
 '10*3/uL = true',
 '%25 = true',
 '10*6/uL = true',
 '10*3/mm3 = true',
 'meq/L = true',
 'mL/min = true',
 'ng/L = true',
 'ng/dL = true',
 'u%5BIU%5D/mL = true',

3.3. Convert results in LBORRESU to results in LBSTRESU

convert_unit()

Example:

from pyucum import *

findings,full,response=convert_unit(nodupdf, url, patterns,loinconly=0)

findings[(findings['fromucum'].notnull())]

Output:

To see other utilities, please go to examples from here

3. Verification of UCUM and XML4Pharma API services

  1. The initial evaluation (2019-02-20) was done on RestAPI available through Unified Code for Units of Measure (UCUM) Resources and the findings are summarized here.

  2. The second evaluation (2019-05-05) was completed on the test version of RestAPI provided by Jozef Aerts at xml4pharma. The findings are summarized here.

  3. The third evaluation (2019-05-25) was completed on the updated test version of RestAPI provided by Jozef Aerts at xml4pharma. Several improvements were implemented by Jozef Aerts since the second evaluation was completed.

    • It accounts for the molecular weights of the analyte into the conversion between molar and mass concentrations to facilitate the conversion of the lab results, verification of the standardized lab results and LOINC code provided by the vendors.
    • The return message contains the MW that was used for the conversion.

    • Previously, there was one kind of error message related to LOINC. For example,

      Error message "ERROR: No MW value for the LOINC code xxx-x is available or the LOINC code is invalid"

      The updated service returns LOINC related error message either

      1. Invalid LOINC code XXXX
      2. No MW found for LOINC Part Number LPxxxx for LOINC code yyyy
        This updates allow us to investigate issues without browsing LOINC.
    • The list of MW for the LOINC-component-part was extended.

The findings from the third evaluation are summarized here.

ucum's People

Contributors

stomioka avatar

Stargazers

 avatar  avatar  avatar

Watchers

 avatar

ucum's Issues

Please tag releases in Git

It would be very helpful to have the releases of the package tagged in Git. This would help the users to reach the code for each of the releases.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.