gregoryperkins / brain-mets-seer

This project forked from mustafaascha/brain-mets-seer

Reproducibility materials for a study of brain metastases incidence and Medicare claims classification accuracy

License: MIT License

Makefile 1.41% R 98.37% Shell 0.22%

Reproducibility repository for a SEER-Medicare study of brain metastases

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

This is a repository enabling replication of results from a manuscript on brain metastases. The manuscript examines frequencies (incidence proportions) of brain metastasis and the classification accuracy of Medicare claims for identifying brain metastasis at primary cancer staging workup.

The study is available at the DOI given below, and the citation is as follows:

Mustafa S. Ascha, Quinn T. Ostrom, James Wright, Priya Kumthekar, Jeremy S. Bordeaux, Andrew E. Sloan, Fredrick R. Schumacher, Carol Kruchko and Jill S. Barnholtz-Sloan. Lifetime Occurrence of Brain Metastases Arising from Lung, Breast, and Skin Cancers in the Elderly: A SEER-Medicare Study. Cancer Epidemiol Biomarkers Prev, May 1 2019; 28(5): 917-925. DOI: 10.1158/1055-9965.EPI-18-1116

Instructions

Replication dry-run (no data required)

If you use a Mac, you can obtain make by installing the command-line tools that come as part of Xcode.

If you use a Linux distribution, you almost certainly already have make installed.

Once you have installed make, you can take these steps to see what would run without actually executing any scripts.

  1. Download and unzip the project folder
  2. Open your terminal and change your working directory to the project folder
    • To do this, run the command cd {project/folder/path}, where {project/folder/path} is your project folder path. Your project folder path will probably be something like /Users/your-user-name/Downloads/brain-mets-seer-master/
  3. Run make --just-print.
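The dry-run steps above can be sketched as a small shell function. This is a minimal sketch and not part of the repository: `dry_run` is a hypothetical helper name, and the path in the usage note is only the example path mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print the commands make would run in a project
# folder without executing any of them.
dry_run() {
    project_dir=$1
    # Fail early with a readable message if make is not on the PATH.
    if ! command -v make >/dev/null 2>&1; then
        echo "make is not installed" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    (cd "$project_dir" && make --just-print)
}
```

For example, `dry_run /Users/your-user-name/Downloads/brain-mets-seer-master/` would list the planned commands and leave the folder untouched.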

Manuscript results replication (requires data)

Replication will require you to:

  1. Download and install dependencies
    • Depends on:
      • R 3.4.2, a variety of R packages
      • GNU make 4.2.1
      • pandoc 1.19.2.1.
  2. Download the project folder
    • Download the project from the repository page, or
    • Use git clone https://github.com/mustafaascha/brain-mets-seer to clone this repository
  3. Place data in the seerm folder. Note that data is not provided.
  4. Run make
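Before starting a run that takes hours, it may help to confirm that the listed tools are on the PATH. The following is a minimal sketch: `check_tools` is a hypothetical helper, not something the repository provides. It only checks that each command exists, not that its version matches the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper: report which command-line tools are available.
# Prints one line per tool and returns non-zero if any tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (commented out so that sourcing this file has no side effects):
# check_tools R make pandoc && make
```

Installed versions can then be compared by hand against the list above using `R --version`, `make --version`, and `pandoc --version`.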

Note that this project takes several hours to run on a single-processor machine with 64GB of memory. You can run make with the -jN option (where N is some integer) to process claims files in parallel, but that is not recommended: make will begin munging before all of the claims have been scanned and extracted, because the Makefile prerequisites do not yet enforce that ordering.
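Given the run time, one option is to run the build serially and capture its output in a log file. This is a minimal sketch under stated assumptions: `run_serial` and the log file name are hypothetical, and `-j1` simply requests one job at a time explicitly.

```shell
#!/bin/sh
# Hypothetical helper: run the build one job at a time and keep a log.
#   $1 = project folder containing the Makefile
#   $2 = path of the log file to write (stdout and stderr)
run_serial() {
    project_dir=$1
    log_file=$2
    make -C "$project_dir" -j1 >"$log_file" 2>&1
}

# Usage (commented out so that sourcing this file starts nothing):
# run_serial . make.log
```

The function returns the exit status of make, so a failed run can be detected and the log inspected afterwards.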

Project folder contents

  • augur - This R package reads SEER-Medicare data and extracts relevant rows from the claims data.

  • extraction - This folder contains R scripts to extract relevant claims data.

  • manuscript - This R package supports analysis and manuscript preparation.

  • munge - These scripts convert the data to an analyzable format. The folder also contains a histology code lookup table.

  • analysis - These scripts are the last step before results are presentable and can be easily used in RMarkdown.

  • seerm - This is the folder meant to hold data as provided by IMS. It is provided to demonstrate where data should be placed for study replication.

Input: Data

Data is not provided, though data files may be placed in the seerm folder to be used as input for replication studies.

Output: Manuscript products

Manuscript tables and figures are written to the tables-and-figures.html file once make completes. A cache folder is created to hold intermediate products, namely extracted data and PEDSF table transformations.

Four claims-based brain metastasis identification algorithms are implemented in this work. Identification criteria are presented here as a two-by-two table of synchronous and lifetime chronology against diagnosis code only versus diagnosis and brain imaging code requirements.

Chronology  | Claims diagnosis code only | Claims diagnosis and imaging codes
Synchronous |                            | Most restrictive
Lifetime    | Least restrictive          |

For synchronous brain metastasis identification, codes must have occurred within 60 days of primary cancer diagnosis. For lifetime brain metastasis identification, codes may have occurred at any time throughout claims history. When both diagnosis and imaging codes were used, the two must have occurred within 60 days of each other.
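The timing rules above can be illustrated with a short sketch. It is not the repository's implementation (which is written in R), and it rests on assumptions: dates are represented as integer day offsets from the primary cancer diagnosis, "within 60 days" is read as an absolute difference of at most 60 days, and in the synchronous case the imaging code is also required to fall inside the 60-day window. The function and argument names are hypothetical.

```shell
#!/bin/sh
# Sketch of the four identification rules (assumptions noted in the text).
#   $1 = chronology: "synchronous" or "lifetime"
#   $2 = criteria:   "dx" (diagnosis code only) or "dx_imaging"
#   $3 = day offset of the brain metastasis diagnosis code
#   $4 = day offset of the brain imaging code (needed for dx_imaging)
# Offsets are days relative to the primary cancer diagnosis (day 0).
WINDOW_DAYS=60

abs_diff() {
    # Absolute difference between two integer day offsets.
    d=$(( $1 - $2 ))
    if [ "$d" -lt 0 ]; then d=$(( 0 - d )); fi
    echo "$d"
}

identifies_brain_mets() {
    chronology=$1
    criteria=$2
    dx_day=$3
    img_day=${4:-}
    if [ "$chronology" = "synchronous" ]; then
        # Diagnosis code must fall within the window around day 0.
        [ "$(abs_diff "$dx_day" 0)" -le "$WINDOW_DAYS" ] || return 1
    fi
    if [ "$criteria" = "dx_imaging" ]; then
        # An imaging code is required for this criterion.
        [ -n "$img_day" ] || return 1
        # Diagnosis and imaging codes must be close to each other.
        [ "$(abs_diff "$dx_day" "$img_day")" -le "$WINDOW_DAYS" ] || return 1
        if [ "$chronology" = "synchronous" ]; then
            # Assumption: the imaging code must also be near day 0.
            [ "$(abs_diff "$img_day" 0)" -le "$WINDOW_DAYS" ] || return 1
        fi
    fi
    return 0
}
```

For instance, `identifies_brain_mets lifetime dx 900` succeeds, while `identifies_brain_mets synchronous dx 90` fails because the code falls outside the 60-day window.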

Tables describing sensitivity, specificity, positive predictive value, and Cohen's kappa concordance are generated for each of the most frequent histology categories among lung, breast, and skin cancers. In addition, demographic and clinical characteristics tables are created for each of the listed cancer sites.
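As an illustration of how these measures relate to a two-by-two table of counts, here is a minimal sketch using awk. It is not the repository's code, `accuracy_measures` is a hypothetical name, and the counts in the usage note are made up for the example. It uses the standard definitions, with Cohen's kappa computed as (observed agreement minus chance agreement) divided by (1 minus chance agreement), and does not guard against zero denominators.

```shell
#!/bin/sh
# Hypothetical helper: accuracy measures from a two-by-two table of counts.
#   $1 = true positives   $2 = false positives
#   $3 = false negatives  $4 = true negatives
accuracy_measures() {
    awk -v tpos="$1" -v fpos="$2" -v fneg="$3" -v tneg="$4" 'BEGIN {
        n    = tpos + fpos + fneg + tneg
        sens = tpos / (tpos + fneg)      # sensitivity
        spec = tneg / (tneg + fpos)      # specificity
        ppv  = tpos / (tpos + fpos)      # positive predictive value
        po   = (tpos + tneg) / n         # observed agreement
        # agreement expected by chance, from the row and column totals
        pe   = ((tpos + fpos) * (tpos + fneg) + (fneg + tneg) * (fpos + tneg)) / (n * n)
        kappa = (po - pe) / (1 - pe)
        printf "sensitivity=%.4f specificity=%.4f ppv=%.4f kappa=%.4f\n", sens, spec, ppv, kappa
    }'
}
```

With illustrative counts, `accuracy_measures 40 10 20 930` prints `sensitivity=0.6667 specificity=0.9894 ppv=0.8000 kappa=0.7115`.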

Two figures, each composed of two bar graphs, are generated as part of the tables-and-figures.html output. These depict incidence proportions for each primary cancer site, stratified by race (for lung and breast cancers) and sex (for lung and skin cancers).

Appendix: Filetree

Before data is added, seerm is an empty directory, and the project structure looks like this:

(top level)
├── augur
│   └── ...
├── extraction
│   └── ...
├── manuscript
│   └── ...
├── LICENSE
├── Makefile
├── munge
│   └── ...
├── README.md
├── analysis
│   └── ...
├── seerm
└── tables-and-figures.Rmd

After adding the data, the file structure will appear as follows:

(top level)
├── augur
│   └── ...
├── extraction
│   └── ...
├── manuscript
│   └── ...
├── LICENSE
├── Makefile
├── munge
│   └── ...
├── README.md
├── analysis
│   └── ...
├── seerm
│   ├── CCflag07.txt.gz
│   ├── ...
│   ├── dme07.file01.txt.gz
│   ├── ...
│   ├── medpar07.txt.gz
│   ├── ...
│   ├── nch07.file001.txt.gz
│   ├── ...
│   ├── outsaf07.file001.txt.gz
│   ├── ...
│   ├── pdesaf07.file01.txt.gz
│   ├── ...
│   ├── pedsf.breast.cancer.file01.txt.gz
│   ├── pedsf.breast.cancer.file02.txt.gz
│   ├── pedsf.lung.cancer.file01.txt.gz
│   ├── pedsf.lung.cancer.file02.txt.gz
│   └── pedsf.skin.cancer.txt.gz
└── tables-and-figures.Rmd
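A quick check that the PEDSF files named in the tree above are in place might look like the following. This is a minimal sketch: `missing_pedsf` is a hypothetical helper, and the file list is copied from the tree above, so an actual data delivery may contain a different set of files.

```shell
#!/bin/sh
# Hypothetical helper: list PEDSF files named in the tree above that are
# absent from a data folder. Prints nothing when all of them are present.
missing_pedsf() {
    data_dir=$1
    for name in \
        pedsf.breast.cancer.file01.txt.gz \
        pedsf.breast.cancer.file02.txt.gz \
        pedsf.lung.cancer.file01.txt.gz \
        pedsf.lung.cancer.file02.txt.gz \
        pedsf.skin.cancer.txt.gz
    do
        [ -f "$data_dir/$name" ] || echo "$name"
    done
}

# Usage (commented out so that sourcing this file has no side effects):
# missing_pedsf seerm
```

An empty result means every listed PEDSF file was found; the claims files (CCflag, dme, medpar, nch, outsaf, pdesaf) are not checked by this sketch.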
