Introduction

The OHDSI on Azure GitHub repository is designed to ease deployment of the tools provided by the Observational Health Data Sciences and Informatics (OHDSI, pronounced "Odyssey") community onto Azure. We are guided by our hypothesis and core objectives.

Hypothesis - “OHDSI on Azure will empower IT department and operations teams to support researchers, thus increasing researchers' motivation to act on new ideas”

Objectives

  1. Decreased deployment challenges
  2. Increased access to funding
  3. Simplified adoption strategy

OHDSI on Azure is a set of scripts and templates that automate deployment of the solution in the Microsoft Azure cloud using Bicep and PaaS services. It is designed to facilitate standardized, scalable deployments within customer-managed Azure subscriptions, provide best practices for running OHDSI on Azure, and ease the burden of management and cost monitoring for research projects.

OHDSI on Azure has taken a container-based approach to operating OHDSI tools. Therefore, OHDSI on Azure does its best to not host code developed by the OHDSI community. Our deployment templates pull containers from Docker Hub.

We invite you and your organization to participate in the continued feature expansion of OHDSI on Azure.

This repository assumes the end user is familiar with the OHDSI community, OMOP, Azure, and Bicep.

Some of the OHDSI projects included:

  • Common Data Model (CDM), including Vocabulary
  • Atlas - OSS tool used to conduct analyses on standardized observational data converted to the OMOP Common Data Model V5
  • WebAPI - contains all OHDSI RESTful services that can be called from OHDSI applications
  • Achilles - provides descriptive statistics on an OMOP CDM database
  • ETL-Synthea - Conversion from Synthea CSV to OMOP CDM

Overview

  • You can host your CDM in Azure PostgreSQL. You can load your CDM and vocabularies into an Azure Storage container as csv.gz files and pass them as a parameter in your custom deployment.
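
Preparing the files could look like the sketch below. It is a minimal illustration, not the project's documented procedure: the function names are hypothetical, the storage account and container names are placeholders you supply, and the upload step assumes the Azure CLI is installed and signed in.

```shell
#!/bin/sh
# Hypothetical helper: compress every CSV in a directory to <name>.csv.gz,
# keeping the original files in place.
compress_csvs() {
    dir="$1"
    for f in "$dir"/*.csv; do
        [ -e "$f" ] || continue    # skip when the glob matches nothing
        gzip -c "$f" > "$f.gz"
    done
}

# Hypothetical helper: upload the compressed files to a blob container.
# Requires the Azure CLI and a prior `az login`.
# Arguments: <storage-account> <container> <local-directory>
upload_csv_gz() {
    az storage blob upload-batch \
        --account-name "$1" \
        --destination "$2" \
        --source "$3" \
        --pattern '*.csv.gz'
}
```

For example, `compress_csvs ./vocabulary` followed by `upload_csv_gz <storage-account> <container> ./vocabulary`.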

CDM Version

This setup is based on the CDM v5.4.0 and supports both PostgreSQL and Azure Synapse.

Getting Started

To get started, click the Deploy to Azure button. For more detailed instructions, refer to the Deployment Guide.

Deploy to Azure

Create a synthetic OMOP CDM

This solution comes with a prebuilt synthetic CDM of 1,000 patients. If you wish to create your own (perhaps larger) synthetic dataset, you can follow a similar process described here.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

ohdsionazure's People

Contributors

anatbal, corygstevenson, daemel, guybartal, microsoftopensource, plooploops, tamirkamara, yradsmikham, yuvalyaron

ohdsionazure's Issues

Update Documentation Based on Testing

  • Update the terraform apply instructions to include example variables for omop_password
  • Update the Azure CLI instructions to include installing the Azure DevOps extension
  • Update the vocabulary notes to include the public demo Azure Storage Account
  • Update the OMOP README prerequisites

Investigate using forked GitHub repos versus importing the repo into Azure DevOps

  • Come up with pros and cons for each approach
  • Gather feedback from team and test users

Should we allow the flexibility to have both options? Or stick to one approach for simplicity?

If using forked repos in GitHub:

  • Removes the need to import repos and keep them in sync in two different places (Azure DevOps and GitHub)
  • User will still need to manage forked repos and ensure they are fetching from the official microsoft/OHDSIonAzure repo (e.g. with git fetch)
  • Update the repository block in bootstrap/azure_devops_build_definitions.tf:
    repository {
      repo_type             = "GitHub"
      repo_id               = "yradsmikham/OHDSIonAzure"
      branch_name           = "main"
      yml_path              = "pipelines/environments/TF-OMOP.yaml"
      service_connection_id = "<service-connection-id>"
    }
  • Refactor pipeline YAML files to refer to the forked repo (?)

If importing repos into Azure DevOps:

  • Stick to current design
  • User will still need to manage forked repos and ensure they are fetching from the official microsoft/OHDSIonAzure repo (e.g. with git fetch)

Documentation

Include documentation around the following:

  • Best practices for scaling OHDSI CDM
  • "Enterprise-friendly" private networking architecture
  • Best practices for secrets management (e.g. the Azure SQL admin password, managed using Terraform and Key Vault; rotating passwords; and more)
  • Environment promotion/management

Investigate building a demo environment with GitHub Actions

  • Investigate storing the Terraform backend state for bootstrap for use with the demo environment. This will allow the bootstrap environment to be torn down and stood up again.
    • One place could be a GitHub Actions workflow artifact
    • Or you could use a designated Azure Storage Account
  • Investigate calling ADO pipelines from the workflow
  • Investigate using a designated vocabulary for the demo environment
    • Potentially store it in the demo Azure Storage Account

Investigate Building CI Environment using GitHub Actions

  • Build a CI environment based on PRs to main using a GitHub Actions workflow
    • Potentially call ADO pipelines
    • Can you use a randomly generated prefix/environment settings?
    • Be sure to spin down once orchestration is finished
    • Can use the test suite vocabulary as a starting point
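
One way to approach the randomly generated prefix is sketched below. The function name `random_prefix`, the leading `ci`, and the six-character suffix are illustrative assumptions, not settings taken from this repository.

```shell
#!/bin/sh
# Hypothetical helper: print a short environment prefix such as "ci3fa91c".
# Three random bytes are rendered as six lowercase hex characters, and the
# fixed "ci" lead keeps the prefix starting with a letter.
random_prefix() {
    suffix=$(od -An -N3 -tx1 /dev/urandom | tr -d ' \n')
    printf 'ci%s\n' "$suffix"
}
```

The same prefix could be passed to the environment settings for a PR run and reused at the end of the run to find the resources to spin down.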

Work on E2E demo videos

  • Set up E2E demo videos for setting up an OHDSI on Azure environment:
    • Assume this is importing from github into Azure DevOps repo
    • Run application pipelines including vocabulary setup
    • Extra credit: Captions for steps, no voice needed

Investigate Cohort Definition / Generation

  • Create a cohort definition
    • Point to a concept set (you can create one on the fly)
    • Set your initial event criteria
    • Save the cohort definition
    • Attempt to generate while pointing to your data source

This works on the demo site, but we still need to investigate how it works.

Investigate Atlas Dashboard Errors

Some of the Atlas dashboards are not populated after the ETL-Synthea and Achilles runs. Determine whether the remaining dashboards can be populated by supplying the missing data.

Example: Check the condition occurrence dashboard, and you will see an error in the logs:

```
        at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:262)
        at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1632)
        at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.doExecutePreparedStatement(SQLServerPreparedStatement.java:602)
        at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement$PrepStmtExecCmd.doExecute(SQLServerPreparedStatement.java:524)
        at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7418)
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:3272)
        at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:247)
        at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:222)
        at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.executeQuery(SQLServerPreparedStatement.java:446)
        at org.springframework.jdbc.core.JdbcTemplate$1.doInPreparedStatement(JdbcTemplate.java:696)
        at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:638)
```

Errors with running `terraform destroy` for bootstrap

  1. Add documentation about running terraform state rm on the project if it was initially imported rather than created through Terraform. Or does the current documentation suffice? Double-check.

  2. For certain users (e.g. macOS users), permission issues may occur on Terraform module directories.

Error: local-exec provisioner error
│
│   with module.azure_devops_environment_vocabulary_release_pipeline_assignment.null_resource.azure_devops_environment_pipeline_assignment_remove,
│   on ../modules/azure_devops_environment_pipeline_assignment/azure_devops_environment_pipeline_assignment.tf line 28, in resource "null_resource" "azure_devops_environment_pipeline_assignment_remove":
│   28:   provisioner "local-exec" {
│
│ Error running command
│ '../modules/azure_devops_environment_pipeline_assignment/azure_devops_environment_pipeline_assignment.sh >
│ azure_devops_environment_pipeline_assignment.txt': exit status 126. Output: /bin/sh:
│ ../modules/azure_devops_environment_pipeline_assignment/azure_devops_environment_pipeline_assignment.sh:
│ Permission denied

Workaround: run chmod -R +x path/to/module

  3. Issue with the "Extensions" property. We had to manually uninstall 'Microsoft.Azure.DevOps.Pipelines.Agent' in the Azure portal, then run terraform destroy again.
Error: deleting Extension "yvonne-test-build-agent-dependencies" (Virtual Machine Scale Set "yvonne-test-ado-build-windows-vmss-agent" / Resource Group "yvonne-test-ado-bootstrap-omop-rg"): compute.VirtualMachineScaleSetExtensionsClient#Delete: Failure sending request: StatusCode=400 -- Original Error: Code="BadRequest" Message="On resource 'yvonne-test-ado-build-windows-vmss-agent', extension 'Microsoft.Azure.DevOps.Pipelines.Agent' specifies 'yvonne-test-build-agent-dependencies' in its provisionAfterExtensions property, but the extension 'yvonne-test-build-agent-dependencies' will no longer exist. First, remove the extension 'Microsoft.Azure.DevOps.Pipelines.Agent' or remove 'yvonne-test-build-agent-dependencies' from the provisionAfterExtensions property of 'Microsoft.Azure.DevOps.Pipelines.Agent
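
The workarounds for the Permission denied error and the provisionAfterExtensions error can be sketched in shell. This is a minimal sketch, not the project's documented procedure: both function names are hypothetical, the first narrows the chmod workaround to `*.sh` files only, and the second assumes that removing the extension with the Azure CLI has the same effect as the manual uninstall in the portal, which has not been verified here.

```shell
#!/bin/sh
# Hypothetical helper: mark only the shell scripts under a module directory as
# executable, a narrower variant of `chmod -R +x path/to/module`.
make_scripts_executable() {
    find "$1" -type f -name '*.sh' -exec chmod +x {} +
}

# Hypothetical helper: remove the agent extension from the scale set before
# re-running `terraform destroy`. Requires the Azure CLI and a prior `az login`.
# Assumption: equivalent to uninstalling the extension in the Azure portal.
# Arguments: <resource-group> <vmss-name>
remove_agent_extension() {
    az vmss extension delete \
        --resource-group "$1" \
        --vmss-name "$2" \
        --name 'Microsoft.Azure.DevOps.Pipelines.Agent'
}
```

For example, `make_scripts_executable ../modules`, then `remove_agent_extension <resource-group> <vmss-name>` before running terraform destroy again.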

Investigate Tag Approach for Release

  • Investigate tag approach for release strategy
    • Use branches that are tagged to a specific release
    • Consider how to rollback to a prior tag
    • How to publish a release
    • What artifact should be released
    • Release notes
    • Capture dacpacs and other artifacts
  • Workaround: document approach to get 'latest' from repo:
git remote add github https://github.com/microsoft/OHDSIonAzure
git fetch github
git merge github/main

Investigate and create design documentation for automating infrastructure deployment (zeus)

Development repo for: https://github.com/yradsmikham/zeus

Leverage the Zeus CLI to automate infrastructure deployment as much as possible. This design doc will highlight the implementation and the reasoning for this approach. For example,

zeus infra create dev

will create directories for TF bootstrap and TF OMOP env.

This implies the user will need to provide .tfvars -> run terraform apply for bootstrap -> kick off the TF OMOP pipeline -> upload the vocabulary -> kick off the rest of the pipelines. Can we automate this more?
