esteinig / cerebro

Metagenomic diagnostics stack for low abundance sample types and clinical reporting

License: GNU General Public License v3.0

JavaScript 0.10% CSS 0.01% HTML 0.03% TypeScript 7.13% Svelte 26.73% Nextflow 9.37% Groovy 1.11% Shell 0.08% Rust 50.50% Handlebars 1.62% Python 3.30%
aneuploidy brain central-nervous-system diagnostics illumina metagenomics production public-health

cerebro's Issues

Deployment adjustments for templates

Dev deployment mode should be able to access the templates from the repository (docker-compose.yml - cerebro_api)

volumes:
  {{#if dev }}
  # Path mount for application in development
  - {{{ dev }}}:/usr/src/cerebro
  - {{{ dev }}}/templates/email:/data/templates/email:ro
  - {{{ dev }}}/templates/report:/data/templates/report:ro
  {{else}}
  # Volume mount for application in production deployment
  - cerebro_api:/data
  - {{{ outdir }}}/templates/email:/data/templates/email:ro
  - {{{ outdir }}}/templates/report:/data/templates/report:ro
  {{/if}}

The Dockerfile.server that is currently deployed does not include PDF compiler testing. This step is necessary to load the dependent LaTeX packages at build time so that the first report generation takes little time. However, the Docker container user currently needs to be root for this step, to access the root-installed system dependencies for tectonic.

ENV PATH="${PATH}:/opt/cerebro/bin"
RUN cargo build --release --features pdf && cp target/release/cerebro /opt/cerebro/bin
WORKDIR /opt/cerebro

RUN cerebro report compile --base-config /usr/src/cerebro/templates/report/report.toml --output test.pdf --pdf && rm test.pdf

Perhaps we can implement sudo install of dependencies with container user and remove sudo access when built?

Production surveillance

Summary

Running cerebro as an accredited service requires us to monitor assay performance over time. This issue tracks the implementation of that monitoring.

Technical

Should operate with requests to the API.

Simplify user action logs

Summary

User action logs are interwoven with security logs (stored in both admin and team databases). We need to simplify the action logs so they can be used directly on the frontpage activity log.

Troubleshooting docs

Kirsty said that a troubleshooting section is important for accreditation, add:

  • Error in sample identifier in sample sheet - how to fix this in database and workflows

`Cocogitto` changelog bumps

Cocogitto conventional commits and semantic versioning script for auto-bumping versions across the project:

  • Sveltekit application version
  • Nextflow pipeline version
  • Changelog in mkdocs documentation templates
  • Rust crate versions which includes all schema version bumps

PhiX sequencing control

Explicit sequencing control handling and exposure in QualityControl summary modules for frontend

Production execution mode

Summary

Semi-automated production runtime for accreditation; overview for aims and progress to complete the production feature on feat/production.

Aims

Stack deployment

  • deployment separation for dev and production
  • multiple stack deployment configurations
  • default configs (http-local, http-local-secure, https-web, https-web-secure)
  • test multi-stack deployment with https-web and https-web-secure configs
  • interactive setting of passwords and modification of default configs
  • enable memory overcommit for redis services, refer to documentation

Production pipeline

  • wet-lab sample-sheet creation - todo: change to tab-delimited
  • directory structure for production pipeline runs
  • input watcher and validation - test on network drives
  • input sample sheet and file validation
  • pre-workflow sample checks and registration
  • workflow launcher
  • post-workflow sample checks and registration
  • post-workflow data compilation and upload
  • notifications on slack and/or email
  • notifications and resource provision in application
  • input/output backup

User improvements

  • report amendments #6
  • page navigation progress bar #21

Testing modules

  • system integration test after production setup
  • syndrome specific integration test for results validation
  • smoke test for fast runtime validation

Documentation

  • development practices [developer]
  • production workflow setup [bioinformatician]
  • data input operating procedure [wet-lab]
  • workflow error recovery procedure [wet-lab, bioinformatician]
  • workflow error patch procedure [bioinformatician]

Steps

Full-stack setup of the Cerebro production environment for continuous operations. For setting up parallel test or development environments, see details in the documentation.

Requirements

  • Linux system
  • Mamba installation
  • Cerebro client installation
  • Cerebro stack setup and operation

Stack setup and verification



Production setup and verification

The production directory and its sub-directories are set up on the system - you can read more about the types of production environments that are currently supported in the documentation.

Here we set up the RUNTIME directory where all workflows are executed, and the INPUT directory where wet-lab staff or laboratory data transfer can deposit the reads and sample sheet to trigger a workflow execution.

# Local paths for runtime and data input
export CEREBRO_BASE_PROD=/data/cerebro/prod
export CEREBRO_INPUT_PROD=/samba/project/cerebro/prod

# Setup the runtime directory where workflows are executed
cerebro production setup-base --directory $CEREBRO_BASE_PROD

# Setup the input directory with a specific team and database upload configuration
cerebro production setup-input --directory $CEREBRO_INPUT_PROD --configuration production --team-name VIDRL --database-name "META-GP Production"

Multiple runtime and input folders can be set up for testing, development or validation configurations. Workflow execution and outputs are configured with specific production variables that ensure

Workflow setup and testing

The workflow is set up for production and the production integration tests are run.

# Check workflow help menu as sanity check
nextflow run esteinig/cerebro -r 1.0.0-nata.1 --help

# Provision the accreditation database with Cipher
nextflow run esteinig/cerebro -r 1.0.0-nata.1 -profile mamba -entry cipher --revision 1.0.0-nata.1 --outdir cipher/

# Obtain the access token for the API
export CEREBRO_API_URL="http://api.cerebro.localhost"
export CEREBRO_API_TOKEN=$(cerebro api login -u $CEREBRO_USERNAME -p $CEREBRO_PASSWORD)

# Run workflow integration tests for setup and central nervous system infections
nextflow run esteinig/cerebro -r 1.0.0-nata.1 -profile mamba,ciqa-setup@v1,ciqa-cns@v1

Sample sheet for wet-lab

The current sample sheet is focused on dry-lab operation. We need a user-safe sample sheet template that registers the library identifiers, minimal sample meta-data, wet-lab comments, aneuploidy consent and links to the files in the same input directory.

Initial template: https://github.com/esteinig/cerebro/blob/feat/production/templates/production/SampleSheet.xlsx

Automated watcher and input checks

The sample sheet and fastq files (demultiplexed, de-umified) are watched and validated in the input folder. Depending on the input configuration file, the watcher runs the production stream and uploads to the specified team-database-collection when the run concludes. Different input configuration files (folders) can be watched by different watchers (production, test, validation...), with outputs deposited into the appropriate database section. Valid input triggers a run of the Nextflow pipeline and notifications to Slack.


When the pipeline starts, sample identifiers are checked against the team-database-collection to ensure they are unique - the run is registered with the database and samples await confirmation of completion. If a sample identifier already exists in the database, the run fails.
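The uniqueness check described above could look roughly like the following. This is a minimal sketch, not the Cerebro implementation: `check_unique_samples` and `DuplicateSamples` are hypothetical names, and the set of registered identifiers is assumed to have been fetched from the team-database-collection beforehand (for example via the API).

```rust
use std::collections::HashSet;

/// Error returned when incoming sample identifiers already exist in the
/// collection. Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
pub struct DuplicateSamples {
    pub identifiers: Vec<String>,
}

/// Checks the identifiers of a new run against the identifiers that are
/// already registered in the team-database-collection.
///
/// Returns `Ok(())` when all incoming identifiers are new, otherwise an error
/// listing every clashing identifier so the run can be failed with a useful
/// message.
pub fn check_unique_samples(
    incoming: &[&str],
    registered: &HashSet<String>,
) -> Result<(), DuplicateSamples> {
    // Collect every incoming identifier that is already in the database.
    let identifiers: Vec<String> = incoming
        .iter()
        .filter(|id| registered.contains(**id))
        .map(|id| (*id).to_string())
        .collect();

    if identifiers.is_empty() {
        Ok(())
    } else {
        Err(DuplicateSamples { identifiers })
    }
}
```

Reporting all clashing identifiers at once, rather than stopping at the first, lets a single notification tell the wet-lab which rows of the sample sheet need attention.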


Post-workflow sample checks

When the pipeline completes, sample identifiers are collected and validated against registered sample identifiers for this run. Each module (quality control, classification) is checked for completion in each sample. If a sample for some reason did not complete the module, it is marked in the database.
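As a sketch of the post-workflow check, assuming the completed modules per sample have already been collected from the pipeline outputs: `Module`, `RunCheck` and `check_run` are hypothetical names, and only the two modules named above are modelled.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Workflow modules checked for completion (the two named in the text).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Module {
    QualityControl,
    Classification,
}

/// Modules every registered sample is expected to complete.
const REQUIRED: [Module; 2] = [Module::QualityControl, Module::Classification];

/// Outcome of the post-workflow check for one run.
#[derive(Debug, Default, PartialEq)]
pub struct RunCheck {
    /// Registered samples mapped to the modules they did not complete.
    pub incomplete: BTreeMap<String, Vec<Module>>,
    /// Samples found in the outputs that were not registered for this run.
    pub unregistered: Vec<String>,
}

/// Validates collected outputs against the samples registered for the run.
pub fn check_run(
    registered: &[&str],
    completed: &BTreeMap<String, BTreeSet<Module>>,
) -> RunCheck {
    let mut check = RunCheck::default();

    // For each registered sample, list the required modules without output.
    for id in registered {
        let missing: Vec<Module> = REQUIRED
            .iter()
            .copied()
            .filter(|m| !completed.get(*id).map_or(false, |done| done.contains(m)))
            .collect();
        if !missing.is_empty() {
            check.incomplete.insert(id.to_string(), missing);
        }
    }

    // Flag output samples that were never registered for this run.
    for id in completed.keys() {
        if !registered.contains(&id.as_str()) {
            check.unregistered.push(id.clone());
        }
    }

    check
}
```

The `incomplete` map is the part that would be written back to the database to mark samples that did not finish a module.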

Post-workflow data compilation and upload

After completion, outputs are aggregated into the database models and uploaded into the specified collection via the API.

Progress

  • Slack notifications - construct and send markdown messages
  • Sample sheet production template for consultation with wet-lab
  • Basic event polling and sub-polling of input folders
  • Basic input checks and validation with Slack notifications

Deselection when changing sample view

Library and controls are deselected when changing to the Report view in cerebro/data/samples/[sample] route - however, when returning to other views sometimes they are not re-selected.

`fix/vidrl-report`: report template amendments

Report template updates based on QA feedback:

  • Collection date in footer
  • Increase disclaimer size
  • Place disclaimer in footnote on every page
  • Add a unique identifier to each report document generated
  • Add an issuing laboratory section with Address and Contact

Report template updates:

  • Placeholder spaces for values not filled in from the interface

Module modifications

  • add a negative sample option in the report interface
  • add issuing laboratory and unique report identifier fields to ClinicalReport and .toml templates

Refactor aggregate taxon model

State

Currently the cerebro.taxa section of the Cerebro data model is a HashMap<taxid, Taxon> where type taxid = String. This is a result of the aggregation function which uses sequential HashMaps to group taxa by their taxid.

Problem

A HashMap cannot be queried efficiently with MongoDB aggregation pipelines. Downstream applications, particularly the API endpoints, eventually use a Vec<Taxon>.

Refactoring to Vec is necessary, but at this stage may affect a number of dependent subsystems.
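One possible shape for the conversion step, as a minimal sketch: the `Taxon` struct here is a stand-in with assumed fields (the real model has more), and `taxa_to_vec` is a hypothetical helper. The aggregation can keep using a HashMap internally and flatten it only once, just before the model is stored.

```rust
use std::collections::HashMap;

/// Minimal stand-in for the real `Taxon` model; the fields are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Taxon {
    pub taxid: String,
    pub name: String,
}

/// Flattens the aggregated `HashMap<taxid, Taxon>` into a `Vec<Taxon>`.
///
/// HashMap iteration order is unspecified, so the result is sorted by taxid
/// (lexicographically, since taxid is a String) to keep output deterministic.
pub fn taxa_to_vec(taxa: HashMap<String, Taxon>) -> Vec<Taxon> {
    let mut flat: Vec<Taxon> = taxa.into_values().collect();
    flat.sort_by(|a, b| a.taxid.cmp(&b.taxid));
    flat
}
```

Because each `Taxon` in this sketch carries its own taxid, dropping the map keys loses no information. Stored as an array of documents, the taxa can then be addressed with array stages such as `$unwind` in an aggregation pipeline.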

Report generation interface response

Summary

A bug occurs when generating a PDF report: an opaque error is thrown in the UI, which prevents the user from navigating back to the sample table correctly.
