ssi-dk / bifrost
License: MIT License
Add the functionality of a button that you can click which would clear the existing job list
We want to be able to add a new field on samples called 'tags', which can act as a multipurpose filtering tool. Tags would be saved as a list of strings. Membership in a tag could be used both by pipelines (components) in their requirement checks and for filtering on the report side (i.e. only show samples belonging to the tag for this project name). Tags could also be generated by components; for example, the QC stamper could assign a tag to a sample stating its QC is good.
This item is expected to generate multiple work items.
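A minimal sketch of how tag-based filtering could look, assuming each sample is a dict mimicking a Mongo document with an optional "tags" list (field and function names are illustrative, not the actual bifrost schema):

```python
def filter_samples_by_tag(samples, tag):
    """Return samples whose 'tags' list contains the given tag.

    Samples without a 'tags' field are treated as having no tags.
    """
    return [s for s in samples if tag in s.get("tags", [])]
```

The same membership test could back both the report-side filter and a component's requirement check.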
User feedback should be made into a component for tracking purposes, even though it doesn't require anything to run on the server.
Added for all repos
Create github actions to automate testing including generation of docker files, python library creation, etc. This is expected to become multiple tasks.
The remove_run.py script mentioned in the documentation is unavailable in the public repository: https://ssi-dk.github.io/bifrost/#/user_guide?id=removing-a-run-update
The scripts/ directory is in the .gitignore file.
Data plots are currently pulling information on many more samples than necessary. To improve performance we can generate a density plot for each value from our existing data and load those in; then we only need to pull samples from the run itself. I believe we're currently pulling a set number of samples.
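One way to precompute such a density, sketched here with a normalized histogram as a cheap density estimate (bin count and storage format are assumptions, not the existing bifrost implementation):

```python
import numpy as np

def precompute_density(values, bins=50):
    """Precompute a normalized histogram for one QC metric.

    The edges/density pair is small enough to store alongside the
    metric, so the report can plot a background distribution without
    re-querying every sample in the DB.
    """
    counts, edges = np.histogram(values, bins=bins, density=True)
    return {"edges": edges.tolist(), "density": counts.tolist()}
```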
Make the setup folder (being renamed to mongoDB_setup) install the DB indexes.
Right now the species DB entry is also located there. Thinking that this should be installed into the DB via the component and not by the setup.
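A sketch of how the mongoDB_setup step could install indexes; the collection and field names here are illustrative guesses, not the real bifrost schema, and `collection` is anything exposing pymongo's `create_index`:

```python
# Hypothetical single-field index definitions for the samples collection.
SAMPLE_INDEXES = [
    ("name", 1),
    ("properties.species", 1),
]

def install_indexes(collection, indexes=SAMPLE_INDEXES):
    """Create one index per entry; returns the created index names."""
    return [collection.create_index([spec]) for spec in indexes]
```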
Be able to check the status of a run via the command line
Manage species via the GUI instead of direct DB adjustments. After that, the only things in the species DB should be an internal species name and a series of names that map to it (e.g. S. aureus -> Staphylococcus aureus).
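The lookup side of that mapping could be as simple as the following sketch (the alias table and field layout are assumptions for illustration):

```python
# Illustrative alias table: lookup names -> internal species term.
SPECIES_ALIASES = {
    "S. aureus": "Staphylococcus aureus",
    "staph aureus": "Staphylococcus aureus",
}

def resolve_species(name, aliases=SPECIES_ALIASES):
    """Return the internal species term for a lookup name,
    falling back to the name itself when no alias is registered."""
    return aliases.get(name, name)
```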
With the change to schema 2.0 we only need a component id and sample id to run a sample_component. It'd be nice to develop a page on the dashboard to allow submitting jobs to the server.
The following components have to be added to the new set up
reslab_stamper
kma_pointmutations
species specific FBI components
Right now I'm working on dockerizing and automated testing for min_read_check, and I'm trying to figure out how to prepopulate the system for analysis. In my mind I should be using the run_launcher component, but you can't easily run Docker inside Docker (perhaps a docker-compose solution is the right way?). Down the road I want bifrostlib updated so that the requests going back and forth during testing go through the API calls, or through the library that processes the API.
Fix the docker container for each component and the base so they point to the appropriate container; also create one for latest and one for dev.
Fixed for all. Really need to do this in their own branches.
Once components store the installation path information required to run, everything is limited to strictly IDs, which should also help the web interface for launching jobs.
Right now we have data handling in both the reporter and bifrostlib. Ideally bifrostlib holds all the classes, mongo_interface handles the DB, and the API works on top of bifrostlib. Everything should use one unified library, not different ones.
Have some way to order the run check results by default. Perhaps it makes sense to store the preferred sort order as a variable somewhere.
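A minimal sketch of sorting results by a stored preferred order; the result shape and "check" key are assumptions for illustration. Checks not in the stored order sort after known ones, keeping their original relative order:

```python
def order_run_results(results, preferred_order):
    """Sort run-check results by a stored preferred order.

    Unknown check names get a rank past the end of the preferred
    list; Python's stable sort preserves their original order.
    """
    rank = {name: i for i, name in enumerate(preferred_order)}
    return sorted(results, key=lambda r: rank.get(r["check"], len(rank)))
```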
Split tests into fast and slow (or unit/integration) so that the watchdog runs quick ones first and then bigger ones after for faster feedback loops
Documentation was out of date, so update it all, and create a PowerPoint while I'm at it to have a master set of slides for presentations. Potentially write up a paper for bioRxiv to push as well.
I suspect the bug cited in the install script is due to the conda channel order (see bifrost/envs/bifrost_for_install_full.yaml, lines 3 to 6 in ea4ce48).
I encountered a library issue before and it was because bioconda defers non-bio dependencies to the conda-forge channel. Bioconda depends on the following channel order:
channels:
  - bioconda
  - conda-forge
  - defaults
https://bioconda.github.io/user/install.html#set-up-channels
The view is a set size right now; it would be nice if it could scale up for larger screen real estate.
Bifrostlib is partway through updating. I was thinking that each main object needs a class and should be updated accordingly. Also, when schema validation goes in, ideally an entry can be checked against multiple schemas, which may mean two versions of the same function. Sample, Category, and Run are mostly done converting (to the current form, but they still need JSON validation), while things like Component and SampleComponent need to be redone.
Adjust docker images to utilize a scratch folder; this can be done by mounting the scratch folder to a matching location in the docker image.
Fix a bug in datahandling.py in the Run class: the check for no runs in the DB compares against None, but the query returns a list, so it should check for an empty list (i.e. not []).
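A minimal sketch of the corrected check, with hypothetical names standing in for the actual datahandling.py code (`db_runs` mimics the list a Mongo query yields):

```python
def find_run(db_runs, name):
    """Return the run dict with the given name, or None.

    The buggy version checked `if db_runs is None`, which never fires
    because the query returns a list, possibly empty. The correct
    check is for emptiness.
    """
    matches = [r for r in db_runs if r.get("name") == name]
    if not matches:  # correct: tests for [] rather than None
        return None
    return matches[0]
```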
Through the UI we want to be able to create a list of samples then group them into a collection that can be worked with. This will create a "run" object for them and can be loaded for the user through the GUI interface.
Adding loading bars/spinners to the data tables would make the user experience nicer, so users know data is loading instead of seeing no change and then all the data at once.
Want to add shields to monitor statuses and have them handled by GitHub Actions. Info on shields can be found here: https://github.com/badges/shields
This should apply specifically to data sharing and can be done via our data model and/or individual field encryption
Want to create a validation set which can be tested against for new components or lab changes. Ideally the samples are representative of what we do at SSI, and we can run them periodically on sequencing.
Right now species are required to have a true term which is stored in the database. A table can also be provided for lookup names to match to these terms. Ensure there's an interface for managing this, but keep in mind that components are bound to the true terms.
Duplicate of #45
Did a quick change for ssi_stamper, didn't impact others so haven't pushed to others yet
Looking to set up easier testing with a small data set, including localized development with a sharable DB. I figured the best bet for this is MongoDB Atlas, so I'm trying that out and getting it working. I also made a dataset available on ENA (PRJEB39131) to run this with, containing randomized S. aureus and E. coli.
Right now all repos should be uploading to Docker Hub automatically on a version number, but this isn't working as intended. I need access to variables in either setup.py or the Dockerfile in order to pass the value accordingly in GitHub Actions.
Fixing up submodules in repo
Add bifrost_test_data as a submodule
Making bifrostlib a submodule
new desired code should be:

elif category == "component":
    component_to_check = requirement.split(".")[1]
    field = requirement.split(".")[2:]
    expected_value = requirements[requirement]
    s_c_db = get_sample_component(
        sample_id=self.sample_id,
        component_name=component_to_check)
The pipeline will fail if run with bifrostlib installed from the PyPI index instead of the repository, because the datahandler.log method was renamed.
The contigs in the QC report are currently represented by numbers. I think there's a smarter way to show this with images: contig lengths sorted by size, with height showing coverage (on a log scale), and coloring to show the species for each contig. This could visually show contamination more clearly, as well as plasmids or PCR products.
The idea here is to query our local server to see how big the queue is (we might not even need to do this) and submit jobs that can fill out the queue. These jobs should be generated automatically by the system, for example via an API request (or query) for samples that have not run the latest components. That list is then submitted to the server when it's not busy, automatically updating runs.
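A sketch of the "samples that have not run the latest components" query as a pure function; the sample shape and version bookkeeping are assumptions standing in for whatever the API would return:

```python
def samples_needing_rerun(samples, latest_versions):
    """Return ids of samples missing the latest version of any component.

    samples: list of {"_id": ..., "components": {name: version}}
    latest_versions: {component_name: latest_version}
    """
    stale = []
    for s in samples:
        run = s.get("components", {})
        if any(run.get(name) != ver for name, ver in latest_versions.items()):
            stale.append(s["_id"])
    return stale
```

The scheduler could submit this list to the server whenever the queue has room.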
This will occur in all submodules
The sample_component DB variable 'path' points to the sample directory, i.e.
bifrost_dir/Sample1/
instead of
bifrost_dir/Sample1/ComponentName
Change it to include the component name.
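A minimal sketch of the corrected path construction (the function name is hypothetical; only the directory layout comes from the issue above):

```python
import os

def sample_component_path(bifrost_dir, sample_name, component_name):
    """Build the sample_component path including the component name,
    rather than stopping at the sample directory."""
    return os.path.join(bifrost_dir, sample_name, component_name)
```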
Adjust docker images so that output goes to a set folder which can then be mounted against
In the QC report's per-sample view, when someone is going through the approval process for a sample, include an optional textbox to comment on why they made their change.
Adjust components to have a unique name based on name, version, and db_date, which would replace any references to the _id in the class section of objects. Part of this is so that if the component is installed at two different institutes, the copies are treated as the same rather than as unique due to different _ids.
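One possible shape for such a key, sketched below; the separator and the assumption that db_date is an ISO date string are illustrative choices, not the decided format:

```python
def component_key(name, version, db_date):
    """Compose a deterministic component identifier from name,
    version, and db_date, so two installs of the same component
    at different institutes produce the same key."""
    return f"{name}__{version}__{db_date}"
```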