nf-core/yascp is a bioinformatics best-practice analysis pipeline for deconvolution, qc, clustering of a single cell datasets. This is a large scale single-cell pipeline developed initially for processing Cardinal project samples, however it is applicable to any other scRNA analysis. The pipeline has been inspired by deconvolution (https://github.com/wtsi-hgi/nf_scrna_deconvolution.git ), cellbender (https://github.com/wtsi-hgi/nf_cellbender ) and qc (https://github.com/wtsi-hgi/nf_qc_cluster/tree/main ) pipelines. Input requires a tsv seperated file with paths to the Cellranger 6.11 outputs (however we will shortly add a Cellranger module to make this pipeline more transferable) and if running in an genotype additional input is required to be provided in an input.nf file pointing to the vcf location. This pipeline is designed to be used for multiple large scale single cell experiments.
The pipeline is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. The Nextflow DSL2 implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from nf-core/modules in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!
- Cellbender
- CellSNP
- Vireo
- Souporcell
- Celltypist
- Azimuth
- BBKNN
- Harmony
- Scrublet
- Sccaf
- Lisi
- Isolation Forest
- Hard filters
- Genotype deconvolution and GT match against multiple panels.
- Citeseq DSB normalisations
- Cell genotype concordance Calculations
To understand how to prepeare your own data and how to interpret the results please refear to documents HERE
Easyest to do is using a conda enviroment.
-
Install
Nextflow
(>=21.04.0
)conda install nextflow=21.04.0
-
Install any of
Docker
,Singularity
,Podman
,Shifter
orCharliecloud
for full pipeline reproducibility (please only useConda
as a last resort; see docs) -
Download/clone the pipeline and test it on a minimal dataset with a single command:
!NOTE: you need to define your institution specific queues in the conf/base.conf or provide aditional config file with -c flag in folowing comand such as: -c /path/to/yascp/conf/extra_confs/sanger/base.conf
nextflow run /path/to/colned/yascp -profile test_full,<docker/singularity/podman/shifter/charliecloud/conda/institute>
!ALSO: by default this test dataset will run cellbender with a cpus - only performing 10 epochs. Cellbender is built for a gpu queue, so for the actual runs the deafault is to utilise_gpu = true You need to make sure that the gpu queue according to your institution is defined in confs: withLabel: gpu {} as for sanger
config file
- Please check nf-core/configs to see if a custom config file to run nf-core pipelines already exists for your Institute. If so, you can simply use
-profile <institute>
in your command. This will enable eitherdocker
orsingularity
and set the appropriate execution settings for your local compute environment. - If you are using
singularity
then the pipeline will auto-detect this and attempt to download the Singularity images directly as opposed to performing a conversion from Docker images. If you are persistently observing issues downloading Singularity images directly due to timeout or network issues then please use the--singularity_pull_docker_container
parameter to pull and convert the Docker image instead. Alternatively, it is highly recommended to use thenf-core download
command to pre-download all of the required containers before running the pipeline and to set theNXF_SINGULARITY_CACHEDIR
orsingularity.cacheDir
Nextflow options to be able to store and re-use the images from a central location for future pipeline runs.
- Please check nf-core/configs to see if a custom config file to run nf-core pipelines already exists for your Institute. If so, you can simply use
Yascp was originally written by Matiss Ozols as part of the Cardinal project with contributions from Leland Taylor, Guillaume Noell, Hannes Ponstingl, Vivek Iyer, Henry Taylor, Tobi Alegbe.
We thank the following people for their extensive assistance in the development of this pipeline: Monika Krzak
If you would like to contribute to this pipeline, please see the contributing guidelines.
For further information or help, don't hesitate to get in touch on the Slack #yascp
channel (you can join with this invite).
Currently pipeline has not been published but we would appreciate if you coul please acknowlage the use of this pipeline in your work.
An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md
file.
You can cite the nf-core
publication as follows:
The nf-core framework for community-curated bioinformatics pipelines. Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen. Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.