SINBAD: A pipeline for processing SINgle cell Bisulfite sequencing samples and Analysis of Data

SINBAD is an R package for processing single cell DNA methylation data. It accepts fastq files as input and performs demultiplexing, adapter trimming, mapping, quantification, dimensionality reduction, and differential methylation analysis for single cell DNA methylation datasets.

NOTE: SINBAD 0.2 is tested on paired snmC-Seq data.

System requirements

R 3.6.0 or later is required for installation.
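
One way to confirm this requirement before installing is to compare the running R version against 3.6.0. The following is a minimal sketch using base R only; the helper name `meets_r_requirement` is hypothetical and not part of SINBAD.

```r
# Hypothetical helper (not part of SINBAD): TRUE if the running R
# version is at least the given minimum.
meets_r_requirement <- function(minimum = "3.6.0") {
  getRversion() >= package_version(minimum)
}

if (!meets_r_requirement()) {
  stop("SINBAD requires R 3.6.0 or later; found ", as.character(getRversion()))
}
```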

Installation

To install SINBAD, install vioplot first:

install.packages("vioplot")

Then, with the devtools package installed, run the following commands at the R command prompt:

library(devtools)
install_github("yasin-uzun/SINBAD.0.2")
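
Before running the commands above, it can help to check which prerequisite packages are already present. The sketch below reports any that are missing; the helper name `missing_packages` is hypothetical, and the install line is left commented out so nothing is downloaded without your say-so.

```r
# Hypothetical helper (not part of SINBAD): return the packages from
# `pkgs` that cannot be found in the current library paths.
missing_packages <- function(pkgs) {
  installed <- vapply(
    pkgs,
    function(p) nzchar(system.file(package = p)),
    logical(1)
  )
  pkgs[!installed]
}

to_install <- missing_packages(c("vioplot", "devtools"))
if (length(to_install) > 0) {
  message("Not installed yet: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from CRAN
}
```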

Once you have installed SINBAD, verify that it is installed correctly as follows:

library(SINBAD)
SINBAD::test()

If SINBAD is installed without any problems, you should see the following message:
SINBAD installation is ok.

Dependencies

To run SINBAD, you need the following underlying software:

  • Adapter trimmer: Cutadapt, Trim Galore, or Trimmomatic
  • Aligner: Bismark (with Bowtie), BSMAP, or BS3
  • Duplicate removal: samtools or Picard
  • Demultiplexer: demultiplex_fastq.pl Perl script (see below)

Note that you only need to install the tools you will actually use, i.e., you don't need BSMAP or BS3 if you will only use Bismark as the aligner.
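
To see which of these tools are reachable from R, you can look them up on the PATH. This is a sketch only: the helper `locate_tools` is hypothetical, the executable names (`cutadapt`, `bismark`, `samtools`) are assumptions about how the tools are installed on your system, and SINBAD itself takes its program paths from the configuration files rather than from this check.

```r
# Hypothetical helper (not part of SINBAD): look up each executable on
# the PATH and report where it was found (an empty path means not found).
locate_tools <- function(tools) {
  paths <- unname(Sys.which(tools))
  data.frame(
    tool = tools,
    path = paths,
    found = nzchar(paths),
    stringsAsFactors = FALSE,
    row.names = NULL
  )
}

# Executable names are assumptions; list only the tools you plan to use.
print(locate_tools(c("cutadapt", "bismark", "samtools")))
```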

You can install these tools yourself. For convenience, we provide the binaries here. Please cite the specific tool when you use it, in addition to SINBAD.

You can download the demultiplex_fastq.pl script from here.

You also need the genomic sequence and annotated genomic regions for quantification of methylation calls. We provide the sequence data for the hg38 assembly here.

Configuration

To run SINBAD, you need to modify three configuration files:

  • config.general.R : Sets the program paths to be used by SINBAD. You need to edit this file only once.
  • config.genome.R : Sets the genomic information and paths to be used by SINBAD. You need to generate one for each organism. We provide a built-in configuration for hg38.
  • config.project.R : You need to configure this file for your project.

You can download the templates for the configuration files from here and edit them for your purposes.
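
A quick sanity check is to confirm that all three files are present in your configuration directory before running the pipeline. The sketch below assumes the files keep the template names listed above; `missing_config_files` is a hypothetical helper, and the temporary directory is only there to make the example self-contained.

```r
# Hypothetical helper (not part of SINBAD): list which of the three
# configuration files named above are missing from `config_dir`.
# Assumes the files keep the template names.
missing_config_files <- function(config_dir) {
  expected <- c("config.general.R", "config.genome.R", "config.project.R")
  expected[!file.exists(file.path(config_dir, expected))]
}

# Demo on a throwaway directory that holds only one of the three files.
demo_dir <- tempfile("sinbad_config_")
dir.create(demo_dir)
file.create(file.path(demo_dir, "config.general.R"))
print(missing_config_files(demo_dir))
```

In practice you would pass your own configuration directory instead of `demo_dir`; an empty result means all three files were found.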

Running

SINBAD is run in two steps:

  1. Read configuration files:
read_configs(config_dir)

config_dir should point to your configuration file directory (mentioned above).

  2. Process data:
process_sample_wrapper(raw_fastq_dir, demux_index_file, working_dir, sample_name)
  • raw_fastq_dir should point to the directory containing fastq files as the input.
  • demux_index_file should point to the demultiplexing index file for the fastq files.
  • working_dir should point to the directory where all the outputs will be written.
  • sample_name (optional) is the name for the sample or project.

This function reads fastq files, demultiplexes them into single cells, performs filtering, mapping (alignment), DNA methylation calling and quantification, dimensionality reduction, clustering and differential methylation analysis for the given input. All the outputs are placed into related directories in working_dir.
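
Because a full run can take a long time, it may be worth checking the inputs first. The sketch below is not part of SINBAD: `check_inputs` is a hypothetical pre-flight helper, and every path in the commented call sequence is a placeholder. The two SINBAD calls are the ones described above, shown with the same argument order.

```r
# Hypothetical pre-flight check (not part of SINBAD): collect problems
# with the inputs before starting a pipeline run. Creates working_dir
# if it does not exist yet.
check_inputs <- function(raw_fastq_dir, demux_index_file, working_dir) {
  problems <- character(0)
  if (!dir.exists(raw_fastq_dir)) {
    problems <- c(problems, paste("raw_fastq_dir not found:", raw_fastq_dir))
  }
  if (!file.exists(demux_index_file)) {
    problems <- c(problems, paste("demux_index_file not found:", demux_index_file))
  }
  if (!dir.exists(working_dir) &&
      !dir.create(working_dir, recursive = TRUE, showWarnings = FALSE)) {
    problems <- c(problems, paste("cannot create working_dir:", working_dir))
  }
  problems
}

# Example call sequence; all paths and the sample name are placeholders.
# library(SINBAD)
# problems <- check_inputs("/path/to/fastq", "/path/to/index.txt", "/path/to/out")
# if (length(problems) > 0) stop(paste(problems, collapse = "\n"))
# read_configs("/path/to/configs")
# process_sample_wrapper("/path/to/fastq", "/path/to/index.txt", "/path/to/out", "my_sample")
```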

Example Data

To try SINBAD with some example data, please contact the authors (see below).

Citation

If you use SINBAD in your study, please cite it as follows: SINBAD: A pipeline for processing SINgle cell Bisulfite sequencing samples and Analysis of Data, GitHub, 2020.

Contact

For any questions or comments, please contact Yasin Uzun (uzuny at email chop edu)
