
genelab_analysis's Introduction

download the bam files

Use the Python module and instructions in Bam_Download to build a list of bam files and download them from the web library into the GLDS_XXX working directory.

Run samtools within the GLDS_XXX working directory to merge the replicates of each sample into a single bam file, e.g.: samtools merge -o ./merged/GLDS.merged.bam GLDS_Rep1.bam GLDS_Rep2.bam GLDS_Rep3.bam
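When a dataset has several sample groups, the merge step can be wrapped in a small helper. The following is a minimal sketch, not a script from this repository: the function name merge_replicates is hypothetical, and the <prefix>_Rep<N>.bam naming pattern is an assumption taken from the example above.

```shell
#!/bin/sh
# Sketch only: merge_replicates is a hypothetical helper, not part of this repository.
# Assumes replicates are named <prefix>_Rep<N>.bam, as in the example above.

merge_replicates() (
    prefix="$1"                 # e.g. GLDS
    outdir="${2:-./merged}"     # directory that receives the merged bam

    # Expand the replicate pattern into the positional parameters.
    set -- "${prefix}"_Rep*.bam

    # An unmatched pattern stays literal, so check that the first entry exists.
    if [ ! -e "$1" ]; then
        echo "no replicate bams found for prefix ${prefix}" >&2
        exit 1
    fi

    mkdir -p "$outdir"
    # Same form as the command above: -o names the output, the inputs follow.
    samtools merge -o "${outdir}/${prefix}.merged.bam" "$@"
)
```

A call such as merge_replicates GLDS ./merged would then reproduce the example command for however many GLDS_Rep*.bam files are present. The function body runs in a subshell so that its variables do not leak into the calling shell.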

Remove the original (unmerged) bam files.

genelab_analysis

The script files are placed in the GLDS_XXX directory. The scripts expect the merged bam files to be located at ./bams/merged.

Submit run_splice.sh using qsub

run_splice.sh calls wrap_splicer.sh

wrap_splicer.sh takes the reference files as input and loops through the .bam files found in ./bams/merged. For each bam file, the loop creates a folder named after it in the GLDS_XXX working directory, calls samtools to expand the bam to sam, and runs spliceGrapher-light.sh.

spliceGrapher-light.sh runs the SpliceGrapher protocol and then cleans up the extra sam files created during the process.

Finally, wrap_splicer.sh removes the expanded sam file and starts the loop again with the next bam file found in the ./bams/merged directory.
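The loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of wrap_splicer.sh: the function name process_merged_bams is hypothetical, the sam file is assumed to be written inside the per-sample folder, and the per-sample command is passed in by the caller (with the sam path appended as its last argument) because the real arguments of spliceGrapher-light.sh are not shown here.

```shell
#!/bin/sh
# Sketch only: process_merged_bams is a hypothetical stand-in for the loop
# described above, not the real wrap_splicer.sh.
# Usage: process_merged_bams <bam_dir> <command> [command args...]

process_merged_bams() (
    bam_dir="$1"
    shift                                   # remaining arguments form the per-sample command

    for bam in "$bam_dir"/*.bam; do
        [ -e "$bam" ] || continue           # skip the literal pattern when nothing matches

        sample=$(basename "$bam" .bam)
        mkdir -p "./$sample"                # one folder per bam in the working directory
        sam="./$sample/$sample.sam"         # assumed location of the expanded sam

        # Expand the bam to sam, keeping the header (-h).
        samtools view -h -o "$sam" "$bam" || exit 1

        # Run the per-sample step; the sam path is appended as the last argument (assumption).
        "$@" "$sam" || exit 1

        rm -f "$sam"                        # remove the expanded sam before the next bam
    done
)
```

A call might look like process_merged_bams ./bams/merged ./spliceGrapher-light.sh, with whatever reference-file arguments the real script requires added after the script name. Stopping at the first failure keeps a broken sample from being silently skipped.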

statistics

Use the GLDS Comparison Template to create summaries and A vs B comparisons of test sets. Use the per-chromosome summary lists of genes and counts generated by the SpliceGrapher process (ex: chr1_summary.csv) to populate the sheet for each respective chromosome of test sets "A" and "B" (whichever samples you are comparing). The sheet logic populates the Summary sheet with counts and colors the cells relative to a threshold of 1500 (above is red, below is blue). It also creates lists of genes for all chromosomes (1-5) in columns A and D of the Unique Genes sheet.

To finish the comparison of unique genes, use the bash script Find_Unique_Genes.sh to compare the two resulting columns from the Unique Genes sheet of the GLDS Comparison Template. Save the lists of genes from columns A and D into files called "input1" and "input2", respectively. The output files are a list of genes unique to A (A vs D), a list of genes unique to D (D vs A), and the combined overlap. Enter these outputs into columns B, E, and H of the Unique Genes sheet.
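For readers who want to see what such a comparison involves, here is a minimal sketch using sort and comm. It is not the contents of Find_Unique_Genes.sh: the function name compare_gene_lists and the three output file names are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: compare_gene_lists is a hypothetical helper, not Find_Unique_Genes.sh.
# The output file names below are assumptions.
# Usage: compare_gene_lists <input1> <input2> [output_dir]

compare_gene_lists() (
    in1="$1"
    in2="$2"
    outdir="${3:-.}"

    # comm requires both inputs sorted with the same collation.
    LC_ALL=C
    export LC_ALL

    sorted1=$(mktemp) || exit 1
    sorted2=$(mktemp) || exit 1
    sort -u "$in1" > "$sorted1"             # sort and drop duplicate gene names
    sort -u "$in2" > "$sorted2"

    comm -23 "$sorted1" "$sorted2" > "$outdir/unique_to_input1.txt"   # only in input1
    comm -13 "$sorted1" "$sorted2" > "$outdir/unique_to_input2.txt"   # only in input2
    comm -12 "$sorted1" "$sorted2" > "$outdir/shared.txt"             # in both lists

    rm -f "$sorted1" "$sorted2"
)
```

Running compare_gene_lists input1 input2 would write the three lists to the current directory, ready to paste into the spreadsheet columns. Lists copied from a spreadsheet can carry trailing whitespace or Windows line endings, which would make otherwise identical gene names compare as different, so it is worth checking the input files first.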

genelab_analysis's People

Contributors

learmonj
