rpkm_rnaseq_count's Introduction

RPKM_normalization

RPKM for RNAseq V1.3

Usage for sample input provided:

perl rpkm_script_beta.pl sample_count_test.count 2:9 28 > sample_count_test.rpkm

Description

In the example above, the file 'sample_count_test.count' holds count data in columns 2 to 9,
and column 28 holds the length of each gene, calculated from the Gencode GTF (see NOTE below).

General usage:

perl rpkm_script_beta.pl input_count_file.txt ActualColumnStart:ActualColumnEnd ColumnGeneLength > OUTPUT_RPKM_FILE 

ActualColumnStart = First column that contains counts. For example, if GeneID is in the first column and counts start in the second column, this should be '2'

ActualColumnEnd = Last column for which you need RPKM

ColumnGeneLength = Column that contains the length of each gene (see **NOTE below)
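The column arguments are 1-based and the count range includes both ends, so '2:9' covers eight columns. The following is a minimal Python sketch of that interpretation, not the Perl script's own parsing code (which is not shown here); the function name `parse_column_spec` is hypothetical.

```python
def parse_column_spec(count_range, length_col):
    """Turn 1-based CLI arguments such as '2:9' and '28' into 0-based indices.

    Assumption: the range is inclusive on both ends, as the usage text suggests.
    """
    start_text, end_text = count_range.split(":")
    start, end = int(start_text), int(end_text)
    length = int(length_col)
    if start < 1 or end < start or length < 1:
        raise ValueError("columns are 1-based and start must not exceed end")
    # range() excludes its stop value, so 'end' (1-based) is the right stop
    count_indices = list(range(start - 1, end))
    return count_indices, length - 1


count_indices, length_index = parse_column_spec("2:9", "28")
print(count_indices)  # 0-based positions of the count columns
print(length_index)   # 0-based position of the gene length column
```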

**NOTE: Steps to prepare your input

  1. Obtain the length of each gene from the Gencode GTF with the following command (successfully tested up to Gencode V19):
     cat gencode.vXX.annotation.gtf | awk -F'\t' '{if($3=="gene") {split($9,a,";"); print a[1]"\t"$5-$4};}' | sed 's/[gene_id |"|]//g' > YOUR_GENE_LENGTH_FILE
  2. Combine input_count_file.txt and YOUR_GENE_LENGTH_FILE by GeneID (the first column):
     join -j1  <(sort input_count_file.txt) <(sort YOUR_GENE_LENGTH_FILE) > OUTPUT_ANNOTATED_COUNT_FILE
  3. Run the script on OUTPUT_ANNOTATED_COUNT_FILE:
     perl rpkm_script_beta.pl OUTPUT_ANNOTATED_COUNT_FILE ActualColumnStart:ActualColumnEnd ColumnGeneLength > OUTPUT_ANNOTATED_RPKM_FILE
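The preparation steps above can also be sketched in Python, which may help when checking what the awk and join commands produce. This is an illustration under stated assumptions, not part of the script: the function names are hypothetical, the GTF line and gene IDs are made-up example values, and `gene_id` is assumed to be the first attribute in column 9, as the awk command also assumes. By default the length is end minus start, mirroring `$5-$4` in the command; GTF coordinates are 1-based and inclusive, so `inclusive=True` gives the full span (end - start + 1).

```python
def gene_lengths_from_gtf(lines, inclusive=False):
    """Map gene_id -> length for every 'gene' feature in GTF-formatted lines."""
    lengths = {}
    for line in lines:
        if line.startswith("#"):
            continue  # skip GTF header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue
        start, end = int(fields[3]), int(fields[4])
        # Assumption: the first attribute looks like: gene_id "SOME_ID"
        key, _, value = fields[8].split(";")[0].strip().partition(" ")
        if key != "gene_id":
            continue
        lengths[value.strip('"')] = end - start + (1 if inclusive else 0)
    return lengths


def annotate_counts(count_rows, lengths):
    """Append the gene length to each row whose GeneID (first field) has one.

    Rows without a matching GeneID are dropped, like the default behaviour of join.
    Unlike the sort + join pipeline, the input row order is kept.
    """
    return [row + [lengths[row[0]]] for row in count_rows if row[0] in lengths]


# Made-up example data, for illustration only
gtf_lines = [
    "##description: example header\n",
    'chr1\tSRC\tgene\t1000\t1999\t.\t+\t.\tgene_id "GENE_A.1"; gene_name "A";\n',
    'chr1\tSRC\texon\t1000\t1200\t.\t+\t.\tgene_id "GENE_A.1"; gene_name "A";\n',
]
lengths = gene_lengths_from_gtf(gtf_lines)
annotated = annotate_counts([["GENE_A.1", 10, 20], ["GENE_B.1", 5, 7]], lengths)
print(lengths)
print(annotated)
```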

RPKM calculation

RPKM = (10^9 * C)/(N * L), where

C = Number of reads mapped to a gene

N = Total mapped reads in the experiment

L = Length of the gene in base pairs
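A short Python sketch of the formula above, applied to a small table. It is not the Perl script itself. It assumes that N for each sample is the sum of that sample's count column, which may differ from how the script obtains the total mapped reads; the function names and example values are hypothetical.

```python
def rpkm(count, total_mapped, length_bp):
    """RPKM = (10^9 * C) / (N * L)."""
    if total_mapped <= 0 or length_bp <= 0:
        raise ValueError("total mapped reads and gene length must be positive")
    return (1e9 * count) / (total_mapped * length_bp)


def rpkm_table(rows, count_indices, length_index):
    """Replace the counts in the given 0-based columns with RPKM values.

    Assumption: N is taken as the column sum of counts for each sample.
    """
    totals = {i: sum(float(row[i]) for row in rows) for i in count_indices}
    result = []
    for row in rows:
        new_row = list(row)
        for i in count_indices:
            new_row[i] = rpkm(float(row[i]), totals[i], float(row[length_index]))
        result.append(new_row)
    return result


# Made-up example: GeneID, one count column, gene length in base pairs
rows = [["geneA", 500, 2000], ["geneB", 1500, 1000]]
table = rpkm_table(rows, count_indices=[1], length_index=2)
for row in table:
    print(row)
```

Here N = 500 + 1500 = 2000, so geneA gets 10^9 * 500 / (2000 * 2000) = 125000 and geneB gets 10^9 * 1500 / (2000 * 1000) = 750000.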

Author: Santhilal Subhash

Contact: [email protected]
