This project forked from moka-guys/dnanexus_mokabed

Utilises the moka-guys/mokabed code to generate BED files

dnanexus_mokabed - v1.3

What does this app do?

This app utilises the MokaBED code v1.2 to generate BED files.

The app requires a list of gene symbols or transcripts and a set of parameters. These are used to query the UCSC databases (cruzdb_refGene.db and gbCdnaInfo.db), and the app outputs bed files in a range of formats (more below).

What data are required for this app to run?

This app requires an input file:

This input file should be named PanX.txt, where X is the panel number generated by Moka. This number will be used to name all the bed files produced.

  1. If using a list of genes:

    • The list of gene symbols must contain at least 1 column.
    • The first column must contain a list of gene symbols.
    • A header line must be present.
    • You must not mix accessions with gene symbols.

    Eg:

    GeneSymbol

    A1BG

    A1CF

  2. If using a list of transcripts:

    • The list of accessions must contain at least two columns.
    • A header line must be present (each column has a header).
    • The first column should contain the list of NM accessions without version numbers.
    • You must not mix gene symbols with accessions in this first column.
    • There is no stipulation on what the second column should contain but generally it contains the corresponding gene symbol for the accession in the first column.
    • The third column can be used to specify the accession version number.

    Eg:

    Accession ApprovedSymbol GuysAccessionVersion

    NM_022051 EGLN1 1

    NM_000143 FH 2

    NM_001077196 PDE11A 3
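
The input rules above can be checked before the app is run. The following is a minimal sketch, not the MokaBED implementation: the function names `panel_number` and `classify_input` are hypothetical, and it assumes that the panel number is made up of digits, that columns are separated by whitespace, and that any first-column value starting with `NM_` is meant to be an accession.

```python
import re
from pathlib import Path

# Assumption: an accession is "NM_" followed by digits, with no ".version" suffix.
ACCESSION_RE = re.compile(r"NM_\d+")


def panel_number(filename):
    """Return X (as a string) from a file named PanX.txt, or raise ValueError."""
    match = re.fullmatch(r"Pan(\d+)\.txt", Path(filename).name)
    if match is None:
        raise ValueError(f"{filename!r} is not named PanX.txt")
    return match.group(1)


def classify_input(lines):
    """Return ("genes" or "transcripts", first-column values below the header)."""
    rows = [line.split() for line in lines if line.strip()]
    if len(rows) < 2:
        raise ValueError("expected a header line followed by at least one row")
    body = rows[1:]  # the first non-blank line is treated as the header
    first_column = [row[0] for row in body]
    looks_like_accession = [value.startswith("NM_") for value in first_column]
    if not any(looks_like_accession):
        return "genes", first_column
    if not all(looks_like_accession):
        raise ValueError("gene symbols and accessions must not be mixed")
    if any(ACCESSION_RE.fullmatch(value) is None for value in first_column):
        raise ValueError("accessions must be given without version numbers")
    if any(len(row) < 2 for row in body):
        raise ValueError("a transcript list needs at least two columns")
    return "transcripts", first_column
```

For example, `classify_input(["GeneSymbol", "A1BG", "A1CF"])` returns `("genes", ["A1BG", "A1CF"])`, while a list that mixes `A1BG` with `NM_000143` raises a `ValueError`.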

The app also requires a number of parameters:

  1. Coding up/down

Extends the coding exons by this number of bases.

  2. Up/Down

Extends the UTR (non-coding exons) by this number of bases.

  3. Merge boundaries

A boolean flag which, if set, combines any overlapping exons or regions into a single line in the bed file. E.g.:

Chr1	100	150

Chr1	140	200

If merge boundaries is True the above would be output as:

Chr1	100	200

  4. Remove Chr

A boolean flag which, if set, does not add the string ‘Chr’ to the chromosome field of the bed file.
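
The combined effect of the four parameters can be sketched as follows. This is an illustration under stated assumptions rather than the MokaBED code: the `Exon` type, the `build_regions` function and its argument names are hypothetical, padding is clamped at 0, and regions that touch end to start are merged along with regions that overlap.

```python
from typing import NamedTuple


class Exon(NamedTuple):
    chrom: str    # chromosome name without any prefix, e.g. "1"
    start: int
    end: int
    coding: bool  # True for a coding exon, False for a UTR (non-coding) exon


def build_regions(exons, coding_pad=0, utr_pad=0, merge=False, remove_chr=False):
    """Apply the four parameters and return sorted (chrom, start, end) tuples."""
    regions = []
    for exon in exons:
        # Coding exons and UTR exons are extended by their own padding value.
        pad = coding_pad if exon.coding else utr_pad
        chrom = exon.chrom if remove_chr else "Chr" + exon.chrom
        # Assumption: clamp at 0 so padding never gives a negative coordinate.
        regions.append((chrom, max(0, exon.start - pad), exon.end + pad))
    regions.sort()
    if not merge:
        return regions
    merged = []
    for chrom, start, end in regions:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps (or touches) the previous region: extend that region.
            previous = merged[-1]
            merged[-1] = (chrom, previous[1], max(previous[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


# The merge example from the text above.
exons = [Exon("1", 100, 150, True), Exon("1", 140, 200, True)]
print(build_regions(exons, merge=True))  # [('Chr1', 100, 200)]
```

Padding is applied before merging in this sketch, so two exons that only overlap once extended are also combined; whether the app uses the same order is not stated here.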

What does this app output?

This app produces at least four outputs:

  1. PanXdata.bed

This bed file is used by the HS Metrics function to generate QC.

  2. PanXdataRefSeqFormat.txt

This file was used by GATK depth of coverage (not currently in use).

  3. PanXdatasambamba.bed

This bed file is used to calculate clinical coverage.

  4. PanX_LogFile.txt

Contains all the commands and instructions used to create this bed file.
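
Because every output name is built from the panel number, the expected file names for a run can be listed in advance. This is a small sketch that assumes X in the names above is replaced directly by the panel number; the function name `expected_outputs` and the panel number 123 are hypothetical.

```python
def expected_outputs(panel_number):
    """Return the four output file names listed above for one panel."""
    prefix = f"Pan{panel_number}"
    return [
        f"{prefix}data.bed",
        f"{prefix}dataRefSeqFormat.txt",
        f"{prefix}datasambamba.bed",
        f"{prefix}_LogFile.txt",
    ]


# Hypothetical panel number, for illustration only.
for name in expected_outputs(123):
    print(name)
```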

If using a list of gene symbols two further files are produced:

  1. Synonymsnotinrefgene

Lists the gene symbols that were loaded from the gene symbol list that were not found in the RefGene table.

  2. Synonymsnocodingregions

Lists gene symbols that were not found to have any coding regions.

If any gene symbols are flagged in either of these files, the gene symbol list will need to be discussed with the bioinformatics team.
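
A check of this kind could be scripted as below. This is a sketch only: it assumes each of the two files is plain text with one gene symbol per line, the helper names `flagged_symbols` and `needs_review` are hypothetical, and the file paths are supplied by the caller because the full output file names are not given here.

```python
from pathlib import Path


def flagged_symbols(path):
    """Return the non-blank lines of a flagged-symbol file; a missing file counts as empty."""
    file = Path(path)
    if not file.exists():
        return []
    return [line.strip() for line in file.read_text().splitlines() if line.strip()]


def needs_review(*paths):
    """Return True if any of the given files lists at least one gene symbol."""
    return any(flagged_symbols(path) for path in paths)
```

`needs_review` would be called with the paths of the two files produced for a gene symbol run; a `True` result means the list should go to the bioinformatics team.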

If using transcripts and one cannot be found, the script will stop with an error.

Limitations of the App

The following UCSC databases are queried:

  • cruzdb_refGene.db
  • gbCdnaInfo.db

Created by

This app was created within the Viapath Genome Informatics section.

Contributors

natashapinto, andyb3, trellixvulnteam, woook