getLCA

GUIDE TO RUNNING THE SCRIPT get_LCA.py

1) Clone the getLCA repository and change into it

    git clone https://github.com/frederikseersholm/getLCA
    cd getLCA

2) Download the NCBI taxonomy files, names.dmp and nodes.dmp

    mkdir taxdump
    wget ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz
    tar -xzf taxdump.tar.gz -C taxdump
    rm taxdump.tar.gz

3) Update the paths to names.dmp and nodes.dmp on lines 3 and 4 of the script get_LCA_functions.py
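
For orientation, the sketch below shows one way the two taxonomy files could be read in Python 3. It relies only on the NCBI dump layout (fields separated by a tab, a pipe and a tab, with each line ending in a tab and a pipe). The function names `split_dmp`, `parse_nodes` and `parse_names` are hypothetical and are not taken from get_LCA_functions.py.

```python
def split_dmp(line):
    """Split one line of an NCBI .dmp file into its fields."""
    # Drop the newline and the trailing "\t|" terminator, then split on "\t|\t".
    return line.rstrip("\n").rstrip("\t|").split("\t|\t")


def parse_nodes(lines):
    """Map each tax_id to (parent_tax_id, rank) from nodes.dmp lines."""
    nodes = {}
    for line in lines:
        tax_id, parent_id, rank = split_dmp(line)[:3]
        nodes[int(tax_id)] = (int(parent_id), rank)
    return nodes


def parse_names(lines):
    """Map each tax_id to its scientific name from names.dmp lines."""
    names = {}
    for line in lines:
        tax_id, name, _unique_name, name_class = split_dmp(line)[:4]
        # names.dmp also lists synonyms and common names; keep one name per taxon.
        if name_class == "scientific name":
            names[int(tax_id)] = name
    return names


# Example use, assuming the files were unpacked into ./taxdump as in step 2:
#     with open("taxdump/nodes.dmp") as handle:
#         nodes = parse_nodes(handle)
```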

4) Map reads to the reference database using bowtie2 (or a mapper of your choice); see Prepare_ref_DB_guide.md

    bowtie2 -k 500 -p 24 -f -x $DB -U $infile --no-unal > $filename.unsorted.sam

5) Sort the SAM file by read name

    sort -k1 $filename.unsorted.sam > $filename.sam
    rm $filename.unsorted.sam
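
Because bowtie2 is run with -k 500, a read can have many alignment lines, and sorting on the first column places them next to each other. The Python 3 sketch below shows how a sorted file could then be consumed one read at a time. Whether get_LCA.py reads its input this way is an assumption, and `hits_per_read` is a hypothetical name.

```python
from itertools import groupby


def hits_per_read(sam_lines):
    """Yield (read_name, [reference_names]) from SAM lines sorted by read name."""
    # Skip header lines (they start with "@") and split each record into columns.
    records = (
        line.rstrip("\n").split("\t")
        for line in sam_lines
        if not line.startswith("@")
    )
    # groupby only merges adjacent records, which is why the file must be sorted.
    for read_name, group in groupby(records, key=lambda fields: fields[0]):
        # Column 3 of a SAM record (index 2) is the reference sequence name.
        yield read_name, [fields[2] for fields in group]
```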

6) Assign LCA

    python get_LCA.py $filename.sam
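
To illustrate the general idea of a lowest common ancestor (LCA) assignment, the Python 3 sketch below finds the deepest taxon shared by a set of taxonomy IDs, given a child-to-parent map such as one built from nodes.dmp. It is a minimal illustration and not the code in get_LCA.py; the function names and the toy tree in the usage note are made up for the example.

```python
def lineage(tax_id, parent):
    """Return the path from tax_id up to the root, inclusive."""
    path = [tax_id]
    # In the NCBI taxonomy the root (tax_id 1) is listed as its own parent.
    while parent[tax_id] != tax_id:
        tax_id = parent[tax_id]
        path.append(tax_id)
    return path


def lowest_common_ancestor(tax_ids, parent):
    """Return the deepest node shared by the lineages of all tax_ids."""
    tax_ids = list(tax_ids)  # must contain at least one ID
    common = lineage(tax_ids[0], parent)
    for tax_id in tax_ids[1:]:
        seen = set(lineage(tax_id, parent))
        # Keep only the ancestors shared so far, in leaf-to-root order.
        common = [node for node in common if node in seen]
    # The root is on every lineage, so the list is never empty.
    return common[0]
```

With a toy map `parent = {1: 1, 10: 1, 20: 10, 21: 10, 30: 1}`, the call `lowest_common_ancestor([20, 21], parent)` gives 10, while `lowest_common_ancestor([20, 30], parent)` falls back to the root, 1.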

OPTIONAL:

7) Report a sorted list, with counts, of vertebrate taxa assigned at family level or below:

    cat $filename.getLCA | grep -v 'NOMATCH' | grep 'genus\|family\|species\|subfamily\|subspecies\|subgenus' | grep 'Vertebrata' | awk '{print $2}' | sort | uniq -c | sort -nk1 | awk -F ':' '{print $1}'
