This project forked from standage/genhub

Explore eukaryotic genome composition and organization with iLoci

License: BSD 3-Clause "New" or "Revised" License

GenHub

GenHub is a free open-source software framework for analyzing eukaryotic genome content and organization. The Fidibus program calculates and reports a variety of statistics on interval loci (iLoci). Fidibus can analyze user-supplied genomes, and can also retrieve and process dozens of reference genomes directly from public databases (such as NCBI RefSeq) for easily reproducible comparative analysis.

For more information, see the GenHub user manual.

Obtaining GenHub

The easiest way to obtain GenHub is to install from the Python Package Index (PyPI) using the pip command.

pip install genhub

Make sure you have GenomeTools and AEGeAn installed. For more info and troubleshooting tips, be sure to check out the complete installation instructions.
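
Before the first run, it can help to confirm that the prerequisites are on your PATH. The following is a minimal sketch, not part of GenHub itself. The executable names are assumptions: `gt` for GenomeTools, `locuspocus` for AEGeAn, and `fidibus` for GenHub. The helper function `check_tool` is hypothetical.

```shell
#!/bin/sh
# Report whether a command is available on the PATH.
# check_tool is a hypothetical helper, not something GenHub provides.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Assumed executable names: gt (GenomeTools), locuspocus (AEGeAn),
# fidibus (GenHub). Adjust these if your installation differs.
for tool in gt locuspocus fidibus; do
    check_tool "$tool" || true
done
```

If any tool is reported missing, the complete installation instructions are the place to look.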

Quick start: example usages

# Show all configuration settings
fidibus --help

# Compute iLoci for a user-supplied genome
fidibus --workdir=./ --local --gdna=MyGenome.fasta --gff3=MyAnnotation.gff3 \
        --prot=MyProteins.fasta --label=Gnm1 \
        prep iloci

# List all available reference genomes
fidibus list

# Download and pre-process the budding yeast genome, but do not compute iLoci
fidibus --workdir=/opt/data/genomes/ --refr=Scer download prep

# Download and completely process a few dozen Hymenopteran genomes, 4 at a time
fidibus --workdir=/opt/data/genomes/ --refrbatch=hymenoptera --numprocs=4 \
        download prep iloci breakdown stats

# Download 9 green algae genomes, cluster proteins to identify homologous iLoci
fidibus --workdir=~/mydata/ --refrbatch=chlorophyta --numprocs=6 \
        download prep iloci breakdown cleanup cluster

# Process a user-supplied genome and several reference genomes for comparison
fidibus --workdir=/data/ --numprocs=4 --local --gdna=MyGenome.fasta \
        --gff3=MyAnnotation.gff3 --prot=MyProteins.fasta --label=Gnm1 \
        --refr=Atha,Bdis,Bole,Cari,Gmax,Grai,Mtru,Osat,Tcac \
        download prep iloci breakdown stats

For more detailed instructions on running Fidibus and other ancillary scripts, see the user manual.

Citing GenHub

GenHub is research software and must be cited if it is used in a published research project. GenHub will soon be in print, but in the meantime it can be cited as follows.

Standage DS, Brendel VP (2016) GenHub. GitHub repository, https://github.com/standage/genhub.

Additional Details

GenHub was originally dubbed HymHub and designed specifically to facilitate reproducible analysis of hymenopteran genomes. The need for a more general solution motivated the development of GenHub in its current incarnation. Rather than distributing processed data (which can occupy more than 1 GB of storage space per genome), GenHub provides portable code so that researchers can easily process reference genomes on their own computing resources. This is all tied closely to our research philosophy and our conviction that published computational results (along with supporting software and data) should be reproducible and transparent. More recently, we have implemented support for processing user-supplied non-reference genomes.
