jhuanglab / bioinstaller

A comprehensive R package to construct interactive and reproducible biological data analysis applications based on the R platform

License: Other

Languages: R 93.47%, Makefile 2.15%, Dockerfile 4.38%
Topics: bioinformatics-analysis, ngs-analysis, installer-script, installer

bioinstaller's Introduction

BioInstaller

Introduction

The increase in bioinformatics resources such as tools/scripts and databases poses a great challenge for users seeking to construct interactive and reproducible biological data analysis applications.

The R language, one of the most popular programming languages for statistics, biological data analysis, and big data, has enabled a diverse collection of free R packages (>14000) for different types of applications. However, because R lacks a high-performance, open-source cloud platform (comparable to Galaxy for Python users), it is still difficult for R users, especially those without web development skills, to construct interactive and reproducible biological data analysis applications that support file upload and management, long-running computation, task submission, tracking of output files, exception handling, logging, export of plots and tables, and an extensible plugin system.

The collection, management, and sharing of various bioinformatics tools/scripts and databases are also essential for almost all bioinformatics analysis projects.

Here, we established a new platform for constructing interactive and reproducible biological data analysis applications based on the R language. The platform offers several user interfaces, including R functions, an R Shiny application, and REST APIs, and supports collecting, managing, sharing, and using large numbers of bioinformatics tools/scripts and databases.

Features:

  • Easy-to-use
  • User-friendly Shiny application
  • Integrative platform of Databases and bioinformatics resources
  • Open source and completely free
  • One-click download and installation of bioinformatics resources (via R, Shiny, or OpenCPU REST APIs)
  • A focus on software and database resources that are not yet covered by other tools
  • Logging
  • System monitor
  • Task submitting system
  • Parallel tasks

Field

  • Quality Control
  • Alignment And Assembly
  • Alternative Splicing
  • ChIP-seq analysis
  • Gene Expression Data Analysis
  • Variant Detection
  • Variant Annotation
  • Virus Related
  • Statistical and Visualization
  • Noncoding RNA Related Database
  • Cancer Genomics Database
  • Regulator Related Database
  • eQTL Related Database
  • Clinical Annotation
  • Drugs Database
  • Proteomic Database
  • Software Dependence Database
  • ......

Note: We are developing the bget and bioshiny projects independently to simplify the download and Shiny functions.

  • bget is a Go-based command-line tool that does not require installing any R packages.
  • bioshiny is the core Shiny application of the previous BioInstaller package.

Installation

CRAN

#You can install this package directly from CRAN by running (from within R):
install.packages('BioInstaller')

Github

# install.packages("devtools")
devtools::install_github("JhuangLab/BioInstaller")

Shiny application

Note: the Shiny application of BioInstaller has been migrated to the bioshiny project. All Shiny files in this package have been removed to reduce the package size.

In the new project, we are developing more free bioshiny plugins for various kinds of bioinformatics data analysis.

# Append the bioshiny configuration paths to ~/.bashrc, then reload it
echo 'export BIO_SOFTWARES_DB_ACTIVE="~/.bioshiny/info.yaml"' >> ~/.bashrc
echo 'export BIOSHINY_CONFIG="~/.bioshiny/shiny.config.yaml"' >> ~/.bashrc
. ~/.bashrc

# Start the standalone Shiny application
wget https://raw.githubusercontent.com/openbiox/bioshiny/master/bin/bioshiny_deps_r
wget https://raw.githubusercontent.com/openbiox/bioshiny/master/bin/bioshiny_start
chmod a+x bioshiny_deps_r
chmod a+x bioshiny_start
./bioshiny_deps_r

# Start Shiny application workers
Rscript -e "bioshiny::set_shiny_workers(1)"
./bioshiny_start

# or use yarn
yarn global add bioshiny
bioshiny_deps_r
Rscript -e "bioshiny::set_shiny_workers(1)"
bioshiny_start

spack and miniconda are required for extra functions.
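
The two environment variables exported above can be sanity-checked before starting the application. The following is a minimal sketch, not part of BioInstaller or bioshiny: `check_config` is a hypothetical helper, and it assumes both variables should point to existing files. Because a `~` written inside double quotes is not expanded by the shell, the sketch resolves a leading `~` itself.

```python
import os

# Variable names come from the export lines above; check_config is a hypothetical helper.
REQUIRED_VARS = ("BIO_SOFTWARES_DB_ACTIVE", "BIOSHINY_CONFIG")


def check_config(environ=None):
    """Map each required variable to (expanded_path_or_None, file_exists)."""
    environ = os.environ if environ is None else environ
    report = {}
    for name in REQUIRED_VARS:
        value = environ.get(name)
        if value is None:
            report[name] = (None, False)
        else:
            # Resolve a leading "~" in case the shell stored it literally.
            path = os.path.expanduser(value)
            report[name] = (path, os.path.isfile(path))
    return report


if __name__ == "__main__":
    for name, (path, exists) in check_config().items():
        print(name, path, "found" if exists else "missing")
```

A "missing" result only means that the file is not at the resolved path; it says nothing about whether bioshiny creates the file on first start.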

Contributed Resources

Support Summary

Quality Control:

  • FastQC, PRINSEQ, SolexaQA, FASTX-Toolkit ...

Alignment and Assembly:

  • BWA, STAR, TMAP, Bowtie, Bowtie2, tophat2, hisat2, GMAP-GSNAP, ABySS, SSAHA2, Velvet, Edena, Trinity, oases, RUM, MapSplice2, NovoAlign ...

Variant Detection:

  • GATK, Mutect, VarScan2, FreeBayes, LoFreq, TVC, SomaticSniper, Pindel, Delly, BreakDancer, FusionCatcher, Genome STRiP, CNVnator, CNVkit, SpeedSeq ...

Variant Annotation:

  • ANNOVAR, SnpEff, VEP, oncotator ...

Utils:

  • htslib, samtools, bcftools, bedtools, bamtools, vcftools, sratools, picard, HTSeq, seqtk, UCSC Utils(blat, liftOver), bamUtil, jvarkit, bcl2fastq2, fastq_tools ...

Genome:

  • hisat2_reffa, ucsc_reffa, ensemble_reffa ...

Others:

  • sparsehash, SQLite, pigz, lzo, lzop, bzip2, zlib, armadillo, pxz, ROOT, curl, xz, pcre, R, gatk_bundle, ImageJ, igraph ...

Databases:

  • ANNOVAR, blast, CSCD, GATK_Bundle, biosystems, civic, denovo_db, dgidb, diseaseenhancer, drugbank, ecodrug, expression_atlas, funcoup, gtex, hpo, inbiomap, interpro, medreaders, mndr, msdd, omim, pancanqtl, proteinatlas, remap2, rsnp3, seecancer, srnanalyzer, superdrug2, tumorfusions, varcards ...

Docker

BioInstaller has been usable in Docker since v0.3.0, and the Shiny application has been supported since v0.3.5.

docker pull bioinstaller/bioinstaller
docker run -it -p 80:80 -p 8004:8004 -v /tmp/download:/tmp/download bioinstaller/bioinstaller

Service list:

  • localhost/ocpu/: OpenCPU service
  • localhost/shiny/BioInstaller: Shiny service
  • localhost/rstudio/: RStudio Server (opencpu/opencpu)
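
Once the container is running, the three services listed above can be probed from the host. This is a minimal sketch under stated assumptions: the paths are taken from the service list, while the `http` scheme, the default host, and the helper names `service_urls` and `is_reachable` are illustrative and not part of BioInstaller.

```python
import urllib.error
import urllib.request

# Paths taken from the service list above; scheme and default host are assumptions.
SERVICE_PATHS = {
    "opencpu": "/ocpu/",
    "shiny": "/shiny/BioInstaller",
    "rstudio": "/rstudio/",
}


def service_urls(host="localhost"):
    """Build one URL per service for the given host."""
    return {name: "http://" + host + path for name, path in SERVICE_PATHS.items()}


def is_reachable(url, timeout=3.0):
    """Return True if the server sends any HTTP response, False on connection errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, even if with an error status
    except (urllib.error.URLError, OSError):
        return False


if __name__ == "__main__":
    for name, url in service_urls().items():
        print(name, url, "up" if is_reachable(url) else "down")
```

An HTTP error status is treated as "up" here because it still shows that something is listening; a stricter check would inspect the status code.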

Citation

  • Li J, Cui B, Dai Y, et al. BioInstaller: a comprehensive R package to construct interactive and reproducible biological data analysis applications based on the R platform[J]. PeerJ, 2018, 6:e5853.

How to contribute?

Please fork the GitHub BioInstaller repository, modify it, and submit a pull request to us. In particular, please update the file list in the Contributed Resources section when you add a tool or database that is not included in other software repositories.

Maintainer

Jianfeng Li

License

R package:

MIT

Related Other Resources

Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License

bioinstaller's People

Contributors

miachol, xcpanda

bioinstaller's Issues

Share virus related tools

Configuration file: nongithub.toml
Description: HGT-ID: an efficient and sensitive workflow to detect human-viral insertion sites using next-generation sequencing data

[hgtid]
source_url = "http://bioinfo.rjh.com.cn/download/bioinstaller/hgtid/{{version}}"
version_available = ["HGT-ID_v1.0.tar.gz", "sampleResults.zip"]

Share protein function analysis related

Configuration file: nongithub.toml

Description: Effusion is a method for predicting protein function that uses a sequence similarity network to add context for homology transfer, a probabilistic model to account for the uncertainty in labels and function propagation, and the structure of the Gene Ontology (GO) to best utilize sparse input labels and make consistent output predictions. Effusion's model makes it practical to integrate rare experimental data with abundant primary sequence and sequence similarity data.

Homepage: http://www.babbittlab.ucsf.edu/effusion/

Publication: Effusion: Prediction of Protein Function from Sequence Similarity Networks. Bioinformatics. 2018 Aug 01, PMID: 30084920 DOI: 10.1093/bioinformatics/bty672

[effusion]
source_url = "http://www.babbittlab.ucsf.edu/effusion/effusion_{{version}}.tar.gz"
version_available = ["1.0.0"]

Share deep learning based methods

Configuration file: github.toml
Description: "mRNN is an implementation of a Gated Recurrent Unit (GRU) network for classification of transcripts as either coding or noncoding."

[mrnn]
github_url = "https://github.com/hendrixlab/mRNN"

Fix unit test error in windows and MAC

  • Error: initial (@test_install_uilts.R#40)
  • Failure: is.biosoftwares.db.active (@test_active.R#10)
  • Failure: set.makedir (@test_install_uilts.R#60)
  • Failure: set.makedir (@test_install_uilts.R#63)
  • Failure: set.makedir (@test_install_uilts.R#65)

Off-topic

I guess it is better to discuss all off-topic themes in this post.

Forum's guideline !!!

Introduction

The main aim of this forum is to provide a completely free place for sharing bioinformatics tools/scripts and databases. The resources recorded in the posts will be integrated into the BioInstaller configuration files. Please read this post before you submit a new post in this forum.

Allowed posts

  • Any bioinformatics related papers
  • Any bioinformatics related tools/scripts and databases (Both open and restricted)
  • Suggestions for improving this community
  • Bug report

Reference format for sharing papers

Required fields: author names, title, journal, year, project URL, tags

Reference format for uploading new BioInstaller item

The following parts are required for uploading a new BioInstaller item:

  • Configuration file: (github.toml, nongithub.toml, db_annovar.toml, db_main.toml, or new) found here
  • title: Short title
  • description: Summary of the tools/scripts and databases
  • publication: Publication papers information
  • TOML format configuration (see here)

Demo 1:

Configuration file: github.toml
description: GATK variant calling workflow based on snakemake (https://snakemake.readthedocs.io/en/stable/)
publication: Not yet

[snakemake_dna_gatk_flow]
github_url = "https://github.com/snakemake-workflows/dna-seq-gatk-variant-calling"

Demo 2:

Configuration file: nongithub.toml
description: Mirror URL of GATK4

[gatk4_jhuang_mirror]
source_url = "http://bioinfo.rjh.com.cn/download/bioinstaller/gatk/gatk-{{version}}.zip"
version_available = ["4.0.6.0", "4.0.0.0"]
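
The `{{version}}` placeholder in `source_url` pairs with the values in `version_available`. The following is a minimal sketch of that substitution, assuming each listed version is inserted into the URL as-is; `expand_urls` is a hypothetical helper for illustration, not a BioInstaller function, and the entry is written as a plain dictionary instead of being parsed from TOML.

```python
def expand_urls(entry):
    """Return one download URL per listed version by filling in {{version}}."""
    template = entry["source_url"]
    versions = entry.get("version_available", [])
    return [template.replace("{{version}}", version) for version in versions]


# Values copied from Demo 2 above.
gatk4_jhuang_mirror = {
    "source_url": "http://bioinfo.rjh.com.cn/download/bioinstaller/gatk/gatk-{{version}}.zip",
    "version_available": ["4.0.6.0", "4.0.0.0"],
}

for url in expand_urls(gatk4_jhuang_mirror):
    print(url)
# prints:
# http://bioinfo.rjh.com.cn/download/bioinstaller/gatk/gatk-4.0.6.0.zip
# http://bioinfo.rjh.com.cn/download/bioinstaller/gatk/gatk-4.0.0.0.zip
```

Expanding the URLs by hand like this is a quick way to confirm that every listed version resolves to a plausible download link before submitting an item.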

BioInstaller - issue with loading the app in the web server

Hello,

We were trying to load the app on the web server after installing shiny and BioInstaller. We get the URL path for loading the page on the web server, but the page does not load. Please find attached screenshots of the error we are facing.

[Two screenshots of the error were attached to the original post.]

Any help regarding this is appreciated.

Thank you!

Share bioinformatics database

Configuration file: db_main.toml
Description: miRDB is an online database for miRNA target prediction and functional annotations.

[db_mirdb]
source_url = "http://mirdb.org/download/miRDB_v{{version}}_prediction_result.txt.gz"
version_available = ["5.0", "4.0", "3.0", "2.0", "1.0"]
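
A misspelled key in a configuration entry is easy to overlook. The following is a minimal sketch of a key check, assuming the only valid keys are the ones that appear in the configuration examples on this page; the real BioInstaller configuration files may accept more. `KNOWN_KEYS` and `check_keys` are hypothetical names, and the example entry is invented for illustration.

```python
import difflib

# Keys seen in the configuration examples on this page (an assumption, not a full schema).
KNOWN_KEYS = {"source_url", "github_url", "version_available", "install"}


def check_keys(entry):
    """Return (unknown_key, closest_known_key_or_None) pairs for one entry."""
    problems = []
    for key in entry:
        if key not in KNOWN_KEYS:
            close = difflib.get_close_matches(key, sorted(KNOWN_KEYS), n=1)
            problems.append((key, close[0] if close else None))
    return problems


# Invented entry with a deliberately misspelled key.
example = {
    "source_url": "http://example.org/db_{{version}}.txt.gz",
    "versions_available": ["2.0", "1.0"],
}
print(check_keys(example))
# prints: [('versions_available', 'version_available')]
```

An empty list means every key in the entry is one of the known ones; it does not validate the values.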

Building pipeline in BioInstaller

Dear All,

How can we build a pipeline using BioInstaller after installing the tools/software?

Using the tools mentioned below

  1. trimmomatic
  2. hisat2
  3. samtools
  4. featureCounts
  5. DESeq2

I am trying to build a transcriptome pipeline using BioInstaller. There is no manual or procedure for building a pipeline.

Any help regarding this would be appreciated!

Thanks & Regards
Anoopa

Initial Meta Information of Software and Databases

  • Add tags for software and databases in inst/extdata/tag.toml to make software easier to find (BWA/GMAP)
  • Add descriptions of software and databases in inst/extdata/description.toml to provide fundamental information, e.g. publication journal and popularity (BWA/GMAP)

Share variant analysis tools

Configuration file: github.toml
Description: VariantTools, a software tool for the manipulation, annotation, selection, simulation, and analysis of variants in the context of next-gen sequencing analysis.

[varianttools]
github_url = "https://github.com/vatlab/VariantTools"
install = "pip install ."

Share recently published bioinformatics methodology papers

I will integrate the items written in this post to my Bioinformatics-Resources project.

Deep learning based

Blockchain technologies in genomics

  • Ozercan, H.I., et al., Realizing the potential of blockchain technologies in genomics. Genome Res, 2018. doi: 10.1101/gr.207464.116. Tags: genomics, blockchain

Databases

  • LncBook: a curated knowledgebase of human long non-coding RNAs. Nucleic Acids Res 2019, in press.
  • PreMedKB: an integrated precision medicine knowledgebase for interpreting relationships between diseases, genes, variants and drugs. Nucleic Acids Res 2018

Share data analysis workflow

Configuration file: github.toml
Description: "This Snakemake pipeline implements the GATK best-practices workflow"

[snakemake_dna_gatk_flow]
github_url = "https://github.com/snakemake-workflows/dna-seq-gatk-variant-calling"

BioInstaller tool installation on bioshiny

Hello

How can we install tools into bioshiny using BioInstaller? Can we install all the required tools in one go rather than downloading them separately?

[Screenshot attached to the original post.]

Thanks in advance for the help!

Regards
