dsprint-pipeline's Introduction

dSPRINT

A machine learning framework predicting interaction sites in human protein domains

This repository provides companion code to

A. Etzion-Fuchs, D. Todd and M. Singh (2020) "dSPRINT: predicting DNA, RNA, ion, peptide and small molecule interaction sites within protein domains", Manuscript under review

This repository works as a computational pipeline that uses Snakemake as its workflow engine.

Given a file input.hmm containing one or more domains, each following the syntax of a Pfam-A entry, the pipeline runs the computational graph of rules shown in the figure:

[Figure: All rules]
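Before running the pipeline, it can help to check which domains input.hmm actually contains. The following is a minimal sketch, not part of dSPRINT: it assumes the file is in the HMMER3 text profile format used by Pfam-A, where each record has a `NAME` line and a `LENG` line and ends with `//`. The function name `list_domains` and the header-only sample records are hypothetical and for illustration only.

```python
import io

# Illustrative, header-only records (assumption); a real Pfam-A entry
# also carries the full model body between the header and the "//" line.
SAMPLE_HMM = """\
HMMER3/f [3.1b2 | February 2015]
NAME  zf-C2H2
LENG  23
//
HMMER3/f [3.1b2 | February 2015]
NAME  Homeobox
LENG  57
//
"""

def list_domains(handle):
    """Return (name, length) for each record in an HMMER3 text profile file."""
    domains, name, length = [], None, None
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "NAME":
            name = fields[1]
        elif fields[0] == "LENG":
            length = int(fields[1])
        elif fields[0] == "//":
            # "//" terminates one domain record.
            domains.append((name, length))
            name, length = None, None
    return domains

domains = list_domains(io.StringIO(SAMPLE_HMM))
print(domains)
```

To inspect a real file, pass `open("input.hmm")` instead of the in-memory sample.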

Output files are written to the output folder. The final result, a per-position ligand binding score, is written to output/binding_scores.csv:

     ligand_type  binding_score       domain   match_state
0    dna          0.9916359186172485  zf-C2H2  1
1    dna          0.9872528910636902  zf-C2H2  10
2    dna          0.997771143913269   zf-C2H2  11
3    dna          0.997983455657959   zf-C2H2  12
4    dna          0.9957016110420227  zf-C2H2  13
5    dna          0.9956439733505249  zf-C2H2  14
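The scores can be post-processed with a few lines of Python. The sketch below ranks positions by binding score for one ligand type. It rests on assumptions: that the file is comma-delimited and that the leading index column has an empty header (the listing above may instead be a printed table). The helper name `top_positions` is hypothetical. The sample rows are copied from the listing above.

```python
import csv
import io

# Assumed on-disk layout of output/binding_scores.csv (comma-delimited,
# unnamed leading index column); values are taken from the sample above.
SAMPLE = """\
,ligand_type,binding_score,domain,match_state
0,dna,0.9916359186172485,zf-C2H2,1
1,dna,0.9872528910636902,zf-C2H2,10
2,dna,0.997771143913269,zf-C2H2,11
3,dna,0.997983455657959,zf-C2H2,12
"""

def top_positions(handle, ligand, n=3):
    """Return the n highest-scoring (domain, match_state, score) rows for one ligand type."""
    rows = [
        (row["domain"], int(row["match_state"]), float(row["binding_score"]))
        for row in csv.DictReader(handle)
        if row["ligand_type"] == ligand
    ]
    # Sort by score, highest first.
    return sorted(rows, key=lambda r: r[2], reverse=True)[:n]

top = top_positions(io.StringIO(SAMPLE), "dna", n=2)
for domain, state, score in top:
    print(f"{domain}\tmatch state {state}\t{score:.4f}")
```

If the real file turns out to be tab-separated, passing `delimiter="\t"` to `csv.DictReader` should be the only change needed.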

Read the Getting Started guide on how to run dSPRINT.

dsprint-pipeline's People

Contributors

anatef, vineetbansal

dsprint-pipeline's Issues

Ambiguous rule with latest snakemake

With Snakemake version 5.27.4, a dry run of the data-download step fails with the following error:

snakemake --cores 1 --use-conda --until download_exac download_exac_coverage download_hg19_2bit download_uniprot_fasta download_uniprot_idmapping download_phastCons download_phyloP download_pertinint_mafs install_pertinint install_hmmer2 install_hmmer3 install_tabix install_twoBitToFa --dryrun

AmbiguousRuleException:
Rules pertinint_compute_jsd and pertinint_verify_exons are ambiguous for the file /scratch/gpfs/vineetb/dsprint2/pertinint/ensembl/Homo_sapiens.GRCh37/exons/1.jsd.txt.
Consider starting rule output with a unique prefix, constrain your wildcards, or use the ruleorder directive.
Wildcards:
        pertinint_compute_jsd: chromosome=1
        pertinint_verify_exons: chromosome=1.jsd.txt
Expected input files:
        pertinint_compute_jsd: pertinint-internal/config.py /scratch/gpfs/vineetb/dsprint2/pertinint/ensembl/Homo_sapiens.GRCh37/exons/1 /scratch/gpfs/vineetb/dsprint2/pertinint/ucscgb/hg19alignment/mafs/chr1.maf.gz
        pertinint_verify_exons: pertinint-internal/config.py /scratch/gpfs/vineetb/dsprint2/pertinint/ensembl/Homo_sapiens.GRCh37/Homo_sapiens.GRCh37.pep.all.withgenelocs.fa.gz /scratch/gpfs/vineetb/dsprint2/pertinint/ensembl/Homo_sapiens.GRCh37/dna_sm
Expected output files:
        pertinint_compute_jsd: /scratch/gpfs/vineetb/dsprint2/pertinint/ensembl/Homo_sapiens.GRCh37/exons/1.jsd.txt
        pertinint_verify_exons: /scratch/gpfs/vineetb/dsprint2/pertinint/ensembl/Homo_sapiens.GRCh37/exons/1.jsd.txt

Until this is resolved, Snakemake is pinned to version 5.19.2 in requirements.txt.
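The ambiguity in the error above can be reproduced outside Snakemake. An unconstrained wildcard matches the regular expression `.+`, so two output patterns can both claim the same file. The sketch below is only an illustration: the two output patterns are inferred from the wildcard values in the error message, not read from the Snakefile, and the constraint `[^./]+` is a hypothetical choice, not the maintainers' fix. The helper `pattern_to_regex` is likewise hypothetical and only approximates Snakemake's matching.

```python
import re

def pattern_to_regex(pattern, constraints=None):
    """Turn a '{name}' output pattern into a compiled regex.

    Unconstrained wildcards match '.+', mirroring Snakemake's default.
    """
    constraints = constraints or {}
    # re.split with a capturing group alternates literal text and wildcard names.
    parts = re.split(r"\{(\w+)\}", pattern)
    pieces = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pieces.append(re.escape(part))
        else:
            pieces.append(f"(?P<{part}>{constraints.get(part, '.+')})")
    return re.compile("".join(pieces))

# Output patterns inferred from the error message (assumption).
RULES = {
    "pertinint_compute_jsd": "exons/{chromosome}.jsd.txt",
    "pertinint_verify_exons": "exons/{chromosome}",
}

def matching_rules(target, constraints=None):
    """Return {rule: wildcard values} for every rule whose output matches target."""
    found = {}
    for rule, pattern in RULES.items():
        m = pattern_to_regex(pattern, constraints).fullmatch(target)
        if m:
            found[rule] = m.groupdict()
    return found

unconstrained = matching_rules("exons/1.jsd.txt")
constrained = matching_rules("exons/1.jsd.txt", {"chromosome": r"[^./]+"})
print(unconstrained)  # both rules match, hence the ambiguity
print(constrained)    # only pertinint_compute_jsd matches
```

In a Snakefile, the corresponding remedies are the ones the error message names: a `wildcard_constraints` entry for `chromosome`, or a `ruleorder` directive that ranks one rule above the other.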
