ConDo

Contact-based protein Domain boundary prediction method

Prerequisites:

PSIBLAST: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/legacy.NOTSUPPORTED/2.2.26

! Use legacy BLAST (not BLAST+), with a sequence database such as UniRef or NR

HHblits: https://github.com/soedinglab/hh-suite.git

PSIPRED: http://bioinfadmin.cs.ucl.ac.uk/downloads/psipred/

SANN: https://github.com/newtonjoo/sann

or https://lee.kias.re.kr

Jackhmmer: http://hmmer.org/
! Both HMMER and Easel must be installed (Easel is bundled with HMMER)

UniRef90: https://www.uniprot.org/downloads
!Use UniRef90 (recommended)

CCMpred: git clone --recursive https://github.com/soedinglab/CCMpred.git

python2

numpy

Keras with TensorFlow or Theano

gcc or icc
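
Before installing, it can help to check which of the prerequisites are already reachable. The sketch below looks for executables on PATH. The executable names in TOOLS (blastpgp, hhblits, psipred, jackhmmer, ccmpred, gcc) are assumptions about a typical installation and are not taken from the ConDo scripts. Because the ConDo scripts locate tools through their own path variables, a tool reported here as missing may still work once those variables are set.

```python
import os

def find_executable(name, search_path=None):
    """Return the full path of `name` if it is on the search path, else None."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None

def missing_tools(names, search_path=None):
    """Return the names that could not be found on the search path."""
    return [n for n in names if find_executable(n, search_path) is None]

# Assumed executable names; adjust to match the local installation.
TOOLS = ["blastpgp", "hhblits", "psipred", "jackhmmer", "ccmpred", "gcc"]

if __name__ == "__main__":
    for tool in missing_tools(TOOLS):
        print("not found on PATH: " + tool)
```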

Installation:

git clone https://github.com/gicsaw/ConDo

cd ConDo

gcc src/feature.c -o bin/feature -lm -fopenmp -O2

or icc src/feature.c -o bin/feature -qopenmp -O2

Edit the ConDodir variable in bin/ConDo.sh

Edit the hhpath, condodir, database, and jackhmmerbin variables in bin/run_jackhmmer.sh

Edit the blastbin, dbname, psipred, condodir, sann, and NNDB_HOME variables in bin/gen_features.sh

Edit the ccmpredbindir variable in bin/run_ccmpred.sh

Run examples:

Two example targets, 1c7cA and 1sxjH, are provided in the examples directory

! Replace $target with 1c7cA or 1sxjH

cd examples/$target

ConDo.sh $target.fasta $ncpu

The output file is $target.ConDo

The first and second columns of the output file are the residue index and the domain boundary score, respectively

The score cut-off is 1.4

In gnuplot, plot "$target.ConDo" u 1:2 w lp, 1.4
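
As an alternative to plotting, the output file can be scanned for candidate boundaries. This is a minimal sketch, not part of ConDo. It assumes the file is whitespace-delimited with an integer residue index in column 1, and that scores at or above the cut-off mark predicted boundaries (the README gives the cut-off but not its direction). The function names read_scores and boundary_peaks are hypothetical.

```python
def read_scores(path):
    """Read (residue index, score) pairs from a two-column ConDo output file."""
    scores = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            # Skip blank lines, short lines, and comment lines.
            if len(fields) < 2 or fields[0].startswith("#"):
                continue
            scores.append((int(fields[0]), float(fields[1])))
    return scores

def boundary_peaks(scores, cutoff=1.4):
    """Group consecutive entries scoring >= cutoff and
    return the highest-scoring (index, score) of each group."""
    peaks = []
    run = []
    for index, score in scores:
        if score >= cutoff:
            run.append((index, score))
        elif run:
            peaks.append(max(run, key=lambda item: item[1]))
            run = []
    if run:  # close a group that reaches the end of the file
        peaks.append(max(run, key=lambda item: item[1]))
    return peaks
```

For example, boundary_peaks(read_scores("1c7cA.ConDo")) would list one candidate residue per stretch of scores above the cut-off.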

Etc:

bin/feature # generates the input features for machine learning and other output files such as PAS, the contact matrix, and the modularity of the contacts.

Input files are:

$target.fasta ! target sequence

$target.ss2 ! Secondary Structure predicted by PSIPRED

$target.a22 ! Solvent Accessibility predicted by SANN

$target.a3 ! Solvent Accessibility predicted by SANN

$target.ck2 ! sequence profile converted from chk of blast

$target.msa ! multiple sequence alignment converted by Jackhmmer

$target.ccmpred ! contacts predicted by CCMpred
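
Before running bin/feature by hand, it may be useful to confirm that every input file listed above is present. The sketch below assumes all inputs sit in one directory and are named $target plus the suffix shown in the list. The function name missing_inputs is hypothetical.

```python
import os

# Suffixes of the input files listed above.
INPUT_SUFFIXES = [".fasta", ".ss2", ".a22", ".a3", ".ck2", ".msa", ".ccmpred"]

def missing_inputs(target, directory="."):
    """Return the expected input file names that do not exist in `directory`."""
    missing = []
    for suffix in INPUT_SUFFIXES:
        name = target + suffix
        if not os.path.isfile(os.path.join(directory, name)):
            missing.append(name)
    return missing

if __name__ == "__main__":
    for name in missing_inputs("1c7cA"):
        print("missing input: " + name)
```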

Output files are:

$target_feature.txt ! input features for machine learning

$target_PAS3.txt ! PAS information

result_ccm2.txt ! predicted contacts after filtering

community_ccm2.txt ! modularity of the predicted contacts

How to plot:

In gnuplot,

set size square

plot "$target_PAS3.txt" u 1:2:3 w image

plot "result_ccm2.txt" u 1:2:3 w image

plot "community_ccm2.txt" u 1:2 w lp
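
The matrix-style files can also be inspected without gnuplot. This sketch assumes the layout implied by the `u 1:2:3 w image` commands: whitespace-delimited lines with two integer indices followed by a value. The exact file format is not documented here, so treat it as a guess. The function names read_triples and strongest are hypothetical.

```python
def read_triples(path):
    """Read 'i j value' lines into a dict keyed by (i, j)."""
    values = {}
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            # Skip blank lines (gnuplot block separators), short lines, comments.
            if len(fields) < 3 or fields[0].startswith("#"):
                continue
            values[(int(fields[0]), int(fields[1]))] = float(fields[2])
    return values

def strongest(values, n=10):
    """Return the n entries with the largest values as ((i, j), value) pairs."""
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]
```

For example, strongest(read_triples("result_ccm2.txt"), 5) would list the five highest-valued residue pairs.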

References:

Hong, Seung Hwan, Keehyoung Joo, and Jooyoung Lee. "ConDo: Protein domain boundary prediction using coevolutionary information." Bioinformatics (2018).

https://academic.oup.com/bioinformatics/advance-article-abstract/doi/10.1093/bioinformatics/bty973/5221017
