ksplotting's Introduction

Plotting pairwise kS distributions.

Synopsis

Whole genome duplications are thought to have been important in the evolution of plant genomes. Using synonymous substitutions as a rough proxy for time since divergence of paralogous sequences provides a method for identifying temporally clustered gene duplications. This project aims to provide an easy pipeline for plotting kS (synonymous substitutions per synonymous site) distributions for any set of CDS.

As an example, this is the kS plot for ORFs predicted from a de novo assembly of the olive transcriptome (data from NCBI, published by Muñoz-Mérida et al.). The plot shows evidence for one or more large-scale gene duplication events.

kS plot of paralogs in the olive transcriptome
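
A kS distribution of this kind is, at its simplest, a histogram of pairwise kS values. The sketch below is a minimal illustration and not the project's own code: the function name `ks_histogram`, the bin width, and the `max_ks` cutoff are all assumptions chosen for the example. Values at or above the cutoff are dropped, which would also exclude very large placeholder values assigned to pairs whose kS could not be calculated.

```python
def ks_histogram(ks_values, bin_width=0.25, max_ks=3.0):
    """Count pairwise kS values in fixed-width bins over [0, max_ks).

    Hypothetical helper for illustration; bin_width and max_ks are
    assumed defaults, not settings taken from kSplotter.
    """
    n_bins = int(round(max_ks / bin_width))
    counts = [0] * n_bins
    for ks in ks_values:
        # Skip negative values and anything at or beyond the cutoff
        # (for example, large placeholder values for failed pairs).
        if ks < 0 or ks >= max_ks:
            continue
        # Guard against floating-point rounding at the upper edge.
        idx = min(int(ks / bin_width), n_bins - 1)
        counts[idx] += 1
    return counts


def print_text_histogram(counts, bin_width=0.25):
    """Print one line per bin with a bar of '#' characters."""
    for i, count in enumerate(counts):
        lower = i * bin_width
        print(f"{lower:5.2f} | {'#' * count}")
```

A bin with noticeably more pairs than its neighbours is the kind of peak that would be read as a candidate burst of duplications, though interpreting real data needs far more care than this sketch suggests.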

ksplotting's People

Contributors

endymioncooper

ksplotting's Issues

kSplotter.py - Codeml taking too long

I am working on a genome paper in which I use the kSplotter script to predict whole genome duplications in a new genome assembly. Following the methods described in Sollars et al., 2016, I ran a reciprocal BLASTP search on the protein files and used the results as input to the kSplotter script. For some of these genomes (e.g. Coffea canephora) this works well, though it appears the options for ggplot2 are no longer current. However, for my genome, as well as Fraxinus excelsior, the script hangs at the 3rd BLAST cluster:

Parsing blast table, this will take a little while...

Building blast clusters, this will take a little while...

Getting KS values, this takes quite a while...

Processed cluster  1  of  5951 clusters
Processed cluster  2  of  5951 clusters

It remains stuck at this part for several days, and does not move on to the next cluster unless I kill the process with Ctrl + C, after which it reports:

*****************************************
****** ERROR CALCULATING kS SCORES ******
 offending pair Fp_g25225 Fp_g34005
    -->  Using large value for kS [=999]
*****************************************

I notice the same behavior for the 5th BLAST cluster, as well as others. Is this expected behavior for the kSplotter script, or is something amiss with my BLAST results? If you need me to provide more information, please let me know.

Thank you in advance.
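
A hang like the one described could in principle be bounded by running the external step with a time limit, so that a stuck pair is skipped automatically rather than after a manual Ctrl + C. The following is a minimal sketch under stated assumptions, not kSplotter's actual code: the function names, the `parse_ks` callback, and the way the external command is invoked are all hypothetical. The fallback value of 999 mirrors the "large value for kS [=999]" reported in the error message above.

```python
import subprocess
import sys

# Mirrors the fallback value printed in the error message quoted above.
KS_SENTINEL = 999.0


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return True only if it exits with status 0 within timeout_s.

    subprocess.run kills the child process when the timeout expires.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def ks_or_sentinel(cmd, timeout_s, parse_ks):
    """Return parse_ks() if cmd finishes in time, else the sentinel value.

    parse_ks is a hypothetical callback that would read the kS value
    from the external program's output files.
    """
    if not run_with_timeout(cmd, timeout_s):
        return KS_SENTINEL
    return parse_ks()


if __name__ == "__main__":
    # Stand-in for a slow external run: a Python child that sleeps.
    slow_cmd = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(ks_or_sentinel(slow_cmd, 0.5, lambda: 0.1))
```

Whether a timeout is appropriate depends on why the run stalls; it only limits the wait and does not explain the underlying cause.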
