
Comments (4)

heche-psb commented on July 19, 2024

Hi, regarding "the conserved isoforms of two species", there are two ways to get them. The first takes two steps: 1) identify the isoforms per species, then 2) compare the obtained isoforms between the two species. The second is to jointly identify the isoforms per species and the conserved isoforms between the two species, based on the sequence clustering result and the similarity matrix. wgd v2 is not specifically designed for this purpose, but you could try calculating the gene length-normalized similarity scores first and then writing some custom scripts to retrieve the conserved isoforms.
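As a rough idea of what step 2 of the two-step route could look like, here is a minimal sketch of such a custom script. It is not part of wgd: the function name, the input format (triples of isoform ID from species A, isoform ID from species B, normalized similarity score), the `min_score` cut-off, and the use of reciprocal best hits as the definition of "conserved" are all assumptions made for illustration.

```python
def conserved_isoform_pairs(scores, min_score=0.8):
    """Return reciprocal-best isoform pairs between two species.

    scores: iterable of (isoform_a, isoform_b, score) triples, where
    isoform_a is from species A and isoform_b from species B
    (hypothetical input format, prepared by the user beforehand).
    min_score: hypothetical cut-off below which hits are ignored.
    """
    best_for_a = {}  # isoform in A -> (best partner in B, score)
    best_for_b = {}  # isoform in B -> (best partner in A, score)
    for a, b, s in scores:
        if s < min_score:
            continue
        if a not in best_for_a or s > best_for_a[a][1]:
            best_for_a[a] = (b, s)
        if b not in best_for_b or s > best_for_b[b][1]:
            best_for_b[b] = (a, s)

    # Keep a pair only if each isoform is the other's best hit.
    pairs = []
    for a, (b, s) in best_for_a.items():
        if best_for_b[b][0] == a:
            pairs.append((a, b, s))
    return sorted(pairs)


# Toy example with made-up IDs and scores.
example_scores = [
    ("A1.1", "B1.1", 0.97),
    ("A1.1", "B1.2", 0.85),
    ("A1.2", "B1.2", 0.93),
    ("A1.2", "B1.1", 0.82),
    ("A2.1", "B1.1", 0.50),
]
example_pairs = conserved_isoform_pairs(example_scores)
```

Reciprocal best hits are only one possible criterion; a stricter or looser rule may fit your data better, and the cut-off would need tuning.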

from wgd.

heche-psb commented on July 19, 2024

Hi, homologous isoforms here refer to different transcripts of the same gene that originate from alternative splicing. They are expected to be highly similar in sequence, which typically shows up as the bar at Ks ≈ 0 in a conventional Ks plot. For transcriptome assemblies I usually use CD-HIT to drop isoforms before building the Ks distribution; you can also try different cut-offs there to identify isoforms. Note that this question is about data preparation rather than the wgd v2 software itself. Another way to identify isoforms is to use the clustering results from wgd dmd together with the resulting diamond hit table and apply a filter similar to CD-HIT's. For instance, within each cluster (i.e., the deduced gene family) you can filter out transcripts whose normalized similarity score to another member is higher than 0.95, retaining only the longest one. This does the same job as CD-HIT while using similarity scores that are less biased by gene length.
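The within-cluster filter described above could be sketched roughly as follows. This is only an illustration and not wgd code: the function name and the three inputs (cluster membership, transcript lengths, and already-normalized pairwise similarity scores parsed from the hit table) are assumed to be prepared by your own script, and the greedy longest-first pass is a CD-HIT-like simplification.

```python
def drop_redundant_isoforms(clusters, lengths, similarities, threshold=0.95):
    """Keep the longest transcript among near-identical cluster members.

    clusters: dict mapping family ID -> list of transcript IDs (hypothetical format).
    lengths: dict mapping transcript ID -> sequence length.
    similarities: iterable of (id1, id2, normalized_score) triples.
    threshold: pairs scoring above this are treated as isoforms of one gene.
    """
    # Store scores symmetrically, keeping the higher value if both directions exist.
    sim = {}
    for a, b, score in similarities:
        key = frozenset((a, b))
        sim[key] = max(score, sim.get(key, 0.0))

    kept = {}
    for family, members in clusters.items():
        representatives = []
        # Visit the longest transcripts first (ties broken by ID) so that the
        # longest member of each near-identical group becomes its representative.
        for tid in sorted(members, key=lambda t: (-lengths[t], t)):
            redundant = any(
                sim.get(frozenset((tid, rep)), 0.0) > threshold
                for rep in representatives
            )
            if not redundant:
                representatives.append(tid)
        kept[family] = representatives
    return kept


# Toy example with made-up IDs, lengths and scores.
example_clusters = {"fam1": ["t1", "t2", "t3"]}
example_lengths = {"t1": 900, "t2": 1200, "t3": 800}
example_sims = [("t1", "t2", 0.98), ("t1", "t3", 0.60), ("t2", "t3", 0.55)]
example_kept = drop_redundant_isoforms(example_clusters, example_lengths, example_sims)
```

In the toy data, `t1` is dropped because it scores above 0.95 against the longer `t2`, while `t3` stays as a distinct family member. Trying a few thresholds, as with CD-HIT cut-offs, is probably worthwhile.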


Tang-pro commented on July 19, 2024

Hi, @heche-psb
Here I want to identify the conserved isoforms of two species. If CD-HIT clustering is used, the differences between isoforms are lost.
So I want to use software specifically designed to detect alternative splicing to extract the different isoform sequences of each gene. I have a question here: doesn't wgd itself also work by aligning gene sequences? If I run it on these isoform sequences, is this approach feasible?
Comparing isoforms within a species is difficult, but what about between species? Is it feasible to compare isoforms between two species?
Thank you!


Tang-pro commented on July 19, 2024

Hi, @heche-psb

Thank you so much for taking the time to reply, it means a lot to me.

