
Comments (5)

horaciobam commented on August 19, 2024

This problem is very prevalent in the Retroviridae family:
image

from ovrf-viz.

horaciobam commented on August 19, 2024

Examples:

  1. Cluster 2: In HIV-1 genome NC_001802, Gag-Pol (335 to 1637) and Pr55(Gag) (335 to 1838) overlap. The same thing happens in Mouse mammary tumor virus NC_001503, where Pr160 (gene = gag-pol-pro) and Pr77 (gene = gag) overlap.

  2. Cluster 1: Snakehead retrovirus NC_001724 is an interesting case where many of the proteins are hypothetical. For example, the hypothetical protein from 9345 to 9535 overlaps with the hypothetical protein from 9345 to 9357. The genome contains 17 hypothetical proteins, several of which share similar ORFs.

  3. Cluster 3: In Moloney murine sarcoma virus NC_001502, the pol polyprotein fragment coordinates are: 2485..2928; 2927..3388. We are treating every segment of a protein as an individual unit; therefore this protein seems to overlap with itself (this is probably a ribosomal frameshift).

  4. Cluster 5: In Equine infectious anemia virus NC_001450 there is only one hypothetical protein, with coordinates 7295..7951, that apparently has a self edge. I don't understand why this one is being created.
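The overlaps in the examples above can be checked directly from the coordinates. This is only a minimal sketch, not the ovrf-viz code: `overlap_length` is a hypothetical helper, and it assumes the coordinates are 1-based and inclusive, as in GenBank annotations.

```python
def overlap_length(a, b):
    """Number of positions shared by two inclusive (start, end) ranges.

    Assumes 1-based, inclusive coordinates; returns 0 when the ranges are disjoint.
    """
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


# HIV-1 NC_001802: Gag-Pol vs Pr55(Gag)
print(overlap_length((335, 1637), (335, 1838)))    # 1303

# Snakehead retrovirus NC_001724: two hypothetical proteins
print(overlap_length((9345, 9535), (9345, 9357)))  # 13

# Moloney murine sarcoma virus NC_001502: two segments of the same pol fragment
print(overlap_length((2485, 2928), (2927, 3388)))  # 2
```

The last case shows why the pol fragment appears to overlap with itself: its two segments share two positions when each segment is treated as a separate unit.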

Overall, the self-edge count for overlapping genes in the Retroviridae family is the following:

Cluster: 1, Number of self edges: 29
Cluster: 2, Number of self edges: 65
Cluster: 3, Number of self edges: 1
Cluster: 4, Number of self edges: 34
Cluster: 5, Number of self edges: 1
Cluster: 6, Number of self edges: 16

From exploring these cases, it seems that self edges form when two overlapping proteins of the same genome are very similar to each other and are therefore assigned to the same cluster. However, we still need to fix the current implementation, where two segments of the same protein are interpreted as individual units.
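One possible way to approach that fix is to keep all segments of a protein together in one record and only compare distinct proteins. The sketch below is an assumption about how this could look, not the current ovrf-viz implementation: the record layout (`genome`, `name`, `cluster`, `segments`) and the function names are hypothetical, and the cluster labels in the example data are illustrative.

```python
from collections import Counter
from itertools import combinations


def ranges_overlap(a, b):
    """True if two inclusive (start, end) ranges share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def count_self_edges(proteins):
    """Count overlapping protein pairs from the same genome and the same cluster.

    Each protein is one record holding all of its segments, so segments of a
    single protein are never compared with each other.
    """
    counts = Counter()
    for p, q in combinations(proteins, 2):
        if p["genome"] != q["genome"] or p["cluster"] != q["cluster"]:
            continue
        if any(ranges_overlap(s, t) for s in p["segments"] for t in q["segments"]):
            counts[p["cluster"]] += 1
    return counts


# Illustrative records built from the coordinates quoted in this thread.
example = [
    {"genome": "NC_001802", "name": "Gag-Pol", "cluster": 2,
     "segments": [(335, 1637)]},
    {"genome": "NC_001802", "name": "Pr55(Gag)", "cluster": 2,
     "segments": [(335, 1838)]},
    {"genome": "NC_001502", "name": "pol polyprotein fragment", "cluster": 3,
     "segments": [(2485, 2928), (2927, 3388)]},
]

print(count_self_edges(example))  # Counter({2: 1})
```

With this layout the two-segment pol fragment no longer produces a self edge, while the Gag-Pol / Pr55(Gag) pair is still counted once.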


horaciobam commented on August 19, 2024

Word clouds for the Retroviridae family:
image


horaciobam commented on August 19, 2024

Post cluster result


horaciobam commented on August 19, 2024

Cluster plot:
image
Genome plot:
image

