Refound gene tags (panaroo issue, closed, 4 comments)

gtonkinhill commented on September 8, 2024
Refound gene tags

from panaroo.

Comments (4)

gtonkinhill commented on September 8, 2024

Hi,

This is a little tricky because the refound genes have no original locus tag. At the moment we output their protein and DNA sequences to gene_data.csv, but not the location information. The final number in a refound gene name is just the order in which panaroo found it, so it is not particularly helpful.

As the sequence in gene_data.csv should exactly match the one in the original GFF (or its reverse complement), you could use it to search for the gene's location. I will have a think about how best to retain this information.
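A minimal sketch of that search, in plain Python, might look like the following. It is not part of panaroo. It assumes the genome sequence has already been extracted from the GFF into FASTA text, and that gene_data.csv has columns named `clustering_id` and `dna_sequence` (check the header of your own file). The helper names `reverse_complement`, `read_fasta` and `locate` are hypothetical, and the sequences in the demo are toy data.

```python
import csv
import io

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(handle):
    """Parse FASTA text from a file-like object into {scaffold_id: sequence}."""
    scaffolds, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                scaffolds[name] = "".join(chunks)
            # Keep only the first word of the header as the scaffold id.
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        scaffolds[name] = "".join(chunks)
    return scaffolds


def locate(gene_seq, scaffolds):
    """Find exact matches of gene_seq on either strand.

    Returns a list of (scaffold_id, start, end, strand) tuples with
    1-based, inclusive coordinates on the forward strand.
    """
    gene = gene_seq.upper()
    if not gene:
        return []
    hits = []
    for strand, query in (("+", gene), ("-", reverse_complement(gene))):
        for name, seq in scaffolds.items():
            seq = seq.upper()
            pos = seq.find(query)
            while pos != -1:
                hits.append((name, pos + 1, pos + len(query), strand))
                pos = seq.find(query, pos + 1)
    return hits


# Toy data standing in for one isolate's genome and a gene_data.csv excerpt.
genome = read_fasta(io.StringIO(
    ">chromosome\nAAACCCGGGTTTACGT\n>plasmid\nTTTTGATTACAGGGG\n"
))
gene_rows = csv.DictReader(io.StringIO(
    "clustering_id,dna_sequence\n40_refound_2166,TGTAATC\n"
))
for row in gene_rows:
    print(row["clustering_id"], locate(row["dna_sequence"], genome))
```

Exact string matching is enough here only because the sequence is expected to match the genome exactly. A sequence that is its own reverse complement would be reported once per strand, and anything with mismatches would need a real aligner instead.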

Its location in relation to other genes is available in the 'final_graph.gml' file, which can be viewed in Cytoscape.

Hopefully this helps a bit.

aysunrhn commented on September 8, 2024

Hi, thank you for the quick response. I understand now why it is difficult, and yes, it is possible to see where the refound genes are on the graph. Still, I think in some applications it could be very useful to keep track of the locations, that is, to know where a gene is located within the isolate in the context of the whole genome. So I guess I'll leave this as a feature request.

> As the sequence in gene_data.csv should exactly match that in the original GFF (or its reverse complement) you could use this to search for its location. I will have a think about how best to retain this information.

Yes, this works. I can track down where the gene is by just aligning it against the whole genome sequence. For instance, the one in my question (40_refound_2166) is actually on the plasmid. This could be interesting, so I'll see if I can interpret what it means. Thanks!

mgalardini commented on September 8, 2024

Hi, I am also interested in knowing whether it would be possible to add the gene locations to the gene_data.csv file. It could be useful for some applications to have this info for all genes, not just the refound ones. Would it be OK if I worked on a PR to add this feature? I think I understand where the changes should go, but if you have strong opinions about such a feature, please let me know :)

mgalardini commented on September 8, 2024

OK, I managed to add the location information for the original genes. For the refound ones I was only able to get the scaffold ID, because the location info I could find seems to cover a larger region than the actual nucleotide sequence reported in the file. Is that the reason why it's not easy to report this information?

Here are the changes I've made so far, in case they are useful: mgalardini@d487c42
