
missing fields in outfile (about hecatomb, CLOSED)

shandley commented on August 26, 2024
missing fields in outfile

from hecatomb.

Comments (5)

beardymcjohnface commented on August 26, 2024

You should probably be using the tophits from the secondary output, unless you're talking about the contig annotations as opposed to the seqtable annotations?


mihinduk commented on August 26, 2024

I am talking about contig annotations.

Also, I found that the contigSeqTable.tsv has 14 columns for samples that DON'T have taxonomy, but 19 for those that do:

contigID	seqID	start	stop	len	qual	count	CPM	alnType	taxMethod	kingdom	phylum	class	order	family	genus	species	baltimoreType	baltimoreGroup
contig_1000	169-06-08-13-12_CAGATC:1:140311	11	252	241	17	NA	NA	NA	NA	NA	NA	NA	NA					
contig_1000	120-06-02-24-12_ATCACG:3:171191	214	406	192	0	3	2.989033237	nt	LCA	Viruses	Cressdnaviricota	Arfiviricetes	Cirlivirales	Circoviridae	Circovirus	Circovirus sp.	ssDNA	II
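
The column mismatch described above can be checked, and worked around, with a few lines of Python. This is only a minimal sketch and not part of hecatomb: the function names field_counts and pad_rows are hypothetical, the sample rows are modelled on the two rows quoted in this comment, and padding short rows with NA is an assumed workaround rather than the project's actual fix.

```python
import csv
import io
from collections import Counter

# Header copied from the contigSeqTable.tsv excerpt in this thread (19 columns).
HEADER = (
    "contigID seqID start stop len qual count CPM alnType taxMethod "
    "kingdom phylum class order family genus species "
    "baltimoreType baltimoreGroup"
).split()


def field_counts(handle):
    """Return a Counter mapping number of fields -> number of rows."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return Counter(len(row) for row in reader)


def pad_rows(handle, fill="NA"):
    """Yield every row, padding short rows with `fill` up to the header width."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    yield header
    for row in reader:
        # A negative difference multiplies to an empty list, so long rows pass through.
        yield row + [fill] * (len(header) - len(row))


# Illustrative data: one 14-field row without taxonomy, one 19-field row with it.
short_row = ["contig_1000", "169-06-08-13-12_CAGATC:1:140311",
             "11", "252", "241", "17"] + ["NA"] * 8
full_row = ["contig_1000", "120-06-02-24-12_ATCACG:3:171191",
            "214", "406", "192", "0", "3", "2.989033237", "nt", "LCA",
            "Viruses", "Cressdnaviricota", "Arfiviricetes", "Cirlivirales",
            "Circoviridae", "Circovirus", "Circovirus sp.", "ssDNA", "II"]
SAMPLE = "\n".join("\t".join(r) for r in (HEADER, short_row, full_row)) + "\n"

counts = field_counts(io.StringIO(SAMPLE))
padded = list(pad_rows(io.StringIO(SAMPLE)))

if __name__ == "__main__":
    print(dict(counts))
    print({len(row) for row in padded})
```

To run it against a real file, pass open("contigSeqTable.tsv", newline="") instead of the io.StringIO sample. More than one key in the field_counts result means the table is ragged.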


beardymcjohnface commented on August 26, 2024

Hi Kathy,
That issue with the contigSeqTable is fixed in the dev branch and will be in the next release.

The rule PRIMARY_AA_taxonomy_assignment is part of the read-based annotations, and its only real purpose is to find sequences that look like a virus so that they can be analysed in the secondary search. You should take the annotations from the secondary searches. If you look at the secondary AA mmseqs directory, the file MMSEQS_AA_SECONDARY_tophit_aln_sorted should have all the columns.

The direct contig annotations at the moment are a bit simplistic, but those files should be in ASSEMBLY/CONTIG_DICTIONARY/FLYE/results. It's simplistic because it currently only uses the primary nt database, not the secondary nt database.


mihinduk commented on August 26, 2024


beardymcjohnface commented on August 26, 2024

Should be fixed in the new release.

