Comments (5)
You should probably be using the tophits from the secondary output, unless you're talking about the contig annotations as opposed to the seqtable annotations?
from hecatomb.
I am talking about contig annotations.
Also, I found that the contigSeqTable.tsv has 14 columns for samples that DON'T have taxonomy, but 19 for those that do:
contigID seqID start stop len qual count CPM alnType taxMethod kingdom phylum class order family genus species baltimoreType baltimoreGroup
contig_1000 169-06-08-13-12_CAGATC:1:140311 11 252 241 17 NA NA NA NA NA NA NA NA
contig_1000 120-06-02-24-12_ATCACG:3:171191 214 406 192 0 3 2.989033237 nt LCA Viruses Cressdnaviricota Arfiviricetes Cirlivirales Circoviridae Circovirus Circovirus sp. ssDNA II
Hi Kathy,
That issue with the contigSeqTable is fixed in the dev branch and will be in the next release.
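For anyone still on a release with the ragged rows, a minimal workaround sketch in Python is below. It pads short rows with `NA` so every row has as many fields as the header. The function name `pad_rows` is hypothetical and not part of hecatomb, and the sketch assumes the file is tab-delimited and that the missing fields are the trailing ones, which the thread does not confirm.

```python
import csv
import io

def pad_rows(lines, fill="NA", delimiter="\t"):
    """Yield rows, right-padding short ones with `fill` to the header width.

    Assumption: the missing fields are the trailing columns.
    """
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    header = next(reader)
    yield header
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) < len(header):
            row = row + [fill] * (len(header) - len(row))
        yield row

# Header and the two example rows quoted earlier in this thread.
header = ["contigID", "seqID", "start", "stop", "len", "qual", "count", "CPM",
          "alnType", "taxMethod", "kingdom", "phylum", "class", "order",
          "family", "genus", "species", "baltimoreType", "baltimoreGroup"]
short_row = ["contig_1000", "169-06-08-13-12_CAGATC:1:140311", "11", "252",
             "241", "17"] + ["NA"] * 8
full_row = ["contig_1000", "120-06-02-24-12_ATCACG:3:171191", "214", "406",
            "192", "0", "3", "2.989033237", "nt", "LCA", "Viruses",
            "Cressdnaviricota", "Arfiviricetes", "Cirlivirales",
            "Circoviridae", "Circovirus", "Circovirus sp.", "ssDNA", "II"]
sample = "\n".join("\t".join(r) for r in (header, short_row, full_row)) + "\n"

rows = list(pad_rows(io.StringIO(sample)))
```

To use it on a real file, pass an open file handle instead of the `io.StringIO` sample and write the padded rows back out with `csv.writer`.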
The rule PRIMARY_AA_taxonomy_assignment is part of the read-based annotations, and its only real purpose is to find sequences that look like a virus so that they can be analysed in the secondary search. You should take the annotations from the secondary searches. If you look at the secondary AA mmseqs directory, the file MMSEQS_AA_SECONDARY_tophit_aln_sorted should have all the columns.
The direct contig annotations are a bit simplistic at the moment, but those files should be in ASSEMBLY/CONTIG_DICTIONARY/FLYE/results. They are simplistic because the annotation currently uses only the primary nt database, not the secondary nt database.
This should be fixed in the new release.