rdpstaff / xander_assembler


A gene-targeted assembler tool

Shell 61.13% Python 38.87%


xander_assembler's Issues

dmatrix failed with test dataset

When running Xander on the test dataset, I get the error message that dmatrix (from RDPTools) failed:

+ java -Xmx2g -jar /localenv/petersen/conda/xander/share/rdptools-2.0.3-1/Clustering.jar dmatrix -c 0.5 -I derep.fa -i ids -l 25 -o dmatrix.bin
Reading sequences(memratio=0.02610805237469355)...
Using distance model edu.msu.cme.rdp.alignment.pairwise.rna.IdentityDistanceModel
Read 1 Protein sequences (memratio=0.03132966362695213)
Reading ID Mapping from file /data/processing5/petersen/projects/lamprey-genome-assembly/code/Xander_assembler-master/testdata/k45/nifH/cluster/ids
Read mapping for 1 sequences (memratio=0.03132966362695213)
Starting distance computations, predicted max edges=1, at=Tue Mar 30 11:16:31 CEST 2021
Matrix edges computed: 52
Maximum distance: 4.9E-324
Splits: 0
Exception in thread "main" java.lang.IllegalArgumentException: Expected one or more distance files
        at edu.msu.cme.pyro.cluster.dist.MergeDistsJob.run(MergeDistsJob.java:92)
        at edu.msu.cme.pyro.cluster.dist.DistanceCalculator.main(DistanceCalculator.java:314)
        at edu.msu.cme.pyro.cluster.ClusterMain.main(ClusterMain.java:363)
+ echo 'dmatrix failed, continue with test_nifH_45_prot_merged_rmdup.fasta'
dmatrix failed, continue with test_nifH_45_prot_merged_rmdup.fasta

The rest of the pipeline seems to run and produces output for k45 (only). Even so, this failure presumably should not happen. Is it a problem? Does the pipeline miss something because of it?
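The log above reports "Read 1 Protein sequences" just before the exception. One possible reading, which is an assumption and has not been confirmed by the developers, is that a single dereplicated sequence leaves no pairwise distances to merge. A minimal sketch for checking how many records a FASTA file such as derep.fa contains before running dmatrix might look like this; the function names and the in-memory sample sequences are hypothetical.

```python
import io


def count_fasta_records(handle):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in handle if line.startswith(">"))


def has_multiple_sequences(handle, minimum=2):
    """Return True when the file holds at least `minimum` records.

    The threshold of 2 is an assumption: a pairwise distance needs two sequences.
    """
    return count_fasta_records(handle) >= minimum


# Illustrative in-memory inputs; in practice one would open derep.fa instead.
single = io.StringIO(">seq1\nMAMRQCAIYGKGG\n")
pair = io.StringIO(">seq1\nMAMRQCAIYGKGG\n>seq2\nMAMRQCAIYGKGA\n")

print(has_multiple_sequences(single))  # False
print(has_multiple_sequences(pair))    # True
```

If the check returns False for the test dataset, the dmatrix message may simply reflect a one-sequence cluster input and not a broken installation, but only the maintainers can confirm that.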

More information on building gene resources?

Many thanks for Xander, we would very much like to use it.

However, we are finding it difficult to figure out the exact steps for building a new gene resource that we can provide to Xander. The instructions state that we need a set of protein and gene sequences, but I think it would be really useful to have a step-by-step guide on how to do that for an example gene, e.g. mcrA.

I.e. which databases to use, where and how to choose sequences etc.

Any chance of a blog post or other example code?

Thanks
Mick

Directly import Hmm profile for guided assembly

Hello,
I would like to congratulate the developers of this software for making it available.

I have tested it and it worked like a charm.
But is there any provision to directly provide HMM files for the targeted genes in Xander?

That would help a lot.

Regards

HMMER 3.1b2

Dear developer,

I am planning to use Xander for assembling antibiotic resistance genes from metagenomes.
I am using HMMER version 3.1b2. Do I need to use the hmmer-3.0_xanderpatch for creating specialized forward and reverse HMMs?
If yes, will this patch work with my version, or do you have a newer patch?
If not, which commands/options do I need to change in prepare_gene_ref.sh?

Glad for your assistance
Best regards
Vadim

kmer not in bloomfilter warnings in stdlog.txt

Hello,

I have a set of 122M sequences from a microbial metagenome that I am running through Xander. I ran build (k=45, min_count=5), find and search separately. The contig merge and clustering parameters were left at defaults.

When I ran 'search' (using the provided rplB reference), it halted after the 'merge' step because no contigs were merged. ("Read in 625 contigs, wrote out 0 merged contigs in 0.577s"). The stdlog.txt had over 49K "kmer not in bloomfilter" warnings. Only 1251 kmers reported scores and a true/false value. Is this an expected result?

Please advise. Thanks,
Bill
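To put the warning count in proportion, it can help to tally how many lines of stdlog.txt contain the warning text versus all other lines. The following is a minimal sketch; the only string taken from the report is "kmer not in bloomfilter", while the function name and the sample log lines are hypothetical, since the exact log format is not shown in the issue.

```python
import io

WARNING_TEXT = "kmer not in bloomfilter"  # text quoted in the issue


def count_warnings(handle, needle=WARNING_TEXT):
    """Return (lines containing the warning, total lines read)."""
    hits = 0
    total = 0
    for line in handle:
        total += 1
        if needle in line:
            hits += 1
    return hits, total


# Made-up sample lines for illustration only; real stdlog.txt lines may differ.
sample = io.StringIO(
    "Warning: kmer not in bloomfilter: example_kmer_1\n"
    "example_kmer_2 true 12.5\n"
    "Warning: kmer not in bloomfilter: example_kmer_3\n"
)

hits, total = count_warnings(sample)
print(f"{hits} of {total} lines are bloom filter warnings")
```

In practice one would pass `open("stdlog.txt")` instead of the in-memory sample.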

line 5 of run_xander_skel.sh

causes it to fail unconditionally

Is Xander accepting paired end reads?

Dear developer,
I could not find anywhere in the article or documentation whether Xander works with paired-end reads. Do they need to be interleaved in a single file on consecutive lines, or do they need to be in separate forward and reverse files?
In the Xander paper the authors used merged reads, but how should one proceed if the insert size is greater than twice the read length, as is usually the case with Illumina reads?
I would greatly appreciate if you can clarify on this.

Best Regards
Vadim Dubinsky
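The arithmetic behind the merging question can be sketched as follows. This assumes "insert size" means the full fragment length covered by both mates, and the minimum overlap of 10 bases is an arbitrary illustrative threshold, not a value taken from Xander or from any particular read merger.

```python
def expected_overlap(read_length, insert_size):
    """Bases shared by the two mates; a negative value means a gap between them."""
    return 2 * read_length - insert_size


def can_merge(read_length, insert_size, min_overlap=10):
    """True when the mates overlap by at least `min_overlap` bases (assumed threshold)."""
    return expected_overlap(read_length, insert_size) >= min_overlap


print(expected_overlap(150, 250))  # 50
print(can_merge(150, 250))         # True
print(expected_overlap(150, 400))  # -100
print(can_merge(150, 400))         # False
```

With 150-base reads, a 400-base fragment leaves a 100-base gap, so the mates cannot be merged and would have to be supplied to the assembler in some other form.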

allcol option has gone.

Hi All,

I have recently been trying to build a custom reference gene set for Xander. I noticed that the preparation script is picky about the HMMER version. After some struggling, I learned that the --allcol option in hmmalign (called by that script) has been gone since 3.0, the version the manual asks for, which means for more than 9 years. The patch is also not compatible with 3.1 or later versions (line offsets in 3.1 and content changes in 3.2). I therefore assume the preparation pipeline requires some specific pre-release version of 3.0, but the Xander pipeline script calls for 3.1b1 (apparently the version tested by the developers). This inconsistency is confusing.
Currently I have to manually patch HMMER 3.1 or 3.2 (latest) and adapt the prepare script to get rid of the allcol calls. But this appears to be too many changes for a release, and I have no idea how much these changes would affect the results and accuracy. Does anyone have further suggestions on building a custom database?

Best,
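Because the preparation and assembly scripts appear to expect different HMMER versions, it may help to record which version is actually installed. The sketch below extracts a version string from the banner that HMMER programs print with their help text; the sample banner is illustrative, the exact banner layout is an assumption that may vary between releases, and `parse_hmmer_version` is a hypothetical helper name.

```python
import re


def parse_hmmer_version(banner):
    """Return the version token following 'HMMER' in a banner, or None if absent."""
    match = re.search(r"HMMER\s+(\d+\.\d+(?:\.\d+)?[a-z]?\d*)", banner)
    return match.group(1) if match else None


# Illustrative banner text; the real output of `hmmalign -h` may differ.
sample_banner = (
    "# hmmalign :: align sequences to a profile HMM\n"
    "# HMMER 3.1b2 (February 2015); http://hmmer.org/\n"
)

print(parse_hmmer_version(sample_banner))  # 3.1b2
```

The banner text could be captured with `subprocess.run(["hmmalign", "-h"], capture_output=True, text=True).stdout`, assuming hmmalign is on the PATH.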
