rdpstaff / xander_assembler
A gene-targeted assembler tool
When running Xander on the test dataset, I get the error message that dmatrix (from RDPTools) failed:
+ java -Xmx2g -jar /localenv/petersen/conda/xander/share/rdptools-2.0.3-1/Clustering.jar dmatrix -c 0.5 -I derep.fa -i ids -l 25 -o dmatrix.bin
Reading sequences(memratio=0.02610805237469355)...
Using distance model edu.msu.cme.rdp.alignment.pairwise.rna.IdentityDistanceModel
Read 1 Protein sequences (memratio=0.03132966362695213)
Reading ID Mapping from file /data/processing5/petersen/projects/lamprey-genome-assembly/code/Xander_assembler-master/testdata/k45/nifH/cluster/ids
Read mapping for 1 sequences (memratio=0.03132966362695213)
Starting distance computations, predicted max edges=1, at=Tue Mar 30 11:16:31 CEST 2021
Matrix edges computed: 52
Maximum distance: 4.9E-324
Splits: 0
Exception in thread "main" java.lang.IllegalArgumentException: Expected one or more distance files
at edu.msu.cme.pyro.cluster.dist.MergeDistsJob.run(MergeDistsJob.java:92)
at edu.msu.cme.pyro.cluster.dist.DistanceCalculator.main(DistanceCalculator.java:314)
at edu.msu.cme.pyro.cluster.ClusterMain.main(ClusterMain.java:363)
+ echo 'dmatrix failed, continue with test_nifH_45_prot_merged_rmdup.fasta'
dmatrix failed, continue with test_nifH_45_prot_merged_rmdup.fasta
The rest of the pipeline seems to run and produces output for k45 (only). Still, it seems this failure should not happen. Is it a problem, and does the pipeline miss something because of it?
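One detail visible in the log above is that only one protein sequence was read from derep.fa. Whether that is what triggers the exception is an assumption, not something the log confirms. A minimal Python sketch for checking how many records a FASTA file holds before the distance-matrix step (the function names and the threshold of two sequences are hypothetical, chosen only for illustration):

```python
def count_fasta_records(path):
    """Count FASTA records by counting header lines that start with '>'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def has_multiple_sequences(path, minimum=2):
    """Return True if the file holds at least `minimum` sequences.

    The default of 2 is an assumption: a pairwise distance needs two sequences.
    """
    return count_fasta_records(path) >= minimum
```

Running `has_multiple_sequences("derep.fa")` in the cluster directory would show whether the dereplicated file contains more than the single sequence reported in the log.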
Hi
Many thanks for Xander, we would very much like to use it.
However, we are finding it difficult to figure out the exact steps for building a new gene resource that we can provide to Xander. The instructions state that we need a set of protein and gene sequences, but I think it would be really useful to have a step-by-step guide on how to do that for an example gene, e.g. mcrA.
I.e. which databases to use, where and how to choose sequences etc.
Any chance of a blog post or other example code?
Thanks
Mick
Hello,
I would like to congratulate the developers of this software for making it available.
I have tested it and it worked like a charm.
But is there any provision to directly provide hmm files for the targeted genes in Xander?
That will help a lot
Regards
Dear developer,
I am planning to use Xander for assembling antibiotic resistance genes from metagenomes.
I am using HMMER version 3.1b2. Do I need to use the hmmer-3.0_xanderpatch for creating specialized forward and reverse HMMs?
If yes, will this patch work with my version, or do you have a newer patch?
If not, which commands/options do I need to change in prepare_gene_ref.sh?
Glad for your assistance
Best regards
Vadim
Hello,
I have a set of 122M sequences from a microbial metagenome that I am running through Xander. I ran build (k=45, min_count=5), find and search separately. The contig merge and clustering parameters were left at defaults.
When I ran 'search' (using the provided rplB reference), it halted after the 'merge' step because no contigs were merged ("Read in 625 contigs, wrote out 0 merged contigs in 0.577s"). The stdlog.txt had over 49K "kmer not in bloomfilter" warnings. Only 1251 kmers reported scores and a true/false value. Is this an expected result?
Please advise. Thanks,
Bill
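The counts quoted above can be reproduced from the log with a small script. This is a minimal Python sketch that assumes each warning sits on its own line and contains the quoted phrase verbatim; the exact layout of stdlog.txt is not shown in the post, so the matching is a plain substring test and the function name is hypothetical.

```python
def summarize_log(lines, needle="kmer not in bloomfilter"):
    """Return (warning_count, other_count) for non-empty log lines.

    A line counts as a warning if it contains `needle` anywhere.
    """
    warnings = 0
    other = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if needle in line:
            warnings += 1
        else:
            other += 1
    return warnings, other
```

For example, `with open("stdlog.txt") as fh: print(summarize_log(fh))` prints the number of warning lines next to the number of remaining lines.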
causes it to fail unconditionally
Dear developer,
I could not find anywhere in the article or documentation whether Xander works with paired-end reads. Do they need to be interleaved in a single file on consecutive lines, or provided as separate forward and reverse files?
In the Xander paper the authors used merged reads, but if the insert size is greater than twice the read length, as is usually the case for Illumina reads, how should one proceed?
I would greatly appreciate if you can clarify on this.
Best Regards
Vadim Dubinsky
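The condition in the question comes down to simple arithmetic: two mates of equal length can only be merged if together they are longer than the fragment they came from. A minimal Python sketch, assuming "insert size" means the full fragment length without adapters; the function names and the minimum overlap of 10 bases are hypothetical illustration values, not Xander settings.

```python
def expected_overlap(read_length, insert_size):
    """Bases covered by both mates; zero or negative means no overlap."""
    return 2 * read_length - insert_size


def can_merge(read_length, insert_size, min_overlap=10):
    """True if the mates overlap by at least `min_overlap` bases."""
    return expected_overlap(read_length, insert_size) >= min_overlap
```

With 150-base reads, a 250-base fragment leaves a 50-base overlap, while a 400-base fragment leaves a 100-base gap between the mates, so merging is not possible.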
Hi All,
I have recently been trying to build a custom reference gene set for Xander. I noticed the preparation script is picky about the HMMER version. After some struggling, I learned that the --allcol option in hmmalign (called by that script) has been gone for more than 9 years, since 3.0, the version the manual asks for. The patch is also not compatible with 3.1 or later versions (line offsets in 3.1 and content changes in 3.2). I therefore assume the preparation pipeline requires some specific pre-release version of 3.0, but the Xander pipeline script calls for 3.1b1 (apparently the version tested by the developers). This inconsistency is confusing.
Currently I have to manually patch HMMER 3.1 or 3.2 (the latest) and adapt the prepare script to get rid of the --allcol calls. That seems like too many changes for a release, and I have no idea how much they would affect the results and accuracy. Does anyone have further suggestions for building a custom database?
Best,
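To see whether an installed hmmalign still lists --allcol before running the prepare script, its help text can be inspected. A minimal Python sketch, assuming hmmalign is on the PATH and prints its options with `-h`; both helper names are hypothetical.

```python
import re
import subprocess


def help_mentions_option(help_text, option):
    """True if `option` appears as a whole flag in the help text.

    The guards stop '--allcol' from matching a longer flag such as
    '--allcolumns'.
    """
    pattern = r"(?<![\w-])" + re.escape(option) + r"(?![\w-])"
    return re.search(pattern, help_text) is not None


def hmmalign_help(executable="hmmalign"):
    """Return the help text printed by `hmmalign -h`."""
    result = subprocess.run(
        [executable, "-h"], capture_output=True, text=True, check=False
    )
    return result.stdout + result.stderr
```

Calling `help_mentions_option(hmmalign_help(), "--allcol")` returns False on builds whose help output does not list the option.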