xander_assembler's People

Contributors

chaibenl, wangqion

xander_assembler's Issues

dmatrix failed with test dataset

When running Xander on the test dataset, I get the error message that dmatrix (from RDPTools) failed:

+ java -Xmx2g -jar /localenv/petersen/conda/xander/share/rdptools-2.0.3-1/Clustering.jar dmatrix -c 0.5 -I derep.fa -i ids -l 25 -o dmatrix.bin
Reading sequences(memratio=0.02610805237469355)...
Using distance model edu.msu.cme.rdp.alignment.pairwise.rna.IdentityDistanceModel
Read 1 Protein sequences (memratio=0.03132966362695213)
Reading ID Mapping from file /data/processing5/petersen/projects/lamprey-genome-assembly/code/Xander_assembler-master/testdata/k45/nifH/cluster/ids
Read mapping for 1 sequences (memratio=0.03132966362695213)
Starting distance computations, predicted max edges=1, at=Tue Mar 30 11:16:31 CEST 2021
Matrix edges computed: 52
Maximum distance: 4.9E-324
Splits: 0
Exception in thread "main" java.lang.IllegalArgumentException: Expected one or more distance files
        at edu.msu.cme.pyro.cluster.dist.MergeDistsJob.run(MergeDistsJob.java:92)
        at edu.msu.cme.pyro.cluster.dist.DistanceCalculator.main(DistanceCalculator.java:314)
        at edu.msu.cme.pyro.cluster.ClusterMain.main(ClusterMain.java:363)
+ echo 'dmatrix failed, continue with test_nifH_45_prot_merged_rmdup.fasta'
dmatrix failed, continue with test_nifH_45_prot_merged_rmdup.fasta

The rest of the pipeline seems to run and produces output for k45 (only). Even so, this failure presumably should not happen. Is it a problem? Does the pipeline miss something because of it?
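
The log above reports "Read 1 Protein sequences", so derep.fa apparently contained a single dereplicated sequence. Whether that is what triggers the "Expected one or more distance files" exception is an assumption; the log does not confirm it. A minimal Python sketch for checking the record count before the dmatrix step follows. The function names and the two-sequence threshold are illustrative and are not part of Xander or RDPTools.

```python
from typing import Iterable


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


def enough_for_distance_matrix(lines: Iterable[str], minimum: int = 2) -> bool:
    """Return True when the input holds at least `minimum` sequences.

    Assumption (not confirmed by the source): a pairwise distance matrix
    needs at least two sequences to produce a distance file.
    """
    return count_fasta_records(lines) >= minimum


# Hypothetical usage against the file passed with -I in the log:
#     with open("derep.fa") as handle:
#         print(enough_for_distance_matrix(handle))
```

If the check returns False for the test data, the "dmatrix failed, continue with ..." message may simply reflect an input too small to cluster, but that reading would need confirmation from the developers.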

More information on building gene resources?

Hi

Many thanks for Xander, we would very much like to use it.

However, we are finding it difficult to figure out the exact steps for building a new gene resource that we can provide to Xander. The instructions state that we need a set of protein and gene sequences, but I think it would be really useful to have a step-by-step guide on how to do that for an example gene, e.g. mcrA.

I.e. which databases to use, where and how to choose sequences etc.

Any chance of a blog post or other example code?

Thanks
Mick

Directly import Hmm profile for guided assembly

Hello,
I would like to congratulate the developers of this software for making it available.

I have tested it and it worked like a charm.
But is there any provision to directly provide HMM files for the targeted genes in Xander?

That would help a lot.

Regards

HMMER 3.1b2

Dear developer,

I am planning to use Xander for assembling antibiotic resistance genes from metagenomes.
I am using HMMER version 3.1b2. Do I need to use the hmmer-3.0_xanderpatch for creating specialized forward and reverse HMMs?
If yes, will this patch work with my version, or do you have a newer patch?
If not, which commands/options do I need to change in prepare_gene_ref.sh?

Glad for your assistance
Best regards
Vadim
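
Since the question hinges on which HMMER release is installed, a small Python sketch for reading the version from a tool's help banner may be useful. It assumes the banner contains a line of the form "# HMMER 3.1b2 (February 2015); http://hmmer.org/"; the function names are hypothetical and nothing here comes from prepare_gene_ref.sh.

```python
import re
import subprocess
from typing import Optional

# Matches e.g. "3.0", "3.1b2" or "3.3.2" after the word HMMER.
_VERSION_RE = re.compile(r"HMMER\s+(\d+\.\d+[0-9a-z.]*)")


def parse_hmmer_version(banner: str) -> Optional[str]:
    """Extract the version string from HMMER help output, or None."""
    match = _VERSION_RE.search(banner)
    return match.group(1) if match else None


def installed_hmmer_version(tool: str = "hmmalign") -> Optional[str]:
    """Run `<tool> -h` and parse the banner; None if the tool is missing."""
    try:
        result = subprocess.run([tool, "-h"], capture_output=True, text=True)
    except FileNotFoundError:
        return None
    return parse_hmmer_version(result.stdout)
```

Comparing the returned string against the version a patch was written for (3.0 here) makes a mismatch visible before any HMMs are built.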

kmer not in bloomfilter warnings in stdlog.txt

Hello,

I have a set of 122M sequences from a microbial metagenome that I am running through Xander. I ran build (k=45, min_count=5), find and search separately. The contig merge and clustering parameters were left at defaults.

When I ran 'search' (using the provided rplB reference), it halted after the 'merge' step because no contigs were merged. ("Read in 625 contigs, wrote out 0 merged contigs in 0.577s"). The stdlog.txt had over 49K "kmer not in bloomfilter" warnings. Only 1251 kmers reported scores and a true/false value. Is this an expected result?

Please advise. Thanks,
Bill
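
To put the 49K warnings in proportion to the rest of the log, a rough Python tally such as the one below could help. It only assumes that each warning line contains the phrase "kmer not in bloomfilter" quoted above; the actual layout of stdlog.txt is not shown in this thread, so everything else is treated as "other".

```python
from collections import Counter
from typing import Iterable

# Phrase quoted in the post; the rest of the line format is unknown.
WARNING_TEXT = "kmer not in bloomfilter"


def summarize_log(lines: Iterable[str]) -> Counter:
    """Count warning lines versus all other non-blank lines."""
    counts: Counter = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if WARNING_TEXT in line.lower():
            counts["warnings"] += 1
        else:
            counts["other"] += 1
    return counts


def warning_fraction(counts: Counter) -> float:
    """Fraction of non-blank lines that are warnings (0.0 for an empty log)."""
    total = counts["warnings"] + counts["other"]
    return counts["warnings"] / total if total else 0.0


# Hypothetical usage:
#     with open("stdlog.txt") as handle:
#         print(warning_fraction(summarize_log(handle)))
```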

Is Xander accepting paired end reads?

Dear developer,
I could not find anywhere in the article or documentation whether Xander works with paired-end reads. Do they need to be interleaved in a single file, with the two mates on consecutive lines? Or do they need to be in separate forward and reverse files?
In the Xander paper the authors used merged reads, but if the insert size is greater than twice the read length, as is usually the case with Illumina reads, how should one proceed?
I would greatly appreciate it if you could clarify this.

Best Regards
Vadim Dubinsky
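
Whether Xander expects interleaved input is the open question here, and the sketch below does not answer it. It only illustrates what "interleaved" means in practice: forward and reverse records alternate in one stream. It is a minimal Python example that assumes plain four-line FASTQ records and equal record counts in both files; the function names are hypothetical.

```python
from itertools import islice, zip_longest
from typing import Iterable, Iterator, List


def fastq_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield FASTQ records as lists of four lines (assumes 4-line records)."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def interleave(forward: Iterable[str], reverse: Iterable[str]) -> Iterator[str]:
    """Alternate forward and reverse records: R1 of pair 1, R2 of pair 1, ..."""
    for r1, r2 in zip_longest(fastq_records(forward), fastq_records(reverse)):
        if r1 is None or r2 is None:
            raise ValueError("forward and reverse files differ in record count")
        yield from r1
        yield from r2


# Hypothetical usage:
#     with open("R1.fastq") as f, open("R2.fastq") as r, open("out.fastq", "w") as o:
#         o.writelines(interleave(f, r))
```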

allcol option has gone.

Hi All,

I have recently been trying to build a custom reference gene set for Xander. I noticed that the preparation script is picky about the HMMER version. After some struggling, I learned that the --allcol option in hmmalign (called by that script) has been gone for more than 9 years, since 3.0, which is the version the manual asks for. The patch is also not compatible with 3.1 or later versions (line offsets in 3.1 and content changes in 3.2). I therefore assume the preparation pipeline requires some specific pre-release version of 3.0, but the Xander pipeline script calls for 3.1b1 (apparently the version tested by the developers). This inconsistency is confusing.
Currently I have to manually patch HMMER 3.1 or 3.2 (the latest) and adapt the prepare script to get rid of the allcol calls. But that appears to be too many changes for a release, and I have no idea how much these changes would affect the results and accuracy. Does anyone have further suggestions on building a custom database?

Best,
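
One way to detect this mismatch up front is to test whether the installed hmmalign still advertises the option before the prepare script calls it. The Python sketch below scans help text for an exact option name. It assumes the help output lists options literally (for example as "--allcol") and that the caller captures that text, e.g. from `hmmalign -h`. The function name is hypothetical.

```python
import re


def has_option(help_text: str, option: str) -> bool:
    """Return True if `option` appears in help text as a whole token.

    The lookarounds stop "--allcol" from matching inside a longer
    name such as "--allcolumns" or "---allcol".
    """
    pattern = r"(?<![\w-])" + re.escape(option) + r"(?![\w-])"
    return re.search(pattern, help_text) is not None


# Hypothetical usage with help text captured elsewhere:
#     if not has_option(help_text, "--allcol"):
#         raise SystemExit("this hmmalign build does not offer --allcol")
```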
