Comments (8)
+1 I am curious about this myself as many of our 2x150 reads can be merged before assembly.
from megahit.
Is k-max 127 long enough in this case? Is there a "rule" to determine k-max?
Honestly, I am not sure. Basically, I would not recommend setting k larger than 100 unless the sequencing depth is very high (some single-cell data can be > 1000x). Roughly speaking, the error rate of HiSeq is about 1%; when k goes as large as 100, you may rarely see any correct k-mers from reads.
However, I don't have much experience with MiSeq. So I leave this issue open here for more discussion.
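The depth argument above can be put into rough numbers. The following is a minimal back-of-the-envelope sketch, assuming independent, uniform per-base errors (real HiSeq/MiSeq error profiles are not uniform) and the usual k-mer depth approximation depth * (L - k + 1) / L. The function names are made up for illustration and are not part of megahit.

```python
def error_free_kmer_fraction(k: int, error_rate: float) -> float:
    """Probability that all k bases of a k-mer are correct.

    Assumption: errors are independent and uniform along the read.
    """
    return (1.0 - error_rate) ** k


def expected_correct_kmer_depth(depth: float, read_len: int, k: int,
                                error_rate: float) -> float:
    """Expected depth of error-free k-mers at a genomic position.

    A read of length L contributes L - k + 1 k-mers, so base depth is
    scaled by (L - k + 1) / L and then by the error-free fraction.
    """
    if k > read_len:
        return 0.0  # reads shorter than k contribute no k-mers
    kmer_depth = depth * (read_len - k + 1) / read_len
    return kmer_depth * error_free_kmer_fraction(k, error_rate)


if __name__ == "__main__":
    # Compare a few k values for 150 bp reads at 30x depth and 1% error.
    for k in (21, 61, 99, 127):
        print(k, round(expected_correct_kmer_depth(30, 150, k, 0.01), 2))
```

Under this simplified model, both the number of k-mers per read and the fraction of error-free k-mers shrink as k grows, which is why a large k only pays off with long reads or very high depth.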
Can we set k-max to a higher value? Does it make any sense?
As far as I know, the A5-Miseq pipeline makes use of very large k-mer sizes for MiSeq data. So I think it probably makes sense, but you still have to try and compare before going ahead.
To use a larger k value, two files should be modified:
- modify kMaxK to the value you want at https://github.com/voutcn/megahit/blob/master/definitions.h#L56
- modify the > 127 to the k-max you want at https://github.com/voutcn/megahit/blob/master/megahit#L374
Thanks, I'll give it a shot.
If I change the --k-max value in the opts.txt file and run megahit with the --continue flag, can I start an assembly where a previous assembly left off? This would be a quick way to see if k > 127 helps with my data. Thanks again!
@chuckpr No, you have to rerun it from the beginning. You can let k-max go as large as you want, and then evaluate the contigs assembled from different k in intermediate_contigs/k*.contigs.fa
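One way to do that evaluation is to compute contig count, N50, and maximum contig length for each intermediate file. This is a minimal sketch using only the Python standard library; it assumes the files are plain (uncompressed) FASTA named k<number>.contigs.fa, as in the path above, and the helper names are invented for illustration.

```python
import glob
import os
import re


def contig_lengths(fasta_path):
    """Return the sequence length of every record in a FASTA file."""
    lengths = []
    current = 0
    started = False
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if started:
                    lengths.append(current)
                current = 0
                started = True
            elif started:
                current += len(line)
    if started:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length of the contig at which the sorted cumulative sum reaches half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def k_of(path):
    """Extract k from a file name such as k99.contigs.fa (-1 if it does not match)."""
    match = re.search(r"k(\d+)\.contigs\.fa$", path)
    return int(match.group(1)) if match else -1


def summarize(out_dir):
    """Return (k, contig count, N50, max length) per intermediate file, ordered by k."""
    pattern = os.path.join(out_dir, "intermediate_contigs", "k*.contigs.fa")
    rows = []
    for path in sorted(glob.glob(pattern), key=k_of):
        lens = contig_lengths(path)
        rows.append((k_of(path), len(lens), n50(lens), max(lens, default=0)))
    return rows
```

Calling summarize("megahit_out") (the directory name here is an assumption; use your own output directory) and printing the rows shows at which k the contiguity stops improving.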
Honestly, I am not sure. Basically, I would not recommend setting k to larger than 100, unless the sequencing depth is very high
But I thought there is an iterative read correction beginning at lower kmer steps (I thought that was one of the points of doing assemblies using iterative kmer sizes)? Shouldn't that remove most such errors before you reach the high kmers?
For MiSeq data of up to 300 bp, I got huge improvements (in N50 and maximum contig length) with IDBA_UD after modifying it to allow k-mer lengths up to 251 bp. I am really looking forward to trying this with megahit, as megahit is far more user friendly.
But I have a question regarding the exact modification for megahit:
The first change (in https://github.com/voutcn/megahit/blob/master/definitions.h#L56) is straightforward. But what do I have to change here:
modify the > 127 to the k-max you want at https://github.com/voutcn/megahit/blob/master/megahit#L374
Can't find the value I am supposed to change here.
@jvollme if you are using the latest code, simply changing the value of kMaxK in definitions.h works.
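For anyone who prefers to script that change, here is a minimal sketch. It assumes kMaxK is declared in definitions.h as a plain integer assignment of the form kMaxK = <number>; that form is an assumption, so check the actual line in your checkout first. The helper name set_kmaxk is hypothetical and not part of megahit.

```python
import re

# Assumed declaration form: "... kMaxK = <integer> ...".
# Verify this against your copy of definitions.h before relying on it.
KMAXK_PATTERN = re.compile(r"(\bkMaxK\s*=\s*)\d+")


def set_kmaxk(header_text: str, new_value: int) -> str:
    """Return header_text with the kMaxK value replaced by new_value.

    Raises ValueError unless exactly one assignment is found, so an
    unexpected file layout is reported instead of silently ignored.
    """
    new_text, count = KMAXK_PATTERN.subn(
        lambda m: m.group(1) + str(new_value), header_text
    )
    if count != 1:
        raise ValueError(f"expected exactly one kMaxK assignment, found {count}")
    return new_text
```

Typical use would be to read definitions.h, pass its contents through set_kmaxk, write the result back, rebuild megahit, and rerun the assembly from the beginning with the larger --k-max.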
Thanks!
Maybe you could mention that info in your manual or readme somewhere? I think it would interest a number of people but may be hard to find (even though Megahit is quite well documented otherwise).