Comments (3)
Can we have that info on the manual?
Added.
Attention:
1. When extracting with BED/GTF from plain text FASTA files, the order of output sequences
is random. To keep the order, compress the FASTA file (input.fasta) and use the
compressed one (input.fasta.gz) as the input.
2. Use "seqkit grep" for extracting subsets of sequences.
"seqtk subseq seqs.fasta id.txt" is equivalent to
"seqkit grep -f id.txt seqs.fasta"
Would it be nice to add control over the subseq seed?
BED records are stored in a map (hash table), whose iteration order is random and cannot be controlled by a seed number.
For plain text FASTA input, the sequences that appear in the BED file are extracted quickly via the FASTA index.
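If the BED file's own line order matters, one possible workaround is to turn each BED record into an explicit region string and extract the regions one by one in that order. The sketch below only does the coordinate conversion (BED is 0-based and half-open, region strings are 1-based and inclusive). The function name bed_to_regions is hypothetical, strand is ignored, and passing the resulting id:start-end strings to a faidx-style tool is an assumption about your workflow, not something taken from the seqkit manual.

```shell
# Hypothetical helper: print one "id:start-end" region per BED record,
# in the same order as the lines of the BED file.
# BED start is 0-based, so 1 is added; BED end is exclusive, so it is kept as-is.
# Strand (column 6) is ignored in this sketch.
bed_to_regions() {
    awk 'BEGIN { FS = "\t" }
         !/^#/ && NF >= 3 { printf "%s:%d-%d\n", $1, $2 + 1, $3 }' "$1"
}

# Example (file name from the thread):
# bed_to_regions covid19_sample-positions.bed
```

Each printed line can then be handed to a tool that accepts such regions, for example in a while-read loop, so the output follows the BED order at the cost of one lookup per record.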
from seqkit.
Yes. For plain text FASTA input, the order of output records is random.
Try compressing the merged-gnm.fa file and use merged-gnm.fa.gz as the input.
Execution time increases but it works fine. Thank you!
Can we have that info on the manual?
Also, it would be nice to add control over the subseq seed, right?
Thanks a lot!
gzip merged-gnm.fa
for run in {1..4}; do
seqkit subseq --quiet --bed covid19_sample-positions.bed merged-gnm.fa.gz -j 12 | head && echo "" ;
done
>covid19_sample-conserved
GCACTAACTAAGTTCCTAACCACTT
>covid19_sample-negative
GGGCATAGTAAGGCAGTTT
>covid19_sample-uncertain
TACAGTCGTTGGTCCTCG
>covid19_sample-antisense
CTTTACTGTGGACCGTGGCA
>covid19_sample-deep
GGCAGTAATAGGTCTTAATACCA
>covid19_sample-conserved
GCACTAACTAAGTTCCTAACCACTT
>covid19_sample-negative
GGGCATAGTAAGGCAGTTT
>covid19_sample-uncertain
TACAGTCGTTGGTCCTCG
>covid19_sample-antisense
CTTTACTGTGGACCGTGGCA
>covid19_sample-deep
GGCAGTAATAGGTCTTAATACCA
>covid19_sample-conserved
GCACTAACTAAGTTCCTAACCACTT
>covid19_sample-negative
GGGCATAGTAAGGCAGTTT
>covid19_sample-uncertain
TACAGTCGTTGGTCCTCG
>covid19_sample-antisense
CTTTACTGTGGACCGTGGCA
>covid19_sample-deep
GGCAGTAATAGGTCTTAATACCA
>covid19_sample-conserved
GCACTAACTAAGTTCCTAACCACTT
>covid19_sample-negative
GGGCATAGTAAGGCAGTTT
>covid19_sample-uncertain
TACAGTCGTTGGTCCTCG
>covid19_sample-antisense
CTTTACTGTGGACCGTGGCA
>covid19_sample-deep
GGCAGTAATAGGTCTTAATACCA
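The four runs above were compared by eye. A checksum comparison can do the same check automatically. This is a minimal sketch: the function name stable_output is hypothetical, and it simply runs whatever command it is given several times and compares the checksums of the complete output.

```shell
# Hypothetical helper: run a command N times and report whether every run
# printed byte-identical output (same records in the same order).
# Usage: stable_output N command [args...]
stable_output() {
    runs=$1
    shift
    first=$("$@" | cksum)
    i=2
    while [ "$i" -le "$runs" ]; do
        current=$("$@" | cksum)
        if [ "$current" != "$first" ]; then
            echo "unstable"
            return 1
        fi
        i=$((i + 1))
    done
    echo "stable"
}

# Example with the command from this thread:
# stable_output 4 seqkit subseq --quiet --bed covid19_sample-positions.bed merged-gnm.fa.gz
```

Unlike piping through head, this compares the full output of each run, so a difference beyond the first ten lines is also caught.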