Comments (3)
The code has been updated to fix this problem. Thank you for raising this issue.
from crisp.
The BED file argument doesn't seem to work, so I'm posting my solution here in case anyone runs into a similar problem.
Take your BED file of chromosome regions and convert it to this format:
LG1:0-27754200
LG2:0-16089512
LG3:0-13619445
LG4:0-13404451
LG5:0-13896941
LG6:0-17789102
LG7:0-14198698
LG8:0-12717210
LG9:0-12354651
LG10:0-12360052
LG11:0-16352600
LG12:0-11514234
LG13:0-11279722
LG14:0-10670842
LG15:0-9534514
LG16:0-7238532
MT:0-16343
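If the regions start out as a standard tab-separated BED file (chromosome, start, end in the first three columns), the conversion can be sketched with awk. This is only a minimal sketch: the function name `bed_to_regions` is made up for illustration, and it assumes the BED coordinates can be passed through unchanged, as in the list above.

```shell
# Hypothetical helper: print one chrom:start-end line per BED record.
# Assumes a tab-separated BED file; header-style lines (#, track, browser) are skipped.
bed_to_regions() {
    awk 'BEGIN { FS = "\t" }
         !/^(#|track|browser)/ && NF >= 3 { print $1 ":" $2 "-" $3 }' "$1"
}
```

It would be used as `bed_to_regions regions.bed > crisp_regions.txt`, where `regions.bed` is a placeholder for your own file name.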
The next step can be skipped if you want to loop over the previous file, but I found it easier this way because of how SLURM arrays work.
Split the previous file (crisp_regions.txt) into one region file per chromosome using a script like this:
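i=0
# Write each line of crisp_regions.txt to its own file: 1_region.txt, 2_region.txt, ...
while IFS= read -r p; do
    i=$((i + 1))
    echo "$p" > "${i}_region.txt"
done < crisp_regions.txt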
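For anyone who would rather skip the split, one possible alternative is to pull the region for each array task straight out of crisp_regions.txt by line number. The sketch below assumes the task IDs start at 1 and match the line order of the file; `nth_region` is a hypothetical helper name, not part of CRISP or SLURM.

```shell
# Hypothetical helper: print line N (1-based) of a regions file, then stop reading.
# Arguments: $1 = regions file, $2 = line number
nth_region() {
    sed -n "${2}{p;q;}" "$1"
}
```

Inside the array job this could be called as `region=$(nth_region crisp_regions.txt "$SLURM_ARRAY_TASK_ID")`, with `"$region"` then passed to `--regions`.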
Then a SLURM array job can be submitted easily using this:
#!/bin/sh
#SBATCH --job-name="CRISP"
#SBATCH --output=/logs/CRISP%A_%a.log
#SBATCH --array=1-17
# Each array task reads its own single-line region file and runs CRISP on that region
while IFS= read -r line; do
    CRISP.binary \
        --bams full_bam_list.txt \
        --ref genome.fa \
        --poolsize 60 \
        --regions "$line" \
        --VCF crisp_"$SLURM_ARRAY_TASK_ID".vcf \
        --qvoffset 33 \
        --mbq 20 \
        --mmq 20 \
        --minc 5
done < "$SLURM_ARRAY_TASK_ID"_region.txt
Each chromosome is now processed separately as its own array task, which should speed things up and allow easy concatenation at the end.
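The concatenation itself could look roughly like the sketch below. It assumes every per-region VCF carries the same header lines, so the header is taken from the first file only and the data lines are appended in the order the files are given. `concat_vcfs` is a hypothetical name, and this is plain text processing rather than anything CRISP provides.

```shell
# Hypothetical helper: merge per-region VCFs given in the desired output order.
# Assumes all input files share the same header.
concat_vcfs() {
    # Header lines (starting with '#') from the first file only
    grep '^#' "$1" || true
    # Data lines from every file, in argument order
    for f in "$@"; do
        grep -v '^#' "$f" || true
    done
}
```

For the 17 tasks above, the files can be listed in numeric order with something like `concat_vcfs $(for i in $(seq 1 17); do echo "crisp_$i.vcf"; done) > crisp_all.vcf`. A plain `crisp_*.vcf` glob would sort crisp_10.vcf before crisp_2.vcf, so it is worth avoiding here.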
Thank you for posting this issue and providing the solution as well. The --regions option is better suited for analyzing chromosomes separately. The --bed and --regions options can be used together. The --bed option should only output variants in the intervals specified in the file; I will look in the code to see if there is an error.