Comments (4)
Regarding `mem_free`, I do see that Nextflow passes the memory value to another SGE option, but I'm not sure if we would also need to pass it explicitly to `mem_free`. This might be something we could ask Mark Miller.
from speaqeasy.
Could (should?) the `mem_free` option as used in the CountObjects step be scaled dynamically depending on the number of samples in `samples.manifest`?
After using a larger `mem_free` value in the `jhpce.config` file in a few small runs last week, I got an email from JHPCE support (Subj: "JHPCE Cluster Job RAM Exception Report") basically scolding me for wasting RAM on the cluster (I used `mf=24G` for the CountObjects step as I had a larger batch to process before that; using `mf=70G` for 2 cores essentially asks for 140G RAM, which would be an even bigger waste for regular size batches).
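To make the arithmetic in that comment concrete, here is a minimal Python sketch. It assumes, as the comment implies, that the memory value is requested per core (slot), so the total is the per-slot value times the number of cores. The function name `total_requested_gb` is hypothetical and not part of SPEAQeasy or SGE.

```python
import re

# Conversion factors to gigabytes for the unit suffixes handled here.
_TO_GB = {"M": 1 / 1024, "G": 1.0}


def total_requested_gb(mem_per_slot: str, slots: int) -> float:
    """Total RAM in GB implied by a per-slot request such as '70G'.

    Assumption: the scheduler multiplies the per-slot memory value
    by the number of slots (cores) requested for the job.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([MG])", mem_per_slot.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized memory value: {mem_per_slot!r}")
    value, unit = match.groups()
    return float(value) * _TO_GB[unit] * slots


if __name__ == "__main__":
    # The case from the comment: mf=70G on 2 cores.
    print(total_requested_gb("70G", 2))  # 140.0
```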
I don't think that this is super easy to calculate. Some software has a baseline usage regardless of the number of samples and/or the size of the input (like a FASTQ with 1 million reads vs one with 100 million). Also, from the number of samples it's hard to guess the actual size of the dataset (like number of reads).
Overall, you can initially dodge the JHPCE usage reports by using `bluejay`. I personally think that it's ok to overshoot a little bit, though I also try to keep track of my memory usage and adjust things manually if the default setting is way too high.
Having said that, if you want to try to give it a go, that'd be great. I think that in the past we had 2 settings: like a default and a "high mem" setting. So we basically had figured out some of these numbers for 2 broadly common scenarios.
I think a decent estimate for the number of reads (or total bases) in a sample can be made by looking at the input file sizes (multiply by ~3 for the compressed ones, I guess). Anyway, I think it would be good for now if the user could provide this as an option for a specific run, by modifying their own instance of the `run*.sh` script. I hope Nextflow allows this kind of override of values in `jhpce.config`.
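The file-size heuristic could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the ~3x factor for compressed files is the guess from the comment above, the `.gz`/`.bz2` suffix check is an assumption about how compressed inputs are named, and the function names, tier labels, and threshold are hypothetical placeholders, not values taken from SPEAQeasy or `jhpce.config`.

```python
import os
from typing import Iterable

# Assumption: compressed inputs are recognizable by these suffixes.
COMPRESSED_SUFFIXES = (".gz", ".bz2")
# Rough multiplier for compressed files, taken from the comment above.
COMPRESSION_FACTOR = 3


def estimated_uncompressed_bytes(paths: Iterable[str]) -> int:
    """Sum input file sizes, scaling compressed files by the rough factor."""
    total = 0
    for path in paths:
        size = os.path.getsize(path)
        if path.endswith(COMPRESSED_SUFFIXES):
            size *= COMPRESSION_FACTOR
        total += size
    return total


def pick_memory_tier(total_bytes: int, high_mem_threshold_bytes: int) -> str:
    """Choose between two tiers, mirroring the default/"high mem" idea.

    The threshold is a hypothetical, user-supplied value that would
    need to be calibrated against real runs.
    """
    return "high_mem" if total_bytes >= high_mem_threshold_bytes else "default"
```

A user could compute the tier once per run and then set the memory value by hand in their own copy of the run script. Whether that value can be overridden automatically is the open question raised above.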