broadinstitute / pooled-cell-painting-image-processing
License: BSD 3-Clause "New" or "Revised" License
but not hardcoded.
IJ.run("Canvas Size...", "width="+str(int(width)+1240)+" height="+str(int(height)+1410)+" position=Bottom-Right zero")
Right now, uploading the pipeline for step 2-3 causes an infinite recursion loop; should fix that.
Originally posted by @bethac07 in #1 (comment)
This is a bigger potential problem for steps of the workflow that are triggered by steps that make many files rather than one:
If the final several jobs of a step all finish at about the same time, several of them may try to trigger the next step; for example, when I just ran step 6 -> 7, it was triggered 3 times; frankly, this could have been much worse.
We'd need some sort of state variable that tells it not to run more than once; because of the latency of things like making queues, I'm not sure that's the best thing to do. We could do that + limit concurrency, but that would be sloooooow for things that have thousands of triggers. Maybe an SNS or SQS message? Need to ponder this.
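One way the "SQS message" idea could look, as a rough sketch rather than a worked-out design: if every job that thinks it should start the next step sends a message to an SQS FIFO queue with the same deduplication ID, SQS drops the repeats that arrive within its 5-minute deduplication window, so only one message is left to start the step. The function name, the message layout, and the plate/step values are assumptions made up for illustration.

```python
import hashlib
import json


def build_trigger_message(plate, next_step):
    """Build the message arguments for an SQS FIFO send_message call.

    Hypothetical helper: the plate/step naming is illustrative only.
    """
    # The same plate + step always hashes to the same ID, so near-simultaneous
    # triggers for one handoff share a deduplication ID.
    dedup_id = hashlib.sha256(f"{plate}:{next_step}".encode()).hexdigest()
    return {
        "MessageBody": json.dumps({"plate": plate, "step": next_step}),
        # FIFO queues require a group ID; one group per plate is an assumption.
        "MessageGroupId": plate,
        "MessageDeduplicationId": dedup_id,
    }


# Usage with a real queue (not run here; needs boto3 and AWS credentials):
#   sqs = boto3.client("sqs")
#   sqs.send_message(QueueUrl=queue_url, **build_trigger_message("Plate1", "7"))
```

This only collapses triggers that arrive within the deduplication window; jobs that finish further apart than that would still need a separate check.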
Move application-specific config stuff to lambda functions so only a single file needs to be modified
Auto calculate
Auto generate CellProfiler pipelines based on the information input in the metadata.
In some cases, like for things that will go into the stitching script, we may want a structure that's something like Plate/Well/Site, or Plate-Well/Site, but everything should end up in a site-specific folder.
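A minimal sketch of a helper that builds either layout, so every output lands in a site-specific folder regardless of which structure a step wants. The function name, the flag name, and the example plate/well/site values are assumptions for illustration, not names from the repo.

```python
from pathlib import PurePosixPath


def site_folder(plate, well, site, combine_plate_well=False):
    """Return the site-specific folder as a POSIX-style key prefix.

    combine_plate_well=False -> Plate/Well/Site
    combine_plate_well=True  -> Plate-Well/Site
    """
    if combine_plate_well:
        path = PurePosixPath(f"{plate}-{well}") / str(site)
    else:
        path = PurePosixPath(plate) / well / str(site)
    return str(path)


print(site_folder("Plate1", "A01", 3))
print(site_folder("Plate1", "A01", 3, combine_plate_well=True))
```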
To make mining against other arrayed sets easier, where possible we should do a find-and-replace for DAPI to DNA, ConA to ER, etc., so that things are named by compartment rather than by dye.
(This is easy to do if you have the whole repo open in an editor, i.e. VSCode, so let me know if it would be helpful for me to do so. Holding off for the moment since I'm not sure if you have active stuff you would like to push first, and I don't want to break mergeability.)
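For reference, the same find-and-replace could be scripted; this is a sketch under the assumption that a plain, case-sensitive substring swap is safe for the names involved. Only the two pairs named above are included; the rest of the mapping would need to be filled in by someone who knows the channel list.

```python
import re

# Dye name -> compartment name. Only the pairs mentioned in this issue.
DYE_TO_COMPARTMENT = {
    "DAPI": "DNA",
    "ConA": "ER",
}

# One alternation so each position in the text is rewritten at most once.
_DYE_PATTERN = re.compile("|".join(re.escape(dye) for dye in DYE_TO_COMPARTMENT))


def rename_channels(text):
    """Replace dye names with compartment names (case-sensitive substrings)."""
    return _DYE_PATTERN.sub(lambda m: DYE_TO_COMPARTMENT[m.group(0)], text)


print(rename_channels("CorrDAPI_ConA.tiff"))
```

Word boundaries are deliberately not used, since names joined by underscores (which count as word characters in regexes) would otherwise be skipped.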
New flag to pass for the fusion method.
Options:
fusion_method=[Max. Intensity]
fusion_method=[Linear Blending]
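A small sketch of how the flag might be validated and turned into the option fragment shown above before it is spliced into the stitching command string. The function name is hypothetical, and it is written without f-strings on the assumption that the stitching script runs under Jython 2.7.

```python
# The two options listed in this issue; anything else is rejected.
FUSION_METHODS = ("Max. Intensity", "Linear Blending")


def fusion_option(method):
    """Return the 'fusion_method=[...]' fragment for a supported method."""
    if method not in FUSION_METHODS:
        raise ValueError(
            "Unsupported fusion method %r; expected one of %r"
            % (method, FUSION_METHODS)
        )
    return "fusion_method=[" + method + "]"


print(fusion_option("Linear Blending"))
```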
It so happens that in our historical use cases, round == "too big to stitch without quartering", but this is not true universally, especially for other well sizes (12, 24, etc). We probably want to explicitly pass whether or not to quarter alongside round, because we only want to quarter if we absolutely have to.
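A sketch of what decoupling the two settings could look like, assuming they travel together in one settings object. All names here (`StitchSettings`, `well_is_round`, `quarter`, `from_legacy_round_flag`) are made up for illustration; the legacy helper reproduces the historical "round implies quartering" behaviour so existing configs keep working.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StitchSettings:
    """Hypothetical settings passed to the stitching step."""
    well_is_round: bool
    # Quartering is opt-in: off unless explicitly requested.
    quarter: bool = False


def from_legacy_round_flag(well_is_round):
    """Map the old single flag onto the two explicit settings."""
    return StitchSettings(well_is_round=well_is_round, quarter=well_is_round)


# A round 24-well plate that is small enough to stitch whole:
print(StitchSettings(well_is_round=True))
# Historical behaviour, where round always meant quartering:
print(from_legacy_round_flag(True))
```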
https://docs.aws.amazon.com/step-functions/latest/dg/welcome.html
Basically, rather than having an S3 trigger for each Lambda function, we move those S3 triggers to step functions, and have the S3 uploads trigger the step functions and have step functions handle lambda execution, such as a) a shutdown lambda for the previous step and b) a startup lambda for the next step.
Cons: it's one more thing to do, and to have to maintain.
Pro: we separate the logic and the workflow from the actual execution of the thing. That presumably makes it easier to adjust the logic if/when we need to, it auto-graphs the logic for us, etc.
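To make the proposal a bit more concrete, here is a rough sketch of a two-state definition in Amazon States Language for a single handoff: run the shutdown Lambda for the previous step, then the startup Lambda for the next one. The state names and the ARNs are placeholders, and how the S3 upload starts the state machine is left out.

```python
import json


def handoff_state_machine(shutdown_lambda_arn, startup_lambda_arn):
    """Return an Amazon States Language definition as a Python dict."""
    return {
        "Comment": "Shut down the previous step, then start the next one",
        "StartAt": "ShutdownPreviousStep",
        "States": {
            "ShutdownPreviousStep": {
                "Type": "Task",
                "Resource": shutdown_lambda_arn,
                "Next": "StartNextStep",
            },
            "StartNextStep": {
                "Type": "Task",
                "Resource": startup_lambda_arn,
                "End": True,
            },
        },
    }


# Placeholder ARNs; REGION, ACCOUNT_ID and function names are not real.
definition = handoff_state_machine(
    "arn:aws:lambda:REGION:ACCOUNT_ID:function:shutdown-previous-step",
    "arn:aws:lambda:REGION:ACCOUNT_ID:function:start-next-step",
)
print(json.dumps(definition, indent=2))
```

Keeping the definition as data is what gives the "auto-graphs the logic" benefit: the ordering lives in the state machine rather than inside each Lambda.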
There's a component of the lambda function that checks the FIFO queue to prevent the lambda from launching duplicate infrastructure. Currently, I need to bypass this part of the code to run a lambda function manually. Should this be incorporated into the version of the lambda functions we have, or should we just have instructions/code to create a FIFO queue so it can be used with manual triggering?
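If we go the "code to create a FIFO queue" route, a minimal sketch might look like the following. The helper name and the example queue name are assumptions, and turning on content-based deduplication is a choice made here for convenience, not something the existing lambdas are known to rely on.

```python
def fifo_queue_args(name):
    """Build keyword arguments for an SQS create_queue call for a FIFO queue."""
    # FIFO queue names must end in ".fifo".
    if not name.endswith(".fifo"):
        name += ".fifo"
    return {
        "QueueName": name,
        "Attributes": {
            # SQS attribute values are passed as strings.
            "FifoQueue": "true",
            # Assumption: deduplicate on message body so senders need no explicit ID.
            "ContentBasedDeduplication": "true",
        },
    }


# Usage with a real account (not run here; needs boto3 and AWS credentials):
#   sqs = boto3.client("sqs")
#   sqs.create_queue(**fifo_queue_args("my-dedup-queue"))
print(fifo_queue_args("my-dedup-queue"))
```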
See #17
(Moved here from https://github.com/broadinstitute/pooled-cell-painting-analysis/issues/83)
Right now, the workflow has several stages and/or proposed stages
This creates 2 pre-setup steps, and at least 6 handoffs. Right now, each one is manual, with manual quality checks at each. For each one, we need to a) decide how we're going to do file handling and b) decide if and how we will determine success (quantitative cutoff? How/where do we check it? Manual visual inspection of something? Same thing) or if we think it can just proceed with something like an AWS Lambda trigger.
Pre-1-2
Pre-5-6
1 to 2
2 to 3
2 to 4
3 to 9 - MANUAL
4 to 9 - MANUAL
5 to 6
6 to 7
7 to 8
8 to 9 - MANUAL
OPTIONAL BUT REALLY NICE
Misc notes
Would be nice to use SABER Bool to automatically double machine size for pipeline 1