vanvalenlab / deepcell-applications
Run DeepCell Applications
License: Other
I just want to make sure this gets seen, so I opened a new issue, since it's not completely related to the previous one.
As mentioned in #13 , I would like the way that --membrane-image is defined and used by default to be adjusted.
Thanks for considering this!
Cheers
Dear Mesmer team,
I have been working on incorporating Mesmer into our MCMICRO pipeline. There are a couple of things that we need to handle on our side.
However, I also noticed that Mesmer produces a 4-dimensional array, which crashes all of our downstream modules.
>>> import tifffile
>>> tifffile.imread( 'mesmer/mask.tif' ).shape
(1, 3138, 2509, 1)
In general, we have been storing probability maps as multi-channel images (where the different channels annotate pixels with probabilities that they belong to the background or different parts of the cell) and segmentation masks as plain 2-D arrays.
>>> tifffile.imread( 'unmicst/exemplar-001_Probabilities_1.tif' ).shape
(3, 3138, 2509)
>>> tifffile.imread( 'segmentation/unmicst-exemplar-001/cell.ome.tif' ).shape
(3138, 2509)
I was wondering what your motivation was for storing segmentation masks in 4-D arrays and whether it would make sense to introduce a new flag that would collapse this to a more conventional 2-D representation. We can then make sure that MCMICRO uses this flag on our end. This will help standardize segmentation masks produced by various modules for downstream quantification, clustering, etc.
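For anyone hitting this in the meantime, the singleton axes can be dropped on the consumer side with numpy; a minimal sketch, using a synthetic array in place of the real mask.tif:

```python
import numpy as np

# Synthetic stand-in for tifffile.imread('mesmer/mask.tif'),
# which comes back as (batch, height, width, channels) == (1, H, W, 1)
mask = np.zeros((1, 3138, 2509, 1), dtype=np.int32)

# np.squeeze drops the two singleton axes, leaving the
# conventional 2-D label image downstream modules expect
mask_2d = np.squeeze(mask)
print(mask_2d.shape)  # (3138, 2509)
```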
Please let me know your thoughts.
Best,
-Artem
Hi,
I have been working on image segmentation using mesmer through a Singularity image. I was able to run the pipeline on a DAPI tif file, but when I used a multi-channel image, it gave an error:
underlay of /usr/bin/nvidia-cuda-mps-control required more than 50 (512) bind mounts
2024-02-02 11:58:36.730947: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-02 11:58:38.607682: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9653 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:89:00.0, compute capability: 7.5
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
[2024-02-02 11:58:45,363]:[WARNING]:[tensorflow]: No training configuration found in save file, so the model was *not* compiled. Compile it manually.
[2024-02-02 12:02:22,355]:[WARNING]:[root]: Found constant value array in batch 0 and channel 1. Normalizing as zeros.
2024-02-02 12:02:37.277937: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8100
/usr/local/lib/python3.8/dist-packages/deepcell_toolbox/deep_watershed.py:108: UserWarning: h_maxima peak finding algorithm was selected, but the provided image is larger than 5k x 5k pixels. This will lead to slow prediction performance.
  warnings.warn('h_maxima peak finding algorithm was selected, '
/usr/local/lib/python3.8/dist-packages/deepcell_toolbox/deep_watershed.py:179: FutureWarning: `selem` is a deprecated argument name for `h_maxima`. It will be removed in version 1.0. Please use `footprint` instead.
  markers = h_maxima(image=maxima,
Segmentation fault
How could I fix this?
Thank you.
I am getting the following error when attempting to run the Docker container:
Traceback (most recent call last):
File "main.py", line 156, in <module>
compartment=ARGS.compartment
File "main.py", line 77, in run
nuclear_img = deepcell.utils.get_image(nuclear_path)
AttributeError: module 'deepcell.utils' has no attribute 'get_image'
Command executed:
docker run --rm -v "$PWD":/data vanvalenlab/deepcell-applications --nuclear-image /data/exemplar-001.ome.tif --output-directory /data
where exemplar-001.ome.tif is a stitched and registered image from MCMICRO's minimal working dataset. Assuming nextflow and Docker are installed, the .ome.tif can be reproduced with:
nextflow run labsyspharm/mcmicro/exemplar.nf --name exemplar-001 --path .
nextflow run labsyspharm/mcmicro --in ./exemplar-001 --stop-at registration
ls ./exemplar-001/registration/exemplar-001.ome.tif
Hi Deepcell-applications team,
we have written an nf-core DSL2 Nextflow module to run Mesmer as part of Nextflow imaging pipelines (nf-core/modules#3187). In an effort to support conda, docker, and singularity, nf-core prefers packages to be submitted to bioconda. This will also enable automatic Docker container creation via quay.io (through biocontainers). Is there any chance you could consider submitting mesmer to bioconda?
Cheers,
Florian
When opening a mask.tif created with mesmer through this script in windows photo viewer, it looks like a completely black image, but a mask generated using the web interface at https://www.deepcell.org/predict appears normally. The problem is not that the mask is actually a black image, because I can see it by opening it in imageJ. Any idea what differences in the local and web interfaces could lead to this?
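A likely explanation (an assumption, not confirmed here): the local script writes a raw integer label image whose values are small object IDs, which generic viewers render as near-black, while ImageJ auto-scales the display range. A quick sketch of rescaling a label mask for preview, using a synthetic mask:

```python
import numpy as np

# Synthetic stand-in for the integer label mask written by the script
mask = np.zeros((64, 64), dtype=np.int32)
mask[10:20, 10:20] = 1
mask[30:40, 30:40] = 2

# Object IDs 1, 2, ... are near-black in a 16/32-bit display range;
# stretching them across 0-255 makes any viewer show the objects
preview = (255.0 * mask / max(int(mask.max()), 1)).astype(np.uint8)
print(int(preview.max()))  # 255
```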
Dear Mesmer developers,
Happy New Year! Hope you are all doing well.
Users report problems reading LZW-compressed OME-TIFF files: https://forum.image.sc/t/converting-scn-files-to-ome-tiff-for-use-in-mcmicro/61675
Possible solution: pip install imagecodecs, which is used by tifffile to read LZW compression.
-Artem
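A quick way to check whether the optional codec package is present (a sketch; the exact error raised without it depends on the tifffile version):

```python
# tifffile delegates LZW (and most other) compressed-TIFF decoding to the
# optional `imagecodecs` package; without it, reading such files fails.
try:
    import imagecodecs  # noqa: F401
    available = True
except ImportError:
    available = False

print('imagecodecs installed:', available)
```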
The current base image relies on vanvalenlab/deepcell-tf:0.5.0-gpu. Unfortunately, the 0.5.0 release of deepcell does not support the MultiplexSegmentation Application functionality we are looking for. The base image must be upgraded to 0.6.0 as soon as it is available.
We'll likely need to specify some additional arguments. For example, passing in compartments as a flag, specifying the nuclear channel and membrane channel separately, potentially controlling some more of the post-processing parameters.
I'm not sure if just tacking those onto argparse is the best way to do it, or if there is some other way.
I wonder if having a basic testing framework set up would make sense? Basically just mock the application and make sure that the post-processing function gets the right flags?
I'm not sure if it makes sense to put this in the github itself, or maybe in deepcell-datasets. But I think it would be useful to have an example input image and output image, so that when people run the image they can check their inputs to make sure everything worked as intended.
Without caching the model weights in the image, the container will download the weights every time it runs, inflating the processing time.
Hi Mesmer developers,
we are routinely processing large .ome.tiff images, usually around 39000 x 39000 pixels, with a pixel resolution of 0.233 um/pixel and file sizes between 40-100 GB.
We have noticed that running these jobs requires a large amount of RAM to be available, otherwise they will fail.
Is there a way to reduce memory consumption on large images? Would using a GPU node reduce it?
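One workaround outside deepcell itself is to tile the image and predict per tile, bounding peak RAM to one tile at a time; `predict_tile` below is a stand-in for a real Mesmer `app.predict` call, and the naive stitching shown does not reconcile objects that cross tile borders:

```python
import numpy as np

def predict_tile(tile):
    # Stand-in for app.predict(tile[np.newaxis], image_mpp=0.233)[0, ..., 0];
    # a real run would return an integer label image for this tile
    return np.zeros(tile.shape[:2], dtype=np.int32)

def tiled_predict(image, tile=512):
    # Bound peak RAM to one tile at a time. Caveat: this naive version
    # does not merge objects that span tile borders (they get split).
    h, w = image.shape[:2]
    out = np.zeros((h, w), dtype=np.int32)
    next_id = 0
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            labels = predict_tile(image[y:y + tile, x:x + tile])
            # Offset each tile's IDs so labels stay unique across tiles
            out[y:y + tile, x:x + tile] = np.where(labels > 0, labels + next_id, 0)
            next_id += int(labels.max())
    return out

result = tiled_predict(np.zeros((1024, 1024, 2), dtype=np.float32))
print(result.shape)  # (1024, 1024)
```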
Thank you for your help with this question!
Hi Deepcell / Mesmer team,
we are using deepcell-applications / mesmer frequently in our Nextflow pipelines and created an nf-core module for other people to easily have access to the CLI for performing segmentation on tissue images: https://nf-co.re/modules/deepcell_mesmer
We are having some problems with the container when running it in Nextflow due to the ENTRYPOINT, which requires specific Nextflow settings that we would like to avoid. Furthermore, these issues now trickle down to Wave container generation.
We think a simple fix for this would be to remove the ENTRYPOINT layer from the container and let users specify python run_app.py themselves when running the container (see deepcell-applications/Dockerfile, line 18 at 8aae5b6).
Would this be something you would consider doing for a new version of the container?
In setting up a workflow using vanvalenlab/deepcell-applications:0.3.0, I have hit an issue with the --compartment argument. Everything works as expected for --compartment "nuclear"; however, trying --compartment "both" fails with:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:375: UserWarning: The `lr` argument is deprecated, use `learning_rate` instead.
"The `lr` argument is deprecated, use `learning_rate` instead.")
usage: run_app.py mesmer [-h] [--output-directory OUTPUT_DIRECTORY]
[--output-name OUTPUT_NAME]
[-L {DEBUG,INFO,WARN,ERROR,CRITICAL}] [--squeeze]
--nuclear-image NUCLEAR_PATH
[--nuclear-channel NUCLEAR_CHANNEL [NUCLEAR_CHANNEL ...]]
[--membrane-image MEMBRANE_PATH]
[--membrane-channel MEMBRANE_CHANNEL [MEMBRANE_CHANNEL ...]]
[--image-mpp IMAGE_MPP] [--batch-size BATCH_SIZE]
[--compartment {nuclear,membrane,whole-cell}]
run_app.py mesmer: error: argument --compartment/-c: invalid choice: 'both' (choose from 'nuclear', 'membrane', 'whole-cell')
While --compartment "membrane" fails with:
Traceback (most recent call last):
File "/usr/src/app/run_app.py", line 180, in <module>
output = app.predict(image, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/deepcell/applications/mesmer.py", line 310, in predict
postprocess_kwargs=postprocess_kwargs)
File "/usr/local/lib/python3.6/dist-packages/deepcell/applications/application.py", line 448, in _predict_segmentation
label_image = self._postprocess(output_images, **postprocess_kwargs)
File "/usr/local/lib/python3.6/dist-packages/deepcell/applications/application.py", line 215, in _postprocess
image = self.postprocessing_fn(image, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/deepcell/applications/mesmer.py", line 135, in mesmer_postprocess
'Must be one of {}'.format(compartment, valid_compartments))
ValueError: Invalid compartment supplied: membrane. Must be one of ['whole-cell', 'nuclear', 'both']
Happy to provide more details if helpful.
Would it be possible to drop ENTRYPOINT from the Dockerfile, or alternatively build a parallel image without it?
When wrapped by Nextflow, the container is executed as:
docker run -i --cpus 1.0 -v /workspace:/workspace -w "$PWD" --name $NXF_BOXID vanvalenlab/deepcell-applications:0.4.0 /bin/bash ...
which a) resets the work directory and b) runs the command through /bin/bash. The former results in:
python: can't open file 'run_app.py': [Errno 2] No such file or directory
because the ENTRYPOINT directive doesn't specify the full path and relies on WORKDIR, which gets overwritten at runtime. If the ENTRYPOINT is modified to specify the full path to run_app.py, the /bin/bash wrapper then creates the following problem:
usage: run_app.py [-h] {mesmer} ...
run_app.py: error: argument app: invalid choice: '/bin/bash' (choose from 'mesmer')
My understanding is that Nextflow is not unique in this, and other workflow languages would also have issues with the entrypoint.
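Until the image changes, Docker's own --entrypoint flag can override the baked-in entrypoint at run time; a sketch (paths and arguments are illustrative, with /usr/src/app taken from the tracebacks in this thread):

```shell
# Override the image's ENTRYPOINT so an arbitrary shell command can run
docker run --rm -v "$PWD":/data --entrypoint /bin/bash \
  vanvalenlab/deepcell-applications:0.4.0 \
  -c "cd /usr/src/app && python run_app.py mesmer \
        --nuclear-image /data/input.tif --output-directory /data"
```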
Consider adding support for multi-channel .ome.tiff, by re-parameterizing
--nuclear-image <file.tif> --membrane-image <file.tif>
to
--image <file.ome.tif> --nuclear-channel <int> --membrane-channel <int>
As a possible feature, allow multiple indices to be specified, renaming the parameters to --nuclear-channels and --membrane-channels accordingly.
Example code for reading .ome.tiff and other formats:
https://github.com/HMS-IDAC/UnMicst/blob/340a31de8873fa247ea502e778ef14acee56984e/UnMicst2.py#L754-L764
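The proposed flags could be wired up roughly as below (a sketch; the (C, Y, X) channel-first layout and the [nuclear, membrane] channel order for the Mesmer input are assumptions):

```python
import numpy as np

def build_mesmer_input(img, nuclear_channels, membrane_channels):
    # img: (C, Y, X) array, e.g. from tifffile.imread(path).
    # Sum the requested channel indices, then stack into the
    # (batch, Y, X, [nuclear, membrane]) layout Mesmer consumes.
    nuclear = img[list(nuclear_channels)].sum(axis=0)
    membrane = img[list(membrane_channels)].sum(axis=0)
    return np.stack([nuclear, membrane], axis=-1)[np.newaxis, ...]

img = np.zeros((3, 128, 96), dtype=np.float32)
print(build_mesmer_input(img, [0], [1, 2]).shape)  # (1, 128, 96, 2)
```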
Summary of our discussion on next steps for making the application more useful.
It's unclear how this currently implements batch processing. I see the batch-size parameter, but passing in a multi-slice tiff (single channel, for nuclear segmentation) only processes the first slice. I currently have a pipeline where the entire process relaunches between slices; I assume this is inefficient.
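If the application only reads the first slice, one interim approach is to stack the slices along the batch axis before prediction; `predict_batch` below stands in for the real `app.predict(batch, batch_size=...)` call (shapes only, an assumption about the API):

```python
import numpy as np

def predict_batch(batch):
    # Stand-in for app.predict(batch, batch_size=4):
    # one label channel per batch element, shapes only
    return np.zeros(batch.shape[:3] + (1,), dtype=np.int32)

# Multi-slice, single-channel TIFF as read by tifffile: (slices, Y, X)
slices = np.zeros((10, 256, 256), dtype=np.float32)

# Pair each slice with an empty membrane channel and let the
# batch axis carry the slices, so all are segmented in one call
batch = np.stack([slices, np.zeros_like(slices)], axis=-1)  # (10, 256, 256, 2)
masks = predict_batch(batch)
print(masks.shape)  # (10, 256, 256, 1)
```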