vanvalenlab / deepcell-applications
Run DeepCell Applications
License: Other
I just want to make sure this gets seen, so I opened a new issue, since it's not completely related to the previous one.
As mentioned in #13 , I would like the way that --membrane-image is defined and used by default to be adjusted.
Thanks for considering this!
Cheers
Dear Mesmer team,
I have been working on incorporating Mesmer into our MCMICRO pipeline. There are a couple of things that we need to handle on our side.
However, I also noticed that Mesmer produces a 4-dimensional array, which crashes all of our downstream modules.
>>> import tifffile
>>> tifffile.imread( 'mesmer/mask.tif' ).shape
(1, 3138, 2509, 1)
In general, we have been storing probability maps as multi-channel images (where the different channels annotate pixels with probabilities that they belong to the background or different parts of the cell) and segmentation masks as plain 2-D arrays.
>>> tifffile.imread( 'unmicst/exemplar-001_Probabilities_1.tif' ).shape
(3, 3138, 2509)
>>> tifffile.imread( 'segmentation/unmicst-exemplar-001/cell.ome.tif' ).shape
(3138, 2509)
I was wondering what your motivation was for storing segmentation masks in 4-D arrays and whether it would make sense to introduce a new flag that would collapse this to a more conventional 2-D representation. We can then make sure that MCMICRO uses this flag on our end. This will help standardize segmentation masks produced by various modules for downstream quantification, clustering, etc.
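For anyone hitting this in the meantime, the singleton axes can be dropped on the consumer side with numpy; a minimal sketch, using a synthetic array in place of the real mask.tif:

```python
import numpy as np

# Synthetic stand-in for tifffile.imread('mesmer/mask.tif'),
# which comes back as (batch, height, width, channels) == (1, H, W, 1)
mask = np.zeros((1, 3138, 2509, 1), dtype=np.int32)

# np.squeeze drops the two singleton axes, leaving the
# conventional 2-D label image downstream modules expect
mask_2d = np.squeeze(mask)
print(mask_2d.shape)  # (3138, 2509)
```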
Please let me know your thoughts.
Best,
-Artem
Hi,
I have been working on image segmentation using mesmer through a Singularity image. I was able to run the pipeline on a DAPI tif file, but when I used a multi-channel image, it gave an error:
underlay of /usr/bin/nvidia-cuda-mps-control required more than 50 (512) bind mounts
2024-02-02 11:58:36.730947: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-02 11:58:38.607682: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9653 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:89:00.0, compute capability: 7.5
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
[2024-02-02 11:58:45,363]:[WARNING]:[tensorflow]: No training configuration found in save file, so the model was *not* compiled. Compile it manually.
[2024-02-02 12:02:22,355]:[WARNING]:[root]: Found constant value array in batch 0 and channel 1. Normalizing as zeros.
2024-02-02 12:02:37.277937: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8100
/usr/local/lib/python3.8/dist-packages/deepcell_toolbox/deep_watershed.py:108: UserWarning: h_maxima peak finding algorithm was selected, but the provided image is larger than 5k x 5k pixels. This will lead to slow prediction performance.
  warnings.warn('h_maxima peak finding algorithm was selected, '
/usr/local/lib/python3.8/dist-packages/deepcell_toolbox/deep_watershed.py:179: FutureWarning: `selem` is a deprecated argument name for `h_maxima`. It will be removed in version 1.0. Please use `footprint` instead.
  markers = h_maxima(image=maxima,
Segmentation fault
How could I fix this?
Thank you.
I am getting the following error when attempting to run the Docker container:
Traceback (most recent call last):
File "main.py", line 156, in <module>
compartment=ARGS.compartment
File "main.py", line 77, in run
nuclear_img = deepcell.utils.get_image(nuclear_path)
AttributeError: module 'deepcell.utils' has no attribute 'get_image'
Command executed:
docker run --rm -v "$PWD":/data vanvalenlab/deepcell-applications --nuclear-image /data/exemplar-001.ome.tif --output-directory /data
where exemplar-001.ome.tif is a stitched and registered image from MCMICRO's minimal working dataset. Assuming nextflow and Docker are installed, the .ome.tif can be reproduced with:
nextflow run labsyspharm/mcmicro/exemplar.nf --name exemplar-001 --path .
nextflow run labsyspharm/mcmicro --in ./exemplar-001 --stop-at registration
ls ./exemplar-001/registration/exemplar-001.ome.tif
Hi Deepcell-applications team,
we have written an nf-core DSL2 Nextflow module to run Mesmer as part of Nextflow imaging pipelines (nf-core/modules#3187). In an effort to support conda, docker, and singularity, nf-core prefers packages to be submitted to bioconda. This will also enable automatic Docker container creation via quay.io (through biocontainers). Is there any chance you could consider submitting mesmer to bioconda?
Cheers,
Florian
When opening a mask.tif created with mesmer through this script in windows photo viewer, it looks like a completely black image, but a mask generated using the web interface at https://www.deepcell.org/predict appears normally. The problem is not that the mask is actually a black image, because I can see it by opening it in imageJ. Any idea what differences in the local and web interfaces could lead to this?
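A likely explanation (an assumption, not confirmed here): the local script writes a raw integer label image whose values are small object IDs, which generic viewers render as near-black, while ImageJ auto-scales the display range. A quick sketch of rescaling a label mask for preview, using a synthetic mask:

```python
import numpy as np

# Synthetic stand-in for the integer label mask written by the script
mask = np.zeros((64, 64), dtype=np.int32)
mask[10:20, 10:20] = 1
mask[30:40, 30:40] = 2

# Object IDs 1, 2, ... are near-black in a 16/32-bit display range;
# stretching them across 0-255 makes any viewer show the objects
preview = (255.0 * mask / max(int(mask.max()), 1)).astype(np.uint8)
print(int(preview.max()))  # 255
```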
Dear Mesmer developers,
Happy New Year! Hope you are all doing well.
Users report problems reading LZW-compressed OME-TIFF files: https://forum.image.sc/t/converting-scn-files-to-ome-tiff-for-use-in-mcmicro/61675
Possible solution: pip install imagecodecs, which is used by tifffile to read LZW compression.
-Artem
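A quick way to check whether the optional codec package is present (a sketch; the exact error raised without it depends on the tifffile version):

```python
# tifffile delegates LZW (and most other) compressed-TIFF decoding to the
# optional `imagecodecs` package; without it, reading such files fails.
try:
    import imagecodecs  # noqa: F401
    available = True
except ImportError:
    available = False

print('imagecodecs installed:', available)
```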
The current base image relies on vanvalenlab/deepcell-tf:0.5.0-gpu. Unfortunately, the 0.5.0 release of deepcell does not support the MultiplexSegmentation Application functionality we are looking for. The base image must be upgraded to 0.6.0 as soon as it is available.
We'll likely need to specify some additional arguments. For example, passing in compartments as a flag, specifying the nuclear channel and membrane channel separately, potentially controlling some more of the post-processing parameters.
I'm not sure if just tacking those onto argparse is the best way to do it, or if there is some other way.
I wonder if having a basic testing framework set up would make sense? Basically just mock the application and make sure that the post-processing function gets the right flags?
I'm not sure if it makes sense to put this in the github itself, or maybe in deepcell-datasets. But I think it would be useful to have an example input image and output image, so that when people run the image they can check their inputs to make sure everything worked as intended.
Without caching the model weights in the image, the container will download the weights every time it runs, inflating the processing time.
Hi Mesmer developers,
we are routinely processing large .ome.tiff images, usually around 39000 x 39000 pixels, with a pixel resolution of 0.233 um/pixel and file sizes between 40-100 GB.
We have noticed that running these jobs requires a large amount of RAM to be available, otherwise they will fail.
Is there a way to reduce memory consumption on large images? Would using a GPU node reduce it?
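One workaround outside deepcell itself is to tile the image and predict per tile, bounding peak RAM to one tile at a time; `predict_tile` below is a stand-in for a real Mesmer `app.predict` call, and the naive stitching shown does not reconcile objects that cross tile borders:

```python
import numpy as np

def predict_tile(tile):
    # Stand-in for app.predict(tile[np.newaxis], image_mpp=0.233)[0, ..., 0];
    # a real run would return an integer label image for this tile
    return np.zeros(tile.shape[:2], dtype=np.int32)

def tiled_predict(image, tile=512):
    # Bound peak RAM to one tile at a time. Caveat: this naive version
    # does not merge objects that span tile borders (they get split).
    h, w = image.shape[:2]
    out = np.zeros((h, w), dtype=np.int32)
    next_id = 0
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            labels = predict_tile(image[y:y + tile, x:x + tile])
            # Offset each tile's IDs so labels stay unique across tiles
            out[y:y + tile, x:x + tile] = np.where(labels > 0, labels + next_id, 0)
            next_id += int(labels.max())
    return out

result = tiled_predict(np.zeros((1024, 1024, 2), dtype=np.float32))
print(result.shape)  # (1024, 1024)
```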
Thank you for your help with this question!
Hi Deepcell / Mesmer team,
we are using deepcell-applications / mesmer frequently in our Nextflow pipelines and created an nf-core module for other people to easily have access to the CLI for performing segmentation on tissue images: https://nf-co.re/modules/deepcell_mesmer
We are having some problems with the container when running it in Nextflow due to the ENTRYPOINT, which requires specific Nextflow settings that we would like to avoid. Furthermore, these issues now trickle down to Wave container generation.
We think a simple fix for this would be to remove the ENTRYPOINT layer from the container and let users specify python run_app.py themselves when running the container (see deepcell-applications/Dockerfile, line 18 at 8aae5b6).
Would this be something you would consider doing for a new version of the container?
In setting up a workflow using vanvalenlab/deepcell-applications:0.3.0, I have hit an issue with the --compartment argument. Everything works as expected for --compartment "nuclear"; however, trying --compartment "both" fails with:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:375: UserWarning: The `lr` argument is deprecated, use `learning_rate` instead.
"The `lr` argument is deprecated, use `learning_rate` instead.")
usage: run_app.py mesmer [-h] [--output-directory OUTPUT_DIRECTORY]
[--output-name OUTPUT_NAME]
[-L {DEBUG,INFO,WARN,ERROR,CRITICAL}] [--squeeze]
--nuclear-image NUCLEAR_PATH
[--nuclear-channel NUCLEAR_CHANNEL [NUCLEAR_CHANNEL ...]]
[--membrane-image MEMBRANE_PATH]
[--membrane-channel MEMBRANE_CHANNEL [MEMBRANE_CHANNEL ...]]
[--image-mpp IMAGE_MPP] [--batch-size BATCH_SIZE]
[--compartment {nuclear,membrane,whole-cell}]
run_app.py mesmer: error: argument --compartment/-c: invalid choice: 'both' (choose from 'nuclear', 'membrane', 'whole-cell')
While --compartment "membrane" fails with:
Traceback (most recent call last):
File "/usr/src/app/run_app.py", line 180, in <module>
output = app.predict(image, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/deepcell/applications/mesmer.py", line 310, in predict
postprocess_kwargs=postprocess_kwargs)
File "/usr/local/lib/python3.6/dist-packages/deepcell/applications/application.py", line 448, in _predict_segmentation
label_image = self._postprocess(output_images, **postprocess_kwargs)
File "/usr/local/lib/python3.6/dist-packages/deepcell/applications/application.py", line 215, in _postprocess
image = self.postprocessing_fn(image, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/deepcell/applications/mesmer.py", line 135, in mesmer_postprocess
'Must be one of {}'.format(compartment, valid_compartments))
ValueError: Invalid compartment supplied: membrane. Must be one of ['whole-cell', 'nuclear', 'both']
Happy to provide more details if helpful.
Would it be possible to drop ENTRYPOINT from the Dockerfile, or alternatively build a parallel image without it?
When wrapped by Nextflow, the container is executed as:
docker run -i --cpus 1.0 -v /workspace:/workspace -w "$PWD" --name $NXF_BOXID vanvalenlab/deepcell-applications:0.4.0 /bin/bash ...
which a) resets the work directory and b) runs the command through /bin/bash. The former results in:
python: can't open file 'run_app.py': [Errno 2] No such file or directory
because the ENTRYPOINT directive doesn't specify the full path and relies on WORKDIR, which gets overwritten at runtime. If the ENTRYPOINT is modified to specify the full path to run_app.py, the /bin/bash wrapper then creates the following problem:
usage: run_app.py [-h] {mesmer} ...
run_app.py: error: argument app: invalid choice: '/bin/bash' (choose from 'mesmer')
My understanding is that Nextflow is not unique in this, and other workflow languages would also have issues with the entrypoint.
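Until the image changes, Docker's own --entrypoint flag can override the baked-in entrypoint at run time; a sketch (paths and arguments are illustrative, with /usr/src/app taken from the tracebacks in this thread):

```shell
# Override the image's ENTRYPOINT so an arbitrary shell command can run
docker run --rm -v "$PWD":/data --entrypoint /bin/bash \
  vanvalenlab/deepcell-applications:0.4.0 \
  -c "cd /usr/src/app && python run_app.py mesmer \
        --nuclear-image /data/input.tif --output-directory /data"
```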
Consider adding support for multi-channel .ome.tiff, by re-parameterizing
--nuclear-image <file.tif> --membrane-image <file.tif>
to
--image <file.ome.tif> --nuclear-channel <int> --membrane-channel <int>
As a possible feature, allow multiple indices to be specified, renaming the parameters to --nuclear-channels and --membrane-channels accordingly.
Example code for reading .ome.tiff and other formats:
https://github.com/HMS-IDAC/UnMicst/blob/340a31de8873fa247ea502e778ef14acee56984e/UnMicst2.py#L754-L764
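The proposed flags could be wired up roughly as below (a sketch; the (C, Y, X) channel-first layout and the [nuclear, membrane] channel order for the Mesmer input are assumptions):

```python
import numpy as np

def build_mesmer_input(img, nuclear_channels, membrane_channels):
    # img: (C, Y, X) array, e.g. from tifffile.imread(path).
    # Sum the requested channel indices, then stack into the
    # (batch, Y, X, [nuclear, membrane]) layout Mesmer consumes.
    nuclear = img[list(nuclear_channels)].sum(axis=0)
    membrane = img[list(membrane_channels)].sum(axis=0)
    return np.stack([nuclear, membrane], axis=-1)[np.newaxis, ...]

img = np.zeros((3, 128, 96), dtype=np.float32)
print(build_mesmer_input(img, [0], [1, 2]).shape)  # (1, 128, 96, 2)
```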
Summary of our discussion on next steps for making the application more useful.
It's unclear how this currently implements batch processing. I see the batch-size parameter, but passing in a multi-slice tiff (single channel, for nuclear segmentation) only processes the first slice. I currently have a pipeline where the entire process relaunches between slices; I assume this is inefficient.
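If the application only reads the first slice, one interim approach is to stack the slices along the batch axis before prediction; `predict_batch` below stands in for the real `app.predict(batch, batch_size=...)` call (shapes only, an assumption about the API):

```python
import numpy as np

def predict_batch(batch):
    # Stand-in for app.predict(batch, batch_size=4):
    # one label channel per batch element, shapes only
    return np.zeros(batch.shape[:3] + (1,), dtype=np.int32)

# Multi-slice, single-channel TIFF as read by tifffile: (slices, Y, X)
slices = np.zeros((10, 256, 256), dtype=np.float32)

# Pair each slice with an empty membrane channel and let the
# batch axis carry the slices, so all are segmented in one call
batch = np.stack([slices, np.zeros_like(slices)], axis=-1)  # (10, 256, 256, 2)
masks = predict_batch(batch)
print(masks.shape)  # (10, 256, 256, 1)
```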