fish-quant / big-fish
Toolbox for the analysis of smFISH images.
Home Page: https://big-fish.readthedocs.io/en/stable/
License: BSD 3-Clause "New" or "Revised" License
It would be nice to have the option to specify the delimiter in this function. It calls save_data_to_csv here: https://github.com/fish-quant/big-fish/blob/master/bigfish/stack/postprocess.py#L559, which already provides this option.
Hi I just installed Big-FISH but I am a bit lost on how to use it. I created the virtual environment with conda, activated it, and installed Big-FISH. I recently performed RNAScope on two genes. I have 3 channel z-stacks (DAPI, Cy3, Cy5) acquired on the Zeiss Z2 epifluorescence microscope and saved as .CZI files. How can I load these files into Big-FISH so that I can start the analysis?
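Big-FISH itself works on plain numpy arrays, so .czi files are usually read with a third-party reader such as czifile (czifile.imread) or AICSImageIO and then split per channel. The snippet below sketches only the splitting step on a simulated array; the axis layout of a real .czi varies with the acquisition, so check it before slicing:

```python
import numpy as np

def split_channels(arr, channel_axis=0):
    """Split a multi-channel z-stack into one (z, y, x) array per channel."""
    arr = np.squeeze(arr)  # drop singleton axes (scene, time, ...)
    return [np.take(arr, c, axis=channel_axis) for c in range(arr.shape[channel_axis])]

# Simulate a 3-channel acquisition with layout (scene, channel, z, y, x);
# a real array would come from e.g. czifile.imread("image.czi").
fake_czi = np.zeros((1, 3, 20, 64, 64), dtype=np.uint16)
dapi, cy3, cy5 = split_channels(fake_czi, channel_axis=0)
print(dapi.shape)  # (20, 64, 64)
```

Each (z, y, x) channel can then be passed to the bigfish.stack and bigfish.detection functions.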
Hi,
Could you edit the plot elbow function so that it returns the detected optimal threshold in addition to the graph?
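Until such an option exists, the optimal threshold can be recomputed outside the plotting function with the same kind of elbow heuristic. The sketch below uses one common variant (the point farthest from the chord joining the curve's endpoints); it is an illustration, not big-fish's exact internal logic:

```python
import numpy as np

def breaking_point(x, y):
    """Return the x value at the 'elbow' of a decreasing L-shaped curve,
    taken as the point farthest from the chord joining the endpoints."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # normalise both axes so distances are comparable
    xn = (x - x.min()) / (x.max() - x.min())
    yn = (y - y.min()) / (y.max() - y.min())
    # perpendicular distance from each point to the endpoint chord
    dx, dy = xn[-1] - xn[0], yn[-1] - yn[0]
    dist = np.abs(dy * xn - dx * yn + dx * yn[0] - dy * xn[0]) / np.hypot(dx, dy)
    return x[int(np.argmax(dist))]

thresholds = np.arange(11)
count_spots = np.array([100, 50, 25, 12, 6, 5, 4, 3, 2, 1, 0])
print(breaking_point(thresholds, count_spots))  # 3.0
```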
Hi,
I am not sure about spot detection: can the user set different thresholds for spots in nuclei and spots in the cytoplasm?
We want to prevent confusion between the cluster decomposition step and the cluster detection step (foci, transcription sites, etc.).
Cluster decomposition -> dense decomposition
foci detection -> clusters detection
The function extract_cell allows combining spot/foci detection results with cell/nuclei segmentation results. Cell segmentation results are stored as a label image, where each segmented entity has a unique pixel value that can serve as an identifier. It would be good to store this value in cell_results, maybe as index_cell or something like that. This would allow cell_results to be linked directly with the label image.
We should add a generic function to tell if a specific coordinate is in the nucleus or not. It could be used with the spots, the foci, whatever.
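A minimal sketch of such a helper, assuming coordinates in (z)yx order and a 2-d nucleus mask or label image (the function name is illustrative, not an existing big-fish API):

```python
import numpy as np

def is_in_nucleus(coords, nuc_mask):
    """Return one boolean per coordinate: True if it falls in the nucleus.
    coords is an (n, 2) or (n, 3) integer array in (z)yx order; only the
    last two columns are used to index the 2-d mask."""
    y = coords[:, -2]
    x = coords[:, -1]
    return nuc_mask[y, x] > 0

nuc_mask = np.zeros((10, 10), dtype=np.uint8)
nuc_mask[2:5, 2:5] = 1
spots = np.array([[0, 3, 3], [0, 8, 8]])  # one spot inside, one outside
print(is_in_nucleus(spots, nuc_mask).tolist())  # [True, False]
```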
Super-expressing cells are slowing down the analysis and I'm not confident that they are being quantified accurately.
Any ideas on how to remove these cells from the quantification within bigfish?
My thought would be to have a segment-first mode where you segment the cells first and then run spot detection on each cell individually. But I don't know how much this would slow things down.
The super-expressing cells could then be treated as one big cluster, and then the number of spots could be estimated based on integrated density of an average single spot.
Is this problem too obscure?
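The analogue estimate mentioned above could be sketched like this (a hypothetical helper, not an existing big-fish function; the numbers are made up):

```python
import numpy as np

def estimate_spots_in_cluster(cluster_intensity_sum, single_spot_sums):
    """Estimate the number of RNAs in a saturated cluster as its integrated
    intensity divided by the median integrated intensity of single spots."""
    ref = np.median(single_spot_sums)
    return int(round(cluster_intensity_sum / ref))

# e.g. a cluster with integrated density 5200 vs. single spots around 50
print(estimate_spots_in_cluster(5200.0, [48.0, 50.0, 52.0]))  # 104
```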
Hi all,
Thanks a lot for developing this great software. I wanted to ask if you are planning to update the version of pandas. I noticed that you are using pandas 0.24.0. Unfortunately, this version is quite old, and it is causing some issues when I try to integrate Big-Fish with other modules or when used in Google Colab.
Best,
Luis
Is this already possible? Would be very helpful!
j
Is there a way within big-fish to manually segment cells, similar to the tool included in the Matlab-based FISHQuant? The current methods of segmenting cells (U-Net or watershed) are challenging to use for neurons and do not allow me to segment cells the way I need to. Or is there another program recommended for manual segmentation? Thanks!
We should gather information from cell extraction in a dataframe-like object and export it in a readable format.
Might be useful to permit header rows in the results files when read from a csv file (to indicate what the columns contain). To read those, a slight modification in the function would be sufficient: allowing access to the skiprows parameter of numpy.loadtxt.
The function multistack.from_binary_to_coord() returns external boundary coordinates. This helps rebuild the original mask with multistack.from_coord_to_surface(), but it might lead to inconsistencies when the coordinates are processed directly. The user should be able to extract cell (and nucleus) coordinates with either external or internal boundaries, according to their needs.
The notebooks pointed to by the link on the main page ( https://mybinder.org/v2/gh/fish-quant/fq-imjoy/binder?urlpath=git-pull%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252Ffish-quant%252Fbig-fish-examples%26urlpath%3Dtree%252Fbig-fish-examples%252Fnotebooks%26branch%3Dmaster )
fail with this error:
Step 40/53 : RUN ${KERNEL_PYTHON_PREFIX}/bin/pip install --no-cache-dir -r "requirements.txt"
---> Running in 5b17aeb977fd
Collecting big-fish==0.6.2
Downloading big_fish-0.6.2-py3-none-any.whl (123 kB)
ERROR: Could not find a version that satisfies the requirement imjoy>=0.11.38 (from versions: 0.7.20, 0.7.22, 0.7.24, 0.7.25, 0.7.26, 0.7.28, 0.7.30, 0.7.31, 0.7.40, 0.7.41, 0.7.50, 0.7.51, 0.7.52, 0.7.53, 0.7.54, 0.7.56, 0.7.57, 0.7.58, 0.7.59, 0.7.60, 0.7.63, 0.7.64, 0.7.65, 0.7.66, 0.7.67, 0.7.68, 0.8.7, 0.8.8, 0.8.13, 0.8.14, 0.8.16, 0.8.17, 0.8.18, 0.8.19, 0.8.20, 0.8.21, 0.8.22, 0.9.0, 0.9.1, 0.9.2, 0.9.3, 0.9.4, 0.9.5, 0.9.6, 0.9.7, 0.9.9, 0.9.10, 0.9.11, 0.9.12, 0.10.0, 0.10.2, 0.10.3, 0.10.4, 0.10.5, 0.10.6, 0.10.7, 0.10.8, 0.10.9, 0.10.10, 0.10.12, 0.11.0, 0.11.1, 0.11.2, 0.11.3, 0.11.4, 0.11.5, 0.11.6, 0.11.7, 0.11.8, 0.11.9, 0.11.10, 0.11.11, 0.11.12, 0.11.13, 0.11.14, 0.11.15, 0.11.16, 0.11.17, 0.11.18, 0.11.20)
ERROR: No matching distribution found for imjoy>=0.11.38
Compute time on our HPC ranges from 1-30 minutes, depending on the number of spots.
Any general advice on how to speed this up? I've split up the analysis using the multiprocessing module, but returns are rather limited.
Is there GPU implementation for bigfish?
Data and computer specs
2047x2047x20 px
100-10000 spots per image
(Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz) x 30 processors
115GB RAM
Also opening a separate issue to discuss the possibility of analyzing images by cell instead of whole frame, and analyzing super-expressing cells in analogue mode (or ignoring these cells to process separately).
Cheers,
Josh
How can I do intensity analysis of all the FISH spots, or a selected group of those spots?
Sorry for my french...
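For a basic version of this, spot intensities can be read straight off the image with fancy indexing, assuming a (z, y, x) image and an (n, 3) spot array in zyx order as returned by big-fish detection:

```python
import numpy as np

# simulate an image and two detected spot coordinates (zyx order)
rng = np.random.default_rng(0)
image = rng.integers(100, 1000, size=(10, 64, 64), dtype=np.uint16)
spots = np.array([[2, 10, 10], [5, 30, 30]])

# intensity of each spot at its detected coordinate
intensities = image[spots[:, 0], spots[:, 1], spots[:, 2]]
print(intensities.shape)  # (2,)
```

A selected group of spots can be analysed the same way by boolean-masking the spots array before indexing.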
Integrate code coverage in the repository.
The current function reads header-less csv files. To increase its flexibility, it would be nice to have an option to specify how many header rows a file has.
https://github.com/fish-quant/big-fish/blob/master/bigfish/stack/io.py#L118
This could be done with the skiprows option of the underlying np.loadtxt: https://numpy.org/doc/1.20/reference/generated/numpy.loadtxt.html
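For illustration, a file with one header row and ";"-separated columns could be read like this (column names and values are made up):

```python
import io
import numpy as np

# a spot csv with one header row, columns separated by ";"
csv_text = "z;y;x\n3;120;45\n7;88;200\n"
coords = np.loadtxt(io.StringIO(csv_text), delimiter=";", skiprows=1, dtype=np.int64)
print(coords.shape)  # (2, 3)
```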
Hi, thanks for creating this great tool. I am really enjoying it.
I ran into some unusual behavior with one of my images at the decompose_cluster() step. I think it may be related to having a couple bright pixels at the very edge of the image. After decomposition, the number of spots assigned to z-slice 40 goes from 5 to 469, even though there obviously aren't many spots there. This region is then called as a focus in the detect_foci() step and ends up having >3-fold more RNAs assigned to it than any of the other foci (despite there being no visible focus in that region of the original image).
Here I’ve shown the original spots detected and I also plotted the spots in xyz with different foci shown in different colors. The strange focus with 469 ‘invisible’ RNAs is circled in red. Please let me know if there’s anything you think I should do to troubleshoot this more or if you have any ideas what might be causing it.
I put a notebook showing the analysis and the input image here:
https://github.com/marykthompson/temp_images
Thanks,
MK
Implement a 3D version of the feature.
Hi, is there a way to change the colour scheme for the plots generated in Big-FISH? The heat maps currently generated aren't ideal for my purposes.
Sachin
The current function reads csv files with a header and columns separated by ";".
It might be interesting to add some more flexibility.
Hi,
Thanks for the useful tool!
I'm now analyzing our own data with Big-FISH. When segmenting the nuclei and cytoplasm, I failed to see the nucleus in the picture, and I also failed to generate a line labeling the boundary as the example shows; instead, I saw a dotted line. This seems weird to me. One difference from the example I found is that the dtype of my nuc_label is int32 instead of int64, so I changed it with astype for the watershed and the plot. Do you have any idea why I ran into this problem and how to solve it?
More about the result: attached are pictures of the nucleus segmentation and the boundary segmentation.
and here are the codes I used :
nuc_mask = segmentation.thresholding(dapi_2d, threshold=6)
nuc_mask = segmentation.clean_segmentation(nuc_mask, small_object_size=50, fill_holes=True)
nuc_label = segmentation.label_instances(nuc_mask)
# apply watershed
cell_label = segmentation.cell_watershed(posS_2d, nuc_label.astype(np.int64), threshold=0.00001, alpha=0.00001)
plot.plot_segmentation(dapi_2d, nuc_label.astype(np.uint16), rescale=True, framesize=(15, 5))
Many thanks if you have any suggestions!
Iris
Is there an integrated density method to estimate the # of spots in foci, similar to FQ in Matlab?
I see nb_rna_in_foci = foci_coord[:, ndim].sum() in the features.foci_features() function. My understanding is that it works by counting the number of spots in the foci ROI. Is that correct, and is that the only method available?
Sorry if I am overlooking something obvious.
Thanks,
j
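For reference, the counting described above can be reproduced directly from the foci array: for 3-d data each row is (z, y, x, nb_spots, focus index), so column ndim holds the spot count per focus. The values below are made up:

```python
import numpy as np

ndim = 3  # 3-d data: zyx coordinates
foci_coord = np.array([
    [5, 10, 10, 4, 0],   # z, y, x, nb_spots, focus index
    [7, 40, 40, 6, 1],
])
nb_rna_in_foci = foci_coord[:, ndim].sum()
print(nb_rna_in_foci)  # 10
```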
I accidentally entered mismatched values for the voxel size and PSF parameters in detection.detect_spots, resulting in no spots detected. The error message I got was the default:
index -1 is out of bounds for axis 0 with size 0
I think it might be helpful to tell the user that no spots were detected, and maybe they should check the input.
Values entered (for my image of simulated points):
voxel_size_yx = 1
voxel_size_z = 1
psf_yx = 4
psf_z = 4
Otherwise, it's a great package! Very helpful.
Thanks!
Add a function to display the L-curve spot-threshold.
Is this possible?
Hello, I noticed that some elements in the 'sharpness measure' computed with stack.compute_focus were below 1, which cannot happen by its definition. I tested the original code in bigfish/stack/quality.py and realised that ratio_1 is overwritten after ratio_2 = np.divide(), resulting in elements < 1 in ratio_1 where the original image > filtered image. I guess this is because ratio_1 shares memory with ratio_default, which is overwritten when ratio_2 is assigned with np.divide(..., out=ratio_default). One simple workaround would be to use two distinct ratio_default buffers, one for ratio_1 and one for ratio_2.
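One possible shape of that workaround, sketched with two distinct output buffers (a simplified illustration, not the actual quality.py code):

```python
import numpy as np

def focus_ratio(image, image_filtered):
    """Elementwise ratio >= 1 between an image and its filtered version."""
    image = image.astype(np.float64)
    image_filtered = image_filtered.astype(np.float64)
    # one distinct output buffer per ratio, so neither divide overwrites the other
    out_1 = np.ones_like(image)
    out_2 = np.ones_like(image)
    ratio_1 = np.divide(image, image_filtered, out=out_1, where=image_filtered > 0)
    ratio_2 = np.divide(image_filtered, image, out=out_2, where=image > 0)
    # keep, per pixel, whichever ratio is >= 1
    return np.maximum(ratio_1, ratio_2)

measure = focus_ratio(np.array([[2., 1.]]), np.array([[1., 2.]]))
print(measure)  # [[2. 2.]]
```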
Refactor bigfish.deep_learning and update examples with pre-trained deep learning models.
Hello, I have been using the cellpose algorithm for cell and nuclei segmentation. The segmentation appears to work well; however, when I proceed to perform cell information extraction, it severely underrepresents the number of cells present. For instance, for the image provided, it says only 12 cells are detected.
Refactor bigfish.classification and add examples in a new notebook.
Currently, we don't report if an RNA is in the cytoplasm or nucleus. However, this information might be useful for different biological questions.
Add parameters to control number of spots decomposed per cluster and the number of clusters detected.
Sphinx Sphinx Sphinx !!!
Using big-fish 0.6.1 and running the following code from your example notebook
spots, threshold = detection.detect_spots(
    images=rna,
    return_threshold=True,
    voxel_size=(300, 103, 103),   # in nanometer (one value per dimension zyx)
    spot_radius=(300, 150, 150))  # in nanometer (one value per dimension zyx)
Generates the following error:
TypeError Traceback (most recent call last)
File ~\anaconda3\envs\fish\lib\site-packages\bigfish\detection\spot_detection.py:516, in automated_threshold_setting(image, mask_local_max)
514 # select threshold where the break of the distribution is located
515 if count_spots.size > 0:
--> 516 optimal_threshold, _, _ = get_breaking_point(thresholds, count_spots)
518 # case where no spots were detected
519 else:
520 optimal_threshold = None
File ~\anaconda3\envs\fish\lib\site-packages\bigfish\detection\utils.py:668, in get_breaking_point(x, y)
645 """Select the x-axis value where a L-curve has a kink.
646
647 Assuming a L-curve from A to B, the 'breaking_point' is the more distant
(...)
665
666 """
667 # check parameters
--> 668 stack.check_array(x, ndim=1, dtype=[np.float64, np.int64])
669 stack.check_array(y, ndim=1, dtype=[np.float64, np.int64])
671 # select threshold where curve break
File ~\anaconda3\envs\fish\lib\site-packages\bigfish\stack\utils.py:135, in check_array(array, ndim, dtype, allow_nan)
133 # check the dtype
134 if dtype is not None:
--> 135 _check_dtype_array(array, dtype)
137 # check the number of dimension
138 if ndim is not None:
File ~\anaconda3\envs\fish\lib\site-packages\bigfish\stack\utils.py:172, in _check_dtype_array(array, dtype)
169 break
171 if error:
--> 172 raise TypeError("{0} is not supported yet. Use one of those dtypes "
173 "instead: {1}.".format(array.dtype, dtype))
TypeError: int32 is not supported yet. Use one of those dtypes instead: [<class 'numpy.float64'>, <class 'numpy.int64'>].
The error was suppressed by changing line 546 of spot_detection.py
from: thresholds = np.array(thresholds)
to: thresholds = np.array(thresholds, dtype=np.int64)
Hi,
How would I go about calculating the standard deviation for the PSF required by the automated spot detection function as input?
Sachin
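For what it's worth, a common route is the Gaussian approximation of the PSF (Zhang et al., 2007); the constants below are the ones FISH-quant-style tools typically use for widefield imaging, but they depend on the modality, so treat them as an assumption and check them against your optics. Wavelengths are in nanometers:

```python
def psf_sigma_nm(emission_wavelength, numerical_aperture, refractive_index):
    """Approximate widefield PSF standard deviations (z, yx) in nanometers."""
    sigma_yx = 0.225 * emission_wavelength / numerical_aperture
    sigma_z = 0.78 * refractive_index * emission_wavelength / numerical_aperture ** 2
    return sigma_z, sigma_yx

# e.g. Cy3 emission ~570 nm with a 1.4 NA oil objective (n = 1.518)
sigma_z, sigma_yx = psf_sigma_nm(570, 1.4, 1.518)
print(round(sigma_z), round(sigma_yx))  # 344 92
```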
A package which is marked as stable should not pin exact versions of its requirements.
A mature Python project does not break backward compatibility until the major version number increases: any code that works with matplotlib 3.0.2 should work with any 3.x.x release, so the marker should be >=3.0.2,<4.
Other packages are also pinned to old versions. This forces users to create a separate environment to use big-fish, or to edit the sources to change == to >=.
Only pandas may be problematic, because a few things broke in the bump to the 1.x.x line. But that line also introduced many improvements, and forcing the use of 0.24.0 may be too big a price to pay to use big-fish in many projects (for example, numpy 1.16.0 cannot simply be installed with Python 3.8 and 3.9).
GitHub provides an amazing tool for open-source projects called GitHub Actions, which allows tests to be run automatically (even daily) to surface any compatibility problem.
The current implementation of the SNR calculation is very slow for full-size images with many spots.
Case example: images were 2024x2024x30 px with 10K spots; the SNR computation had not finished after 1 h.
A more brute-force method looping over all spots was substantially faster (see below). Maybe we can add this approach as an alternative and let the user decide which implementation to use?
Also, I think it might be informative to additionally return the values for signal, noise, and background. These might be useful for users to judge other aspects of their smFISH signal quality.
def compute_snr_per_spot_3d(image, spot, crop_spot, w_bgd):
    """Extract a 3-d volume around a spot. From this volume, the
    signal (I) is determined as the maximum pixel intensity. The outermost
    layer(s) are used to calculate the background (B) as the mean intensity,
    and the noise (N) as the standard deviation. The signal-to-noise ratio
    (SNR) is then calculated as SNR = (I - B) / N for each spot.

    Parameters
    ----------
    image : np.ndarray
        Image with shape (z, y, x).
    spot : np.ndarray, np.int64
        Coordinate of a spot, with shape (3,). One coordinate per dimension
        (zyx coordinates).
    crop_spot : Tuple, np.int64
        Size of the cropping area to which the analysis is restricted,
        defined around each detected spot, one scalar per dimension (z, y, x).
    w_bgd : Tuple, np.int64
        Size of the outer pixel layer of the cropped volume used to
        determine the background, one scalar per dimension (z, y, x).

    Returns
    -------
    snr : float
        Signal-to-noise ratio of the spot.
    signal : float
        Maximum spot intensity.
    background : float
        Mean of the background.
    noise : float
        Standard deviation of the background.
    """
    # get spot coordinates
    spot_z, spot_y, spot_x = spot

    # get spot and background radii
    crop_z, crop_y, crop_x = crop_spot
    w_bgd_z, w_bgd_y, w_bgd_x = w_bgd

    # crop a volume around the spot, clipped to the image borders
    z_min = max(0, int(spot_z - crop_z))
    z_max = min(image.shape[0], int(spot_z + crop_z))
    y_min = max(0, int(spot_y - crop_y))
    y_max = min(image.shape[1], int(spot_y + crop_y))
    x_min = max(0, int(spot_x - crop_x))
    x_max = min(image.shape[2], int(spot_x + crop_x))
    image_crop = image[z_min:z_max + 1,
                       y_min:y_max + 1,
                       x_min:x_max + 1].copy().astype(np.float64)

    # measure the signal as the maximum value in this volume
    signal = image_crop.max()

    # extract the background as the outermost layers of the cropped volume
    image_crop[w_bgd_z:-w_bgd_z, w_bgd_y:-w_bgd_y, w_bgd_x:-w_bgd_x] = -1.
    image_background = image_crop[image_crop >= 0]  # >= 0 keeps zero-valued pixels

    # compute background and noise
    background = image_background.mean()
    noise = max(image_background.std(), 1e-5)

    # compute SNR
    snr = max(0, signal - background) / noise

    return snr, signal, background, noise
Add cell label while extracting cell level results.
ModuleNotFoundError Traceback (most recent call last)
/var/folders/45/8191dpy56n373_4kt1cgwb2w0000gn/T/ipykernel_35809/740557443.py in
2 import bigfish
3 import bigfish.stack as stack
----> 4 import bigfish.multistack as multistack
5 print("Big-FISH version: {0}".format(bigfish.__version__))
ModuleNotFoundError: No module named 'bigfish.multistack'
Integrate continuous integration in the repository.
When running check_input_data(path_input) on Windows, the example data is not downloaded.
The following error is shown
downloading experience_1_dapi_fov_1.tif...
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
in
4
5 # check input images are loaded
----> 6 check_input_data(path_input)
~\Anaconda3\envs\fq-imjoy\lib\site-packages\data\utils.py in check_input_data(input_directory)
56 stack.load_and_save_url(url_input_dapi,
57 input_directory,
---> 58 filename_input_dapi)
59 stack.check_hash(path, hash_input_dapi)
60
~\Anaconda3\envs\fq-imjoy\lib\site-packages\bigfish\stack\utils.py in load_and_save_url(remote_url, directory, filename)
676
677 # download and save data
--> 678 urlretrieve(remote_url, path)
679
680 return
~\Anaconda3\envs\fq-imjoy\lib\urllib\request.py in urlretrieve(url, filename, reporthook, data)
255 # Handle temporary file setup.
256 if filename:
--> 257 tfp = open(filename, 'wb')
258 else:
259 tfp = tempfile.NamedTemporaryFile(delete=False)
FileNotFoundError: [Errno 2] No such file or directory: 'D:\\Work\\Documents\\data-test\\fish-quant\\big-fish\\data\\input\\experience_1_dapi_fov_1.tif'
Hello,
I'm trying to use the automated threshold detection feature of bigfish. However, I find that the threshold is consistently estimated too high, leading to a lot of missed spots.
When looking at the elbow plot, you can clearly see the elbow to the left of the set threshold, and the selected threshold does not actually lie on top of any bend in the line. Any idea what could be the reason for this and how to improve the threshold setting?
Thanks for your help!