
spacetx-research's Issues

things to test

possible issues:

  1. The GP prior is quite far from the posterior (I see that in KL_LOGIT being large...)
  2. The encoder/decoder are too simple (increase zdim and maybe normalize by its running average)
  3. Use different betas for Adam since the batch_size is small

Sparsity is in range but the bounding boxes (BB) are not tight.
Maybe when increasing KL I should also increase SPARSITY.

Maybe the sparsity loss should be multiplied by both the balance term AND the sparsity term.

memory efficient implementation with many instances

Right now big_mask and big_imgs have the spatial extent of the original image plus an extra dimension for the number of instances. This will kill the memory for large images containing many cells. Is there a way to get around this? It is interesting to note that if only the combination sum_j p_j m_j enters the computation, then I can drastically reduce the memory usage. What about imgs? Is the same thing true, i.e. sum_j p_j img_j?
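A minimal sketch of the idea, assuming instances can be decoded one at a time by a hypothetical decode_instance helper: accumulate the weighted sum incrementally so the instance dimension is never materialized.

import torch

def mixing_without_instance_dim(p_list, decode_instance, height, width):
    # running sum of p_j * m_j; only one (height, width) buffer is kept in memory
    mixed = torch.zeros(height, width)
    for j, p_j in enumerate(p_list):
        m_j = decode_instance(j)   # hypothetical: returns the (height, width) mask of instance j
        mixed += p_j * m_j
    return mixed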

background of arbitrary complexity

You should be able to specify the complexity of the background and/or whether to use it at all.
Sometimes the background is so complex that there is no signal left for the cells.

robustness to different input sizes

need to adjust the initial value of length_scale_similarity based on the intercellular distance
need to initialize lambda (the parameter used to keep a running average of kl_logit) in some reasonable way
use only adaptive_avg_pool2d or adaptive_max_pool2d (see the sketch below)
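A minimal sketch, with made-up tensor sizes, of why adaptive pooling helps: it maps feature maps of any spatial size to a fixed output size, so downstream layers never see the input size.

import torch
import torch.nn.functional as F

# feature maps of different spatial sizes (made-up shapes) ...
x_small = torch.randn(1, 64, 40, 40)
x_large = torch.randn(1, 64, 97, 123)
# ... are both reduced to a fixed 8x8 grid
print(F.adaptive_avg_pool2d(x_small, output_size=(8, 8)).shape)  # torch.Size([1, 64, 8, 8])
print(F.adaptive_max_pool2d(x_large, output_size=(8, 8)).shape)  # torch.Size([1, 64, 8, 8])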

MEMO OF THINGS TO DO

  1. produce one large segmentation of smFISH_OLEH and VISIUM
  2. check batch_norm in UNET (currently it is not there)
  3. reflection padding in UNET (currently it is not there), or no padding and then prediction on a smaller region?
  4. the graph is not a K_NN graph. Is that ok? Optimize the radius. It seems that larger is better (i.e. 5 is better than 2). To evaluate this systematically you need to make plots of N_OBJECTS vs RESOLUTION parameters. Hopefully for a large radius we will see a plateau.
  5. is greedy modularity optimization the thing we are interested in? TIM suggests: If you aren't committed to greedy modularity maximization, one of the fastest libraries that will get you community detection (using Stochastic Block Models) is graph-tool (https://graph-tool.skewed.de/). It's C++ underneath (using Boost I believe), so it is very fast. The tradeoff is that it can be a huge pain in the ass to install, though I have heard it has recently been simplified.
  6. the graph is partitioned into disconnected components. Is there an advantage in treating each connected component separately? Is community detection faster? Can I use the same resolution parameter for all the different disconnected components? (See the sketch after this list.)
  7. loss function optimization. It seems that the best loss function was the one in
    folder: /home/jupyter/REPOS/spacetx-research/NEW_ARCHIVE/merfish_june22_v2
    commit 39d6bf2
    Change the master implementation back to that one. Try to understand the differences.
  8. can I reduce the operations for the creation of the graph to 1/4 by using roller2d on just one quadrant?
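A minimal sketch for item 6, assuming the cell graph is available as a networkx Graph G and a networkx version that supports the resolution argument: community detection run on each connected component separately.

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def communities_per_component(G: nx.Graph, resolution: float = 1.0):
    all_communities = []
    for nodes in nx.connected_components(G):
        # greedy modularity maximization on one connected component at a time
        subgraph = G.subgraph(nodes)
        all_communities.extend(greedy_modularity_communities(subgraph, resolution=resolution))
    return all_communities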

work with partial annotation

If a few images are partially annotated, then you can do supervised learning.
I would think that, given an integer_mask_annotation:

  1. compute target bounding boxes, centroids and widths/heights (using skimage; see the sketch below)
  2. identify which voxel is responsible for each target bounding box.
  3. add a regression loss between the target bounding box and the inferred bounding box (i.e. tx_map, ty_map, tw_map, th_map which are all in (0,1)). Note that only a few voxels will be "labelled", therefore the regression loss should be "masked".
  4. All locations inside the target bounding box should have a loss between p_map and the target probability. The target probability is 1 at the center of the bounding box and zero at all other locations of the bounding box, i.e. the probability is both pushed up (at the center) and down (at the periphery).
  5. identify the bb with the largest IoU with the target bounding box. For that bb put a cross-entropy classification loss between the inferred and target mask.

Note:
For most images there will not be any annotation, and even when an annotation is present it is only partial. Therefore the code needs to be written in such a way that this labelled loss defaults to zero in most cases.
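A minimal sketch of step 1, assuming integer_mask_annotation is a 2D integer label image with 0 as background:

import numpy as np
from skimage.measure import regionprops

def targets_from_annotation(integer_mask_annotation: np.ndarray):
    # one target per annotated instance: bounding box, centroid and width/height
    targets = []
    for region in regionprops(integer_mask_annotation):
        min_row, min_col, max_row, max_col = region.bbox
        targets.append({
            "bbox": (min_row, min_col, max_row, max_col),
            "centroid": region.centroid,        # (row, col)
            "width": max_col - min_col,
            "height": max_row - min_row,
        })
    return targets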

analyze real datasets

Tommaso Biancalani and Alma Andersson to check the Visium data

AGENDA (5/8/20)

  1. Complete review of outline of manuscript
    ---(Updated for accuracy)
  2. Discussion of how to integrate Visium/slide-seq
  3. Review of other tasks
    ---Additional annotations for final data sets?
    ---Segmentation group status?
    ---Visualization demo (this week or next week?)

    From Eeshit Dhaval Vaishnav to Me: (Privately) (2:03 PM)
    
aah nice to see the lake again

    From Richard Scheuermann to Everyone: (2:05 PM)
    
Has the segmentation been finalized?

    From Eeshit Dhaval Vaishnav to Everyone: (2:40 PM)
    
For comparing segmentation results, the F1 score, Dice index and Hausdorff distance would be good metrics. (I have used them before and have code for computing each, lmk if that is helpful during the segmentation comparison stage.)

    From Me to Everyone: (2:48 PM)
    
What is the required input? Share the script for the comparison.
Thanks!
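A minimal sketch, not the script mentioned in the chat, of one of these metrics: the Dice index between two binary segmentation masks of the same shape.

import numpy as np

def dice_index(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    a = mask_a.astype(bool)
    b = mask_b.astype(bool)
    intersection = np.logical_and(a, b).sum()
    total = a.sum() + b.sum()
    # convention: two empty masks are considered identical
    return 2.0 * intersection / total if total > 0 else 1.0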

Improve GECO

What is the range of allowed values for the GECO hyper-parameters: (0, +infinity)?
The change of the hyper-parameters should be proportional to the distance to the target (see the sketch below).
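A minimal sketch of that idea, with hypothetical names (not the repo's GECO implementation): updating lambda in log-space keeps it in (0, +infinity), and the step is proportional to how far the constraint is from its target.

import torch

def update_log_lambda(log_lambda: torch.Tensor, constraint: torch.Tensor,
                      target: float, step_size: float = 0.01) -> torch.Tensor:
    # constraint above target -> lambda grows; below target -> lambda shrinks,
    # in both cases proportionally to the distance from the target
    return log_lambda + step_size * (constraint.detach() - target)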

strategy to deal with high-resolution images

the feature map can be taken from some lower level.
the unet can go down all the way to 1x1 so that I can extract the background easily.
the sliding_window can be smaller so that only 3-4 cells are in it (reducing N_BOX will be computationally convenient)
the unet can use dilation to deal with large images.

to do october

  1. the geco parameters trajectory should save only the image, not the chart, since the chart does not work
  2. run a simulation using Cromwell
  3. merge to master
  4. make a branch of master called experiment
  5. start experimenting (remember to save the source code in Neptune)
  6. visualize chart comparison in Neptune (learn). Coordinate plot

WHAT I AM LEARNING

  1. An informative latent space (with clusters) is antithetical to a generator which takes N(0,1) as input, since that requires a structureless latent space.

  2. sigma should be chosen so that the reconstruction term is of order 1 (and therefore balanced with the rest of the terms). A simple way to do it is: sigma2 = (x - x.mean()).pow(2).mean()

  3. At that point, all lambda terms can be between 0 and 5.

  4. RECONSTRUCTION IS ALWAYS ON. If in range do not change lambda. If out of range change lambda up or down. Lambda is clamped to [0.1, 10].

  5. SPARSITY IS ALWAYS ON. If in range do nothing. If out of range change lambda up or down. Lambda is clamped to [-10, 10]. The negative part is to get out of the empty solution if necessary.

  6. The user should provide a fg_mask, which can be easily obtained by Otsu or other thresholding methods.

Overlap immediately pushes the fg_fraction to zero. That makes sense since at the beginning y_k < 0.5 and y_k(1-y_k) is minimized by pushing all y_k to zero. Is there any incentive to learn non-overlapping instances (via KL) if there is no overlap?
I should reintroduce overlap as computed in terms of no self-interaction.

  • READ PAPERS ABOUT HOW THEY DO DYNAMICAL REGULARIZATION
  • I COULD CROP THE FEATURE MAP AT THE LEVEL OF THE PGRID B/C THE POINT IS THAT THE INTERACTION IS DISCOVERED AT THE COARSER LEVEL (SIMILAR TO MASK R-CNN)
  • power of methods would come from:
    --> combining dots (like Baysor)
    --> graph consensus
  • BACKGROUND LATENT CODE CAN BE 5x5. That way I can probably describe the spreading I see in MERFISH

If reconstruction is in range do nothing. When parameters are in range I should not change them, i.e. change g = min(x - x_min, x_max - x) to g = min(x - x_min, x_max - x).clamp(max=0)
4. sparsity should always be on.

OLD:
3. if reconstruction is high it overcomes the sparsity term and the overlap term -> therefore lambda_rec needs to multiply everything

conditional dependence between z_what and z_mask

Right now in the generative model we draw z_what and z_mask independently from each other. This is clearly bad. We should have conditional dependence between z_what and z_mask.

IN THE PRIOR:

  1. z_mask ~ N(0,1)
  2. mu, sigma = MLP(z_mask)
  3. z_what ~ N(mu,sigma)
    In this way we achieve conditional dependence:
    p_prior(z_what, z_mask) = p_prior(z_what | z_mask) p_prior(z_mask)

IN THE POSTERIOR

  1. decode z_mask to mask
  2. use mask to crop the raw image
  3. encode the masked image in z_what
    In this way we achieve p_posterior(z_what, z_mask) = p_posterior(z_what | z_mask) p_posterior(z_mask)
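A minimal sketch of the prior side described above, with hypothetical dimensions: an MLP maps z_mask to the mean and scale of z_what.

import torch
import torch.nn as nn

class ConditionalPrior(nn.Module):
    # p_prior(z_what, z_mask) = p_prior(z_what | z_mask) p_prior(z_mask)
    def __init__(self, dim_mask: int = 8, dim_what: int = 16, hidden: int = 64):
        super().__init__()
        self.dim_mask = dim_mask
        self.mlp = nn.Sequential(nn.Linear(dim_mask, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * dim_what))

    def sample(self, n: int):
        z_mask = torch.randn(n, self.dim_mask)                 # z_mask ~ N(0, 1)
        mu, log_sigma = self.mlp(z_mask).chunk(2, dim=-1)      # mu, sigma = MLP(z_mask)
        z_what = mu + log_sigma.exp() * torch.randn_like(mu)   # z_what ~ N(mu, sigma)
        return z_what, z_mask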

increase size of the cropped region

Right now the cropped region is 28x28 which might be too small for segmentation.
Maybe high resolution is necessary for good reconstruction (which we do not care about) but it is not necessary for segmentation (which we care about)

IDEAS TO CHECK

  1. use a higher resolution encoder/decoder (see commit: de168fa)
  2. crop the image directly instead of the feature map?

work in 3D

Asking the model to segment objects starting from images is a crazy request.
It should not be possible.
It is only possible if you have a richer dataset, such as:

  1. movie
  2. the same scene from different points of view
    By the way, predicting one z-slice from a different z-slice might be a good approach

This means that we need to work in 3D

generate large image?

Is there a way to generate a large image, or should we always work with small patches and glue them together? A large image is nicer b/c the generated pattern would show realistic variation and no boundary effects.

to do tomorrow

Merge 60 into 55 and then 55 into master

USE THE DATALOADER I HAVE.
JUST SAVE ON CPU AND LOAD TO GPU EVERY BATCH
put the result of the tiling function on CPU if necessary

LOAD CKPT FROM HERE AND USE THE PRETRAINED MODEL TO OBTAIN A GOOD SEGMENTATION:
ld-results-bucket/merfish_june25_v7

graphclustering 65
#TODO: Compute median density of connected components so that resolution parameter is about 1
self.reference_density = AUCH
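A minimal sketch of what this TODO could look like, assuming the graph is a networkx Graph and using nx.density as the (hypothetical) density definition:

import networkx as nx
import numpy as np

def median_component_density(G: nx.Graph) -> float:
    # density of each connected component, then the median across components
    densities = [nx.density(G.subgraph(nodes)) for nodes in nx.connected_components(G)]
    return float(np.median(densities))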

NAMEDTUPLE 151

#TODO: this might be too slow. Eliminate torch.bincount.
new_dict = dict(self.params)  # copy, so the original params are not mutated
new_dict["filter_by_size"] = (min_size, max_size)
new_membership = old_2_new[self.membership]
return self._replace(membership=new_membership, params=new_dict, sizes=torch.bincount(new_membership))

problem at TEST time

there is a problem at TEST time.
It is probably related to batch normalization.
I don't like BN since it has different behavior at train and test time.
Double check the PyTorch train/eval settings.

(Screenshot attached: 2020-06-15, 7:28 PM)

new graphical model

implement new graphical model where:

  1. in the generator z_what is conditioned on z_mask

KL_logit

  • attach KL_logit to KL_total

  • Make sure that the scale of kl_logit is not too big

  • Should I learn the length scale of the gaussian kernel? Probably yes

Prior for the probability is wrong

I have observed that:

  1. even after long training, the probability map generated by my prior (with learnable parameters) is very different from the probability map inferred from the data.
  2. this leads to KL(posterior || prior) being very large
  3. moreover, the learnable parameters of the prior change very slowly.

For all these reasons, I know that my prior is wrong, i.e. it is not flexible enough to capture the data.

First approach:

  1. have a more general kernel, K = k1 + k2 + k3
  2. logit = GP(K)
  3. p = sigmoid(a x + b)

Even in this approach I should decide whether to use a straight-through sigmoid or not.
The advantage of this approach is that the KL between the Gaussian posterior and the logit prior is analytically known. (See the sketch below.)
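A minimal sketch of this first approach, with a hypothetical parametrization: learnable length scales for a sum of RBF kernels, plus learnable a and b.

import torch
import torch.nn as nn

class SumKernelLogitPrior(nn.Module):
    def __init__(self, n_kernels: int = 3):
        super().__init__()
        self.log_length_scales = nn.Parameter(torch.zeros(n_kernels))
        self.a = nn.Parameter(torch.ones(1))
        self.b = nn.Parameter(torch.zeros(1))

    def kernel(self, coords: torch.Tensor) -> torch.Tensor:
        # coords: (n, 2) grid locations; K = k1 + k2 + k3 with different length scales
        d2 = torch.cdist(coords, coords).pow(2)
        K = sum(torch.exp(-0.5 * d2 / ls.exp().pow(2)) for ls in self.log_length_scales)
        return K + 1e-4 * torch.eye(coords.shape[0])   # jitter for numerical stability

    def sample_p(self, coords: torch.Tensor) -> torch.Tensor:
        L = torch.linalg.cholesky(self.kernel(coords))
        logit = (L @ torch.randn(coords.shape[0], 1)).squeeze(-1)   # logit = GP(K)
        return torch.sigmoid(self.a * logit + self.b)               # p = sigmoid(a x + b)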

Second approach -> DPP:

  1. The inference gives me: p = sigmoid(x)

  2. Do a straight-through Bernoulli, i.e. c = 0, 1 but the gradient will see the probability.

  3. The prior is a DPP, i.e. use a general kernel K = k1 + k2 + k3 which will be my covariance matrix

  4. compute log_p as log[ det(K_c) / det(K + I) ] (see the sketch below)

  5. Probably at each time step you need to do a Cholesky decomposition

  6. use the DPP. To compute the KL divergence I just need to compute log_P =
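A minimal sketch of item 4, treating K as an L-ensemble kernel (hypothetical helper; c is a binary vector selecting the "on" cells):

import torch

def dpp_log_prob(K: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
    # log p(c) = log det(K_c) - log det(K + I), with log-dets computed via Cholesky
    idx = c.bool()
    K_c = K[idx][:, idx]
    log_det_Kc = 2.0 * torch.linalg.cholesky(
        K_c + 1e-6 * torch.eye(K_c.shape[0], device=K.device)).diagonal().log().sum()
    log_det_KI = 2.0 * torch.linalg.cholesky(
        K + torch.eye(K.shape[0], device=K.device)).diagonal().log().sum()
    return log_det_Kc - log_det_KI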

to do when coming back from vacation

I have to:

  1. Increase the encoder from 28x28 to 56x56
  2. encode the background in a small latent variable
  3. change factor_balance_range: [0.1, 0.8, 0.9]
  4. Monitor the 3 different terms in sparsity
  5. Since I have sparsity (which constrains the number of pixels) maybe I can just sum the KL terms together without rescaling?!

  1. the real deal is to use multi-objective optimization instead of GECO
  2. test that the encoder/decoder are powerful enough. For that I can create a dataset of isolated cells

  1. use Neptune
  2. reduce the channels after the UNET to 2 (one should be the original image, the other something else)

data loader from fast.ai

I need a dataloader from fast.ai with all the nice things like
from_folder, data_augmentation, transform_y, etc.

FIX MOVIES in MAIN

Fix movies. Do not use
import moviepy.editor as mpy
use
from matplotlib import animation
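A minimal sketch of building the movie with matplotlib.animation instead of moviepy (assuming frames is an array of shape (T, H, W) and ffmpeg is available for saving):

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import animation

def save_movie(frames: np.ndarray, path: str = "movie.mp4", fps: int = 10):
    fig, ax = plt.subplots()
    im = ax.imshow(frames[0], cmap="gray")
    ax.axis("off")

    def update(t):
        im.set_data(frames[t])   # update the image data for frame t
        return (im,)

    ani = animation.FuncAnimation(fig, update, frames=frames.shape[0], blit=True)
    ani.save(path, fps=fps)
    plt.close(fig)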

tiling to analyze large images

def tiling(img, crop_w, crop_h, stride_w, stride_h):
    # "crops" iterates over sliding-window crops of img (size crop_w x crop_h, strides stride_w, stride_h)
    n_obj = 0
    integer_segmentation_mask = torch.zeros_like(img)
    vae.eval()
    for crop in crops:
        out = vae.forward(crop)
        segmentation = out.inference.integer_segmentation_mask

For each crop this yields an integer_segmentation_mask which:

  1. is censored outside the region that does not suffer from the boundary effect (how do I do this?)
  2. is shifted by the number of instances already found, i.e. mask = torch.where(mask > 0, mask + shift, 0)
  3. is pasted in the right place (see the sketch below)
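A minimal sketch of steps 1-3 for a single crop (hypothetical helper; assumes the crop's top-left corner (i, j) in the global image is known and a fixed border width is censored):

import torch

def paste_crop(global_mask, crop_mask, i, j, border, n_obj):
    h, w = crop_mask.shape
    # 1. censor the outer border, where boundary effects occur
    inner = crop_mask[border:h - border, border:w - border]
    # 2. shift instance ids by the number of instances already found
    shifted = torch.where(inner > 0, inner + n_obj, torch.zeros_like(inner))
    # 3. paste in the right place, overwriting only where a new instance was found
    target = global_mask[i + border:i + h - border, j + border:j + w - border]
    global_mask[i + border:i + h - border, j + border:j + w - border] = torch.where(
        shifted > 0, shifted, target)
    return n_obj + int(inner.max().item())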

poisson-gaussian observation model

In fluorescent microscopy the observation model should be Poisson-Gaussian: the brighter a pixel, the more photons, and the more Poisson noise. Things get complicated b/c CCD cameras have offsets. Probably the parameters of the observation model (constant term, linear term and offset) should be learned. Ask Mehrtash and Tianle.
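A minimal sketch of such an observation model, under the common Gaussian approximation where the variance grows linearly with the signal (my assumption, not the repo's implementation); the gain, read-noise and offset are learnable.

import torch
import torch.nn as nn

class PoissonGaussianNLL(nn.Module):
    def __init__(self):
        super().__init__()
        self.log_gain = nn.Parameter(torch.zeros(1))      # photon gain (linear term)
        self.log_read_var = nn.Parameter(torch.zeros(1))  # read-noise variance (constant term)
        self.offset = nn.Parameter(torch.zeros(1))        # CCD offset

    def forward(self, x_obs: torch.Tensor, mu: torch.Tensor) -> torch.Tensor:
        # variance grows linearly with the (offset-corrected) signal
        var = self.log_gain.exp() * (mu - self.offset).clamp(min=0.0) + self.log_read_var.exp()
        return 0.5 * (((x_obs - mu) ** 2) / var + var.log()).mean()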

Unet from fast AI without boundary effect

The current UNET sucks!
Use a pretrained UNet similar to fast.ai, in which the encoder architecture is resnet34.
Add hooks for the skip connections and to attach the 3 heads.
Keep track of the region without boundary effects (i.e. do not do padding).
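A minimal sketch of the hook part, using a torchvision resnet34 rather than the fast.ai learner itself: forward hooks collect the intermediate feature maps that would feed the skip connections.

import torch
import torchvision

encoder = torchvision.models.resnet34(pretrained=True)
features = {}

def save_hook(name):
    def hook(module, inputs, output):
        features[name] = output   # stash the feature map produced by this layer
    return hook

for name in ["layer1", "layer2", "layer3", "layer4"]:
    getattr(encoder, name).register_forward_hook(save_hook(name))

x = torch.randn(1, 3, 224, 224)
_ = encoder(x)
# features["layer1"] ... features["layer4"] now hold the skip-connection tensors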
