mateuszbuda / brain-segmentation-pytorch

U-Net implementation in PyTorch for FLAIR abnormality segmentation in brain MRI

Home Page: https://mateuszbuda.github.io/2017/12/01/brainseg.html

License: MIT License


brain-segmentation-pytorch's Introduction

U-Net for brain segmentation

U-Net implementation in PyTorch for FLAIR abnormality segmentation in brain MRI, based on the deep learning segmentation algorithm used in "Association of genomic subtypes of lower-grade gliomas with shape features automatically extracted by a deep learning algorithm".

This repository is an all-Python port of the official MATLAB/Keras implementation in brain-segmentation. Weights for trained models are provided and can be used for inference or fine-tuning on a different dataset. If you use code or weights shared in this repository, please consider citing:

@article{buda2019association,
  title={Association of genomic subtypes of lower-grade gliomas with shape features automatically extracted by a deep learning algorithm},
  author={Buda, Mateusz and Saha, Ashirbani and Mazurowski, Maciej A},
  journal={Computers in Biology and Medicine},
  volume={109},
  year={2019},
  publisher={Elsevier},
  doi={10.1016/j.compbiomed.2019.05.002}
}

docker

docker build -t brainseg .

nvidia-docker run --rm --shm-size 8G -it -v `pwd`:/workspace brainseg

PyTorch Hub

Loading model using PyTorch Hub: pytorch.org/hub/mateuszbuda_brain-segmentation-pytorch_unet

import torch
model = torch.hub.load('mateuszbuda/brain-segmentation-pytorch', 'unet',
    in_channels=3, out_channels=1, init_features=32, pretrained=True)

data

The dataset used for development and evaluation was made publicly available on Kaggle: kaggle.com/mateuszbuda/lgg-mri-segmentation. It contains MR images from the TCIA LGG collection with segmentation masks approved by a board-certified radiologist at Duke University.

model

The segmentation model implemented in this repository is U-Net as described in "Association of genomic subtypes of lower-grade gliomas with shape features automatically extracted by a deep learning algorithm", with added batch normalization.

results



Qualitative results for validation cases from three different institutions with DSC of 94%, 91%, and 89%. Green outlines correspond to ground truth and red to model predictions. Images show FLAIR modality after preprocessing.

Distribution of DSC for 10 randomly selected validation cases. The red vertical line corresponds to mean DSC (91%) and the green one to median DSC (92%). Results may be biased since model selection was based on the mean DSC on these validation cases.
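
For readers unfamiliar with the metric, DSC (Dice similarity coefficient) measures the overlap between a predicted mask A and a ground-truth mask B as 2|A∩B| / (|A| + |B|). The following is a minimal NumPy sketch of that formula. The function name, the 0.5 threshold, and the convention for two empty masks are assumptions made for illustration; the repository's own evaluation, which works per patient volume, may differ in detail.

```python
import numpy as np


def dice_coefficient(y_pred, y_true, threshold=0.5):
    """Dice similarity coefficient between a predicted and a reference mask."""
    pred = np.asarray(y_pred) > threshold
    true = np.asarray(y_true) > threshold
    denominator = pred.sum() + true.sum()
    if denominator == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, true).sum() / denominator


# Toy example: 3 predicted pixels, 2 true pixels, 2 shared pixels.
pred = np.array([[0.9, 0.8, 0.1],
                 [0.2, 0.7, 0.0]])
true = np.array([[1, 1, 0],
                 [0, 0, 0]])
print(dice_coefficient(pred, true))  # 2 * 2 / (3 + 2) = 0.8
```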

inference

Download and extract the dataset from Kaggle.
Run the docker container.
Run the inference.py script with specified paths to weights and images. Trained weights for input images of size 256x256 are provided in the ./weights/unet.pt file. For more options and help, run: python3 inference.py --help.

train

Download and extract the dataset from Kaggle.
Run the docker container.
Run the train.py script. The default path to images is ./kaggle_3m. For more options and help, run: python3 train.py --help.

Training can also be run using the Kaggle kernel shared together with the dataset: kaggle.com/mateuszbuda/brain-segmentation-pytorch. Due to memory limitations for Kaggle kernels, input images are of size 224x224 instead of 256x256.

Running this code on a custom dataset would likely require adjustments in dataset.py. Should you need help with this, just open an issue.

TensorRT inference

If you want to run model inference with the TensorRT runtime, here is a blog post from NVIDIA that covers this: Speeding Up Deep Learning Inference Using TensorRT.

brain-segmentation-pytorch's Issues

Error

When I ran the code using the kaggle_3m dataset, the following error occurred after 2 epochs.

Traceback (most recent call last):
  File "D:\UNet_from_github\3_brain-segmentation-pytorch-master\train.py", line 275, in <module>
    main(args)
  File "D:\UNet_from_github\3_brain-segmentation-pytorch-master\train.py", line 112, in main
    dsc_per_volume(
  File "D:\UNet_from_github\3_brain-segmentation-pytorch-master\train.py", line 175, in dsc_per_volume
    dsc_list.append(dsc(y_pred, y_true))
  File "D:\UNet_from_github\3_brain-segmentation-pytorch-master\utils.py", line 11, in dsc
  File "D:\Anaconda3\envs\pytorch\lib\site-packages\medpy\filter\binary.py", line 112, in largest_connected_component
    largest_component_idx = numpy.argmax(component_sizes) + 1
  File "<array_function internals>", line 180, in argmax
  File "D:\Anaconda3\envs\pytorch\lib\site-packages\numpy\core\fromnumeric.py", line 1216, in argmax
    return _wrapfunc(a, 'argmax', axis=axis, out=out, **kwds)
  File "D:\Anaconda3\envs\pytorch\lib\site-packages\numpy\core\fromnumeric.py", line 54, in _wrapfunc
    return _wrapit(obj, method, *args, **kwds)
  File "D:\Anaconda3\envs\pytorch\lib\site-packages\numpy\core\fromnumeric.py", line 43, in _wrapit
    result = getattr(asarray(obj), method)(*args, **kwds)
ValueError: attempt to get argmax of an empty sequence

I cannot find a solution on the Internet. Could you help me?
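
The traceback ends inside a largest-connected-component step, where argmax is called on an empty list of component sizes. That suggests the binarized prediction for some volume contained no foreground at all. One possible workaround, sketched below under that assumption, is a guarded variant that returns an empty mask when nothing is labeled. The function name is hypothetical, and this uses scipy.ndimage directly; it is not the medpy implementation.

```python
import numpy as np
from scipy import ndimage


def largest_component_or_empty(mask):
    """Keep the largest connected component, or return an empty mask if there is none."""
    mask = np.asarray(mask, dtype=bool)
    labeled, num_components = ndimage.label(mask)
    if num_components == 0:
        # Nothing was predicted as foreground: avoid argmax on an empty array.
        return np.zeros_like(mask)
    sizes = np.bincount(labeled.ravel())[1:]  # drop the background count (label 0)
    return labeled == (np.argmax(sizes) + 1)


empty_prediction = np.zeros((4, 4), dtype=bool)
two_blobs = np.array([[1, 0, 0, 0],
                      [0, 0, 1, 1],
                      [0, 0, 1, 0]], dtype=bool)

print(largest_component_or_empty(empty_prediction).any())  # False
print(largest_component_or_empty(two_blobs).sum())         # 3
```

An all-empty prediction after only a couple of epochs may simply mean the model has not yet learned to predict any foreground, so the guard addresses the crash, not the cause.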

Can I run the code on a test folder without the masks?

IoU of the model

Hi, thank you for your great work!
I am doing research and want to compare the IoU of several models on the LGG dataset. Could you please give the IoU result of your model?
Thank you!
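
As background for this question: for a single prediction/ground-truth pair, IoU and DSC are related by IoU = DSC / (2 - DSC), so one can be converted into the other. The sketch below shows the conversion; the function names are illustrative. Note that the relation holds per case, so applying it to a mean DSC gives only an approximation of the mean IoU, and no IoU figure for this model is stated in the source.

```python
def dice_to_iou(dsc):
    """Convert a per-case Dice coefficient in [0, 1] to IoU (Jaccard index)."""
    if not 0.0 <= dsc <= 1.0:
        raise ValueError("Dice coefficient must lie in [0, 1]")
    return dsc / (2.0 - dsc)


def iou_to_dice(iou):
    """Inverse conversion: IoU in [0, 1] back to the Dice coefficient."""
    if not 0.0 <= iou <= 1.0:
        raise ValueError("IoU must lie in [0, 1]")
    return 2.0 * iou / (1.0 + iou)


# A case with DSC 0.8 has IoU 0.8 / 1.2, i.e. two thirds.
print(dice_to_iou(0.8))
```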

How can I run the docker container if my GPU isn't nvidia

Hello, I don't know much about docker, I'm learning how to use it now, so I'm sorry if my question is obvious or stupid. My GPU isn't NVIDIA, so how do I run the docker container?

where do you implement concatenation

Hi,

From the image of the architecture in README.md, it seems that a concatenation is implemented. But in your unet.py, I couldn't find where the concatenation is implemented. I am having trouble implementing the concatenation in U-Net. Could you please help me understand it?

Many thanks.
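
As a general illustration of what a U-Net skip connection does (not a quote from unet.py), the upsampled decoder feature map is joined with the encoder feature map of the same spatial size along the channel axis. The sketch below uses NumPy arrays in the (batch, channels, height, width) layout to show the shape arithmetic; the array names and sizes are made up. In PyTorch the analogous call is torch.cat with dim=1.

```python
import numpy as np

# Feature maps in (batch, channels, height, width) layout; sizes are illustrative.
encoder_features = np.zeros((1, 64, 32, 32), dtype=np.float32)  # saved on the way down
decoder_features = np.ones((1, 64, 32, 32), dtype=np.float32)   # upsampled on the way up

# Skip connection: stack along the channel axis, spatial size stays the same.
merged = np.concatenate((decoder_features, encoder_features), axis=1)
print(merged.shape)  # (1, 128, 32, 32)
```

The convolution that follows the concatenation therefore has to accept twice as many input channels as either branch provides on its own.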

Negative loss value

When I train this model with the default run command, python train.py, I see negative loss values and a Dice coefficient of 0:

loss for step: 622 = [-0.9068316221237183]
loss for step: 623 = [-0.9329317808151245]
loss for step: 624 = [-0.9376015663146973]
Best validation mean DSC: 0.000000

This code is using Torch 1.10 on Nvidia GPU with CUDA 11.
Is there something that needs fixing in the code (dataloader etc.)?
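
A soft Dice loss of the form 1 - DSC stays within [0, 1] only if both the prediction and the mask lie in [0, 1]. One possible cause of negative values, offered here as an assumption and not as a diagnosis of this issue, is masks that reach the loss with values 0/255 instead of 0/1. The sketch below uses a common soft Dice formulation with a smoothing term; whether it matches the repository's loss exactly is also an assumption.

```python
import numpy as np


def dice_loss(y_pred, y_true, smooth=1.0):
    """Soft Dice loss, 1 - DSC, computed over all elements."""
    y_pred = np.asarray(y_pred, dtype=np.float64).ravel()
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    intersection = (y_pred * y_true).sum()
    dsc = (2.0 * intersection + smooth) / (y_pred.sum() + y_true.sum() + smooth)
    return 1.0 - dsc


prediction = np.array([0.9, 0.8, 0.1, 0.0])
mask_01 = np.array([1, 1, 0, 0])   # mask scaled to {0, 1}
mask_0_255 = mask_01 * 255         # same mask left in {0, 255}

loss_ok = dice_loss(prediction, mask_01)      # lies in [0, 1]
loss_bad = dice_loss(prediction, mask_0_255)  # negative
print(loss_ok, loss_bad)
```

Checking the value range of a batch of masks right before the loss call is a quick way to rule this out.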

how to train on my dataset

Example on Colab not segmenting the tumor

Hi, and thanks for sharing your work! I have run the notebook you shared on PyTorch Hub and added one cell:

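import numpy as np
import scipy.ndimage as ndi
import matplotlib.pyplot as plt

# `output` and `input_image` come from the earlier cells of the notebook.
pred = output[0, 0].cpu().numpy()
pred_bin = pred > 0.5  # binarize before rescaling for display

# Rescale the prediction to the 0-255 range for display.
pred -= pred.min()
pred /= pred.max()
pred *= 255

# One-pixel outline of the predicted region, drawn on a copy of the input.
borders = ndi.binary_dilation(pred_bin) ^ pred_bin
input_array = np.array(input_image).copy()
input_array[borders] = input_array.max()

fig, axes = plt.subplots(1, 3, figsize=(12, 8))
axes[0].imshow(input_image)
axes[1].imshow(pred)
axes[2].imshow(input_array)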

The segmentation is very inaccurate. Am I doing something wrong?

After UNet Inference, how to overlay / superimpose the different size predicted masks to the original image size?

Suppose we have an RGB image of 1024x720 and our network's inputs (and output mask) have shape 512x512 with N classes.

So during training and inference, we need to convert the input image and mask to the required shape of 512x512.

But in a real-life scenario, when we apply the network to videos, images, etc., we can't use the 512x512 image; we need the original size. So how can we map / overlay / superimpose the predicted output mask of shape 512x512 onto the input image of shape 1024x720?
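
One common approach, sketched here under stated assumptions, is to resize the predicted label mask back to the original resolution with nearest-neighbor sampling (so class labels are not blended) and then alpha-blend a color over the masked pixels. The sketch assumes 1024x720 means width 1024 and height 720, that the image was plainly resized to 512x512 without padding, and that a multi-class output has already been reduced to one label per pixel (for example by an argmax over classes). The function names are hypothetical.

```python
import numpy as np


def resize_mask_nearest(mask, out_height, out_width):
    """Nearest-neighbor resize of a 2-D label mask."""
    in_height, in_width = mask.shape
    rows = np.arange(out_height) * in_height // out_height
    cols = np.arange(out_width) * in_width // out_width
    return mask[rows[:, None], cols[None, :]]


def overlay_mask(image, mask, color=(255, 0, 0), alpha=0.5):
    """Alpha-blend `color` over the pixels of an RGB image where `mask` is non-zero."""
    blended = image.astype(np.float32)
    region = mask.astype(bool)
    blended[region] = (1 - alpha) * blended[region] + alpha * np.array(color, dtype=np.float32)
    return blended.astype(np.uint8)


image = np.zeros((720, 1024, 3), dtype=np.uint8)   # original frame (height, width, RGB)
small_mask = np.zeros((512, 512), dtype=np.uint8)  # network output at 512x512
small_mask[128:384, 128:384] = 1

full_mask = resize_mask_nearest(small_mask, 720, 1024)
blended = overlay_mask(image, full_mask)
print(full_mask.shape, blended.shape)  # (720, 1024) (720, 1024, 3)
```

If the input was letterboxed or cropped before being fed to the network, that transformation has to be undone on the mask before the overlay.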

google colab

Can you please add a Google Colab notebook for inference? Thanks.
@mateuszbuda

IndexError: too many indices for array

Hi, I am testing your code and ended up with the following error. My test data has 11 patients with around 800 .tif files (400 images, 400 masks) per patient. I found out that this error might occur either because I've given too many index values or because my data is not 2D. But I double-checked, and this is not the case.

from PIL import Image
import numpy
im = Image.open('case_00001_imaging_418.tif')
imarray = numpy.array(im)
imarray

imarray.shape
(100,523)

array([[-1006.0074 , -1008.0335 , -1002.4894 , ..., -1017.4314 ,
        -1023.156  , -1023.63153],
       [ -958.95374, -1015.2491 , -1019.2383 , ..., -1013.8681 ,
        -1013.6993 , -1022.19446],
       [-1012.0173 , -1010.1507 , -1000.2253 , ..., -1003.897  ,
        -1016.59955, -1019.79047],
       ...,
       [-1008.00464, -1015.87286, -1019.1767 , ..., -1000.86926,
        -1017.2503 , -1024.9915 ],
       [ -990.96545,  -999.8485 , -1001.993  , ..., -1014.1718 ,
        -1014.19946, -1024.5924 ],
       [ -994.9882 , -1003.7568 , -1014.58484, ..., -1024.8237 ,
        -1012.564  , -1020.57794]], dtype=float32)

$ train_validate()

reading train images...
preprocessing train volumes...
cropping train volumes...

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-28-c37cc2aa5294> in <module>()
----> 1 train_validate()

<ipython-input-27-a6322369edd2> in train_validate()
      2     device = torch.device("cpu" if not torch.cuda.is_available() else "cuda:0")
      3 
----> 4     loader_train, loader_valid = data_loaders(batch_size, workers, image_size, aug_scale, aug_angle)
      5     loaders = {"train": loader_train, "valid": loader_valid}
      6 

<ipython-input-17-69b23f09d135> in data_loaders(batch_size, workers, image_size, aug_scale, aug_angle)
      1 def data_loaders(batch_size, workers, image_size, aug_scale, aug_angle):
----> 2     dataset_train, dataset_valid = datasets("/labs/mpsnyder/gbogu17/kits_2019/kits19/data_tiff", image_size, aug_scale, aug_angle)
      3 
      4     def worker_init(worker_id):
      5         np.random.seed(42 + worker_id)

<ipython-input-18-ff0e5f728339> in datasets(images, image_size, aug_scale, aug_angle)
      4         subset="train",
      5         image_size=image_size,
----> 6         transform=transforms(scale=aug_scale, angle=aug_angle, flip_prob=0.5),
      7     )
      8     valid = BrainSegmentationDataset(

<ipython-input-7-f22e3a5b7539> in __init__(self, images_dir, transform, image_size, subset, random_sampling, seed)
     56         print("cropping {} volumes...".format(subset))
     57         # crop to smallest enclosing volume
---> 58         self.volumes = [crop_sample(v) for v in self.volumes]
     59 
     60         print("padding {} volumes...".format(subset))

<ipython-input-7-f22e3a5b7539> in <listcomp>(.0)
     56         print("cropping {} volumes...".format(subset))
     57         # crop to smallest enclosing volume
---> 58         self.volumes = [crop_sample(v) for v in self.volumes]
     59 
     60         print("padding {} volumes...".format(subset))

<ipython-input-3-e45584d890d1> in crop_sample(x)
     16     return (
     17         volume[z_min:z_max, y_min:y_max, x_min:x_max],
---> 18         mask[z_min:z_max, y_min:y_max, x_min:x_max],
     19     )

IndexError: too many indices for array
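
The failing line indexes `mask` with three slices, while the same indexing of `volume` one line earlier succeeds. This suggests that the stacked mask array has fewer than three dimensions for at least one patient. The sketch below reproduces the error on a 2-D array and shows a small shape check that could be run before cropping; the helper name and its messages are hypothetical.

```python
import numpy as np


def check_volume_and_mask(volume, mask):
    """Raise a descriptive error if a (volume, mask) pair cannot be cropped in z, y, x."""
    volume = np.asarray(volume)
    mask = np.asarray(mask)
    if volume.ndim < 3:
        raise ValueError(f"volume must have at least 3 dimensions, got shape {volume.shape}")
    if mask.ndim < 3:
        raise ValueError(f"mask must have at least 3 dimensions, got shape {mask.shape}")
    if volume.shape[:3] != mask.shape[:3]:
        raise ValueError(f"volume {volume.shape} and mask {mask.shape} disagree in z, y, x")


# A single 2-D slice, like the (100, 523) array shown above.
slice_2d = np.zeros((100, 523), dtype=np.float32)
try:
    slice_2d[0:1, 0:10, 0:10]  # three indices on a 2-D array
except IndexError as error:
    message = str(error)
    print(message)
```

Printing the shapes of each patient's volume and mask right after loading is usually enough to find the offending case.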

how can I apply other loss functions

Thank you for sharing the pretrained weights. I have a question about using them.

I want to use other losses (e.g. BCE, focal) for training instead of Dice loss.

When I use other loss functions, the loss stops decreasing at a particular value, whereas this does not happen with Dice loss.

Do the pretrained weights work only with Dice loss?

manifest for nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04 not found: manifest unknown: manifest unknown

While trying to build the Dockerfile, I get this message:

manifest for nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04 not found: manifest unknown: manifest unknown

in_channels parameter change causes size mismatch

When changing the kwarg (in_channels) from 3 to 2, like

torch.hub.load('mateuszbuda/brain-segmentation-pytorch', 'unet',
    in_channels=2, out_channels=1, init_features=32, pretrained=True)

this error occurs:

...module.py", line 1044, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for UNet:
        size mismatch for encoder1.enc1conv1.weight: copying a param with shape torch.Size([32, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 2, 3, 3]).
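
The error message shows that the checkpoint's first convolution has shape [32, 3, 3, 3], that is, it expects three input channels, so it cannot be copied into a model built with in_channels=2. One possible workaround, sketched below with plain dictionaries of NumPy arrays standing in for state dicts, is to load only the entries whose shapes match and train the rest. The helper name and the second layer name are hypothetical.

```python
import numpy as np


def select_compatible(checkpoint, model_shapes):
    """Split checkpoint entries into those whose shape matches the model and the rest."""
    compatible, skipped = {}, []
    for name, value in checkpoint.items():
        if name in model_shapes and value.shape == model_shapes[name]:
            compatible[name] = value
        else:
            skipped.append(name)
    return compatible, skipped


# Stand-ins for a checkpoint and for the parameter shapes of a 2-channel model.
checkpoint = {
    "encoder1.enc1conv1.weight": np.zeros((32, 3, 3, 3), dtype=np.float32),
    "example.later_layer.weight": np.zeros((32, 32, 3, 3), dtype=np.float32),  # hypothetical name
}
model_shapes = {
    "encoder1.enc1conv1.weight": (32, 2, 3, 3),
    "example.later_layer.weight": (32, 32, 3, 3),
}

compatible, skipped = select_compatible(checkpoint, model_shapes)
print(skipped)  # ['encoder1.enc1conv1.weight']
```

In PyTorch the analogous steps would be to build the model with pretrained=False, filter the checkpoint's state dict this way, and call load_state_dict with strict=False. The skipped first layer then keeps its random initialization and needs to be trained.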

Training on a custom dataset

Hi,

I am trying to train the network on a small dataset of brain MRI images. I have sliced out the images and masks following the format of your dataset. Now I am having an issue during preprocessing train volumes which results in the volumes having shape (1,1,1) as I print from resize_sample function. I am guessing it is a problem caused by my slice dimensions but I didn't figure out how to fix it. It would be really helpful if you could help!

P.S. I can send you some sample slices I used, but cannot share it here due to the file size limit.

Very large images on my dataset

Hello,
I'm trying to train this on my own dataset, which contains very large images (7844 x 7786, for example). So what I'm doing is slicing my images into 256x256 tiles and treating my large original images as your "patients". My volume arrays are therefore shaped like [860, 256, 256, 3], where 860 is the number of slices of an image. But I'm having memory problems when I try to create the dataset: when it gets to the crop_sample or pad_sample functions, my machine runs out of memory. I'm trying to fit generator expressions into your dataset structure, but without much success. I've never used generators before and might be doing something wrong, so I'm still working on it. I would like to know whether you have any suggestions on the matter, or whether you think your network will not work with such big images.
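
As an illustration of the generator idea mentioned above, the sketch below yields one 256x256 tile at a time instead of materializing every tile of every image in memory. It is a generic pattern under assumed conventions (zero padding at the borders, tiles addressed by their top-left corner) and is not code from dataset.py; the function name is hypothetical.

```python
import numpy as np


def iter_tiles(image, tile_size=256):
    """Yield ((top, left), tile) pairs lazily, zero-padding tiles at the image border."""
    height, width = image.shape[:2]
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            tile = image[top:top + tile_size, left:left + tile_size]
            pad_h = tile_size - tile.shape[0]
            pad_w = tile_size - tile.shape[1]
            if pad_h or pad_w:
                # Pad only the two spatial axes, leave any channel axis untouched.
                padding = ((0, pad_h), (0, pad_w)) + ((0, 0),) * (image.ndim - 2)
                tile = np.pad(tile, padding, mode="constant")
            yield (top, left), tile


image = np.zeros((600, 700, 3), dtype=np.uint8)
tile_count = 0
for (top, left), tile in iter_tiles(image):
    assert tile.shape == (256, 256, 3)
    tile_count += 1
print(tile_count)  # 3 rows x 3 columns = 9 tiles
```

Only one tile is held at a time, so peak memory is governed by the source image plus a single tile. Fitting this into a dataset class would still require per-item loading in place of the up-front cropping and padding of whole volumes.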

ValueError: Sample larger than population or is negative

Hello,

Running the Python script causes an error:

reading validation images...
Traceback (most recent call last):
  File "brain-segmentation-pytorch/inference.py", line 184, in <module>
    main(args)
  File "brain-segmentation-pytorch/inference.py", line 22, in main
    loader = data_loader(args)
  File "brain-segmentation-pytorch/inference.py", line 80, in data_loader
    random_sampling=True,
  File "/content/brain-segmentation-pytorch/dataset.py", line 56, in __init__
    validation_patients = random.sample(self.patients, k=validation_cases)
  File "/usr/lib/python3.6/random.py", line 320, in sample
    raise ValueError("Sample larger than population or is negative")
ValueError: Sample larger than population or is negative

History of my commands on an Arch Linux system:

cd ~/OpenSource/
git clone https://github.com/mateuszbuda/brain-segmentation-pytorch
cd brain-segmentation-pytorch/
ls
pip3 install numpy
pip3 install torch
pip3 install matplotlib
python inference.py
pip3 install medpy
python inference.py
pip3 install skimage
pip3 install skit-image
pip3 install scikit-image
pip3 install tqdm
mkdir data
wget https://www.kaggle.com/kmader/mias-mammography && unzip...
python inference.py --images data --weights weights/unet.pt
python3 inference.py --images data --weights weights/unet.pt

[max@base brain-segmentation-pytorch]$ python -v

Python 3.8.5 (default, Sep  5 2020, 10:50:12)

[max@base brain-segmentation-pytorch]$ python inference.py --images data --weights weights/unet.pt

reading validation images...
Traceback (most recent call last):
  File "inference.py", line 184, in <module>
    main(args)
  File "inference.py", line 22, in main
    loader = data_loader(args)
  File "inference.py", line 76, in data_loader
    dataset = Dataset(
  File "/home/max/OpenSource/brain-segmentation-pytorch/dataset.py", line 56, in __init__
    validation_patients = random.sample(self.patients, k=validation_cases)
  File "/usr/lib/python3.8/random.py", line 363, in sample
    raise ValueError("Sample larger than population or is negative")
ValueError: Sample larger than population or is negative

We face the same problem in Google Colab.

Can you advise?

I changed line 56 of dataset.py from:

validation_patients = random.sample(self.patients, k=validation_cases)

to:

validation_patients = random.choices(self.patients, k=validation_cases)

Still, there is an error in the program:

[max@base brain-segmentation-pytorch]$ python3 inference.py --images data --weights weights/unet.pt

reading validation images...
Traceback (most recent call last):
  File "inference.py", line 184, in <module>
    main(args)
  File "inference.py", line 22, in main
    loader = data_loader(args)
  File "inference.py", line 76, in data_loader
    dataset = Dataset(
  File "/home/max/OpenSource/brain-segmentation-pytorch/dataset.py", line 57, in __init__
    validation_patients = random.choices(self.patients, k=validation_cases-1)
  File "/usr/lib/python3.8/random.py", line 399, in choices
    return [population[_int(random() * n)] for i in _repeat(None, k)]
  File "/usr/lib/python3.8/random.py", line 399, in <listcomp>
    return [population[_int(random() * n)] for i in _repeat(None, k)]
IndexError: list index out of range

Regards,
Max
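
Both tracebacks are consistent with the list of patients being empty: random.sample raises ValueError when k exceeds the population size, and random.choices fails with an IndexError only when the population has no elements. That suggests the folder passed via --images does not contain data in the layout the loader expects. The sketch below shows a guarded selection that reports this case explicitly; the function name, the seed default, and the error text are hypothetical, not the repository's code.

```python
import random


def pick_validation_patients(patients, validation_cases, seed=42):
    """Pick up to `validation_cases` distinct patients, failing clearly on empty input."""
    if not patients:
        raise ValueError(
            "no patients found; check that the images folder contains the expected subfolders"
        )
    k = min(validation_cases, len(patients))  # never ask for more than is available
    return random.Random(seed).sample(sorted(patients), k)


print(pick_validation_patients(["case_a", "case_b"], validation_cases=10))

try:
    pick_validation_patients([], validation_cases=10)
except ValueError as error:
    print(error)
```

Swapping random.sample for random.choices only hides the size check; with an empty list there is still nothing to select from.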

Reproducibility issue

Hello, thank you for sharing this nice work.
I'm sorry, but I can't get the Dice score you got.
Is it possible to reproduce the evaluation result without changing any code?

dice

Hello, how can I evaluate the Dice score after testing? Could you share the evaluation code?

Retraining on own dataset

Hi,

I retrained your model on my own dataset. However, the Dice loss is 1 because the prediction is completely wrong and the segmented area is really small. During training it does not really change, as if the Dice loss had saturated. Do you have an idea of what might be causing that?

Run on the test data

How can I use inference.py to run and test on a single image instead of the validation set?

A puzzle about the code

Thanks for sharing; your work is really cool!
I have a question about your code in dataset.py. What is the purpose of lines 48 and 49?
volumes[patient_id] = np.array(image_slices[1:-1])
masks[patient_id] = np.array(mask_slices[1:-1])

If you want to store all the corresponding data in the dictionary, why not implement it like this:
volumes[patient_id] = np.array(image_slices)
I mean, if the code is implemented as you have it, doesn't it miss the data in image_slices[0] and image_slices[-1]?