frankkramer-lab / covid19.miscnn

Robust Chest CT Image Segmentation of COVID-19 Lung Infection based on limited data

License: GNU General Public License v3.0

covid19.miscnn's Introduction

In this paper, we proposed and evaluated an approach for automated segmentation of COVID-19 infected regions in CT volumes. Our method focused on on-the-fly generation of unique and random image patches for training by exploiting heavy preprocessing and extensive data augmentation. Thus, it is possible to handle limited dataset sizes, which act as a variant database. Instead of new and complex neural network architectures, we utilized the standard 3D U-Net. We proved that our medical image segmentation pipeline is able to successfully train accurate as well as robust models without overfitting on limited data. Furthermore, we were able to outperform current state-of-the-art semantic segmentation approaches for lungs and COVID-19 infection. Our work has great potential to be applied as a clinical decision support system for COVID-19 quantitative assessment and disease monitoring in the clinical environment. Nevertheless, further research is needed on COVID-19 semantic segmentation in clinical studies for evaluating clinical performance and robustness.

The models, predictions, visualizations and evaluation (scores, figures) are available under the following link: https://doi.org/10.5281/zenodo.3902293

This work does NOT claim clinical performance by any means and serves purely educational purposes.

Reproducibility

Requirements:

Ubuntu 18.04
Python 3.6
NVIDIA QUADRO RTX 6000 or a GPU with equivalent performance

Step-by-Step workflow:

Download the code repository via git clone to your disk. Afterwards, install all required dependencies, download the dataset and set up the file structure.

git clone https://github.com/muellerdo/covid19.MIScnn.git
cd covid19.MIScnn/

pip3 install -r requirements.txt
python3 scripts/download_data.py

Optionally, you can run the data exploration, which gives some interesting information about the dataset.

python3 scripts/data_exploration.py

For the training and inference process, you initialize the cross-validation folds by running the preprocessing. This sets up a validation file structure and randomly samples the folds.

The most important step is running the training & inference process for each fold. This can be done either sequentially or in parallel on multiple GPUs.

python3 scripts/run_preprocessing.py
python3 scripts/run_miscnn.py --fold 0
python3 scripts/run_miscnn.py --fold 1
python3 scripts/run_miscnn.py --fold 2
python3 scripts/run_miscnn.py --fold 3
python3 scripts/run_miscnn.py --fold 4
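The per-fold commands above can also be launched from a small Python driver. The following is a minimal sketch and not part of the repository: the helper names `build_fold_command` and `run_all_folds` are hypothetical, the `gpus` argument is an assumption, and it relies only on the `--fold` flag shown above. Parallel runs are pinned to one GPU each through the CUDA environment variable `CUDA_VISIBLE_DEVICES`.

```python
import os
import subprocess
import sys


def build_fold_command(fold, script="scripts/run_miscnn.py"):
    """Return the command line for one cross-validation fold."""
    return [sys.executable, script, "--fold", str(fold)]


def run_all_folds(folds=range(5), gpus=None, dry_run=False):
    """Run folds sequentially (gpus=None) or one process per listed GPU.

    `gpus` is a hypothetical list of GPU ids, e.g. [0, 1]. Folds are
    processed in waves of len(gpus) parallel processes.
    """
    commands = [build_fold_command(fold) for fold in folds]
    if dry_run:
        # Only report what would be executed.
        return commands
    if not gpus:
        for cmd in commands:
            subprocess.run(cmd, check=True)
        return commands
    for start in range(0, len(commands), len(gpus)):
        wave = []
        for gpu, cmd in zip(gpus, commands[start:start + len(gpus)]):
            # Each child process only sees the GPU assigned to it.
            env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
            wave.append(subprocess.Popen(cmd, env=env))
        for proc in wave:
            if proc.wait() != 0:
                raise RuntimeError("fold process failed")
    return commands
```

Calling `run_all_folds(dry_run=True)` lists the five commands without starting any training, which is a cheap way to check the setup first.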

Finally, the evaluation script computes all scores, visualizations and figures.

python3 scripts/run_evaluation.py

Materials / Dataset

We used the public dataset from Ma et al., which consists of 20 annotated COVID-19 chest CT volumes. Currently, this dataset is the only publicly available 3D volume set with annotated COVID-19 infection segmentation. Each CT volume was first labeled by junior annotators, then refined by two radiologists with 5 years of experience, and afterwards the annotations were verified by senior radiologists with more than 10 years of experience. The CT images were labeled into four classes: background, lung left, lung right and COVID-19 infection.

Reference: https://zenodo.org/record/3757476#.XqhRp_lS-5D

Methods

The implemented medical image segmentation pipeline can be summarized in the following core steps:

Dataset: 20x COVID-19 CT volumes
Limited dataset → Utilization as variation database
Heavy preprocessing methods
Extensive data augmentation
Patchwise analysis of high-resolution images
Utilization of the standard 3D U-Net
Model fitting based on Tversky index & cross-entropy
Model predictions on overlapping patches
5-fold cross-validation via Dice similarity coefficient
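The fitting criterion named in the list above, Tversky index plus cross-entropy, can be illustrated with NumPy. This is a minimal sketch under stated assumptions and not the MIScnn implementation: the weights `alpha = beta = 0.5` (for which the Tversky index equals the Dice coefficient), the averaging over classes, and all function names are assumptions made for illustration.

```python
import numpy as np


def tversky_index(y_true, y_pred, alpha=0.5, beta=0.5, eps=1e-7):
    """Class-wise Tversky index for one-hot truth and softmax predictions.

    Both arrays have shape (..., n_classes). alpha weights false
    negatives and beta weights false positives (assumed values).
    """
    axes = tuple(range(y_true.ndim - 1))  # sum over everything but classes
    tp = np.sum(y_true * y_pred, axis=axes)
    fn = np.sum(y_true * (1.0 - y_pred), axis=axes)
    fp = np.sum((1.0 - y_true) * y_pred, axis=axes)
    return (tp + eps) / (tp + alpha * fn + beta * fp + eps)


def cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean categorical cross-entropy over all voxels."""
    clipped = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(clipped), axis=-1)))


def tversky_crossentropy(y_true, y_pred):
    """Sum of the class-averaged Tversky loss and the cross-entropy."""
    tversky_loss = float(np.mean(1.0 - tversky_index(y_true, y_pred)))
    return tversky_loss + cross_entropy(y_true, y_pred)
```

A perfect prediction gives a loss of zero, while a uniform prediction over the four classes gives a clearly positive loss, so the sum penalizes both poor overlap and poorly calibrated probabilities.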

This pipeline was based on MIScnn, which is an in-house developed open-source framework to set up complete medical image segmentation pipelines with convolutional neural networks and deep learning models on top of TensorFlow/Keras. The framework supports extensive preprocessing, data augmentation, state-of-the-art deep learning models and diverse evaluation techniques. The experiment was performed on a Nvidia Quadro P6000.

MIScnn: https://github.com/frankkramer-lab/MIScnn

Results & Discussion

Through validation monitoring during the training, no overfitting was observed. The training and validation loss curves showed no significant difference from each other. During the fitting, the performance settled down at a loss of around 0.383, which is a generalized DSC (average of all class-wise DSCs) of around 0.919. Because of this robust training process without any signs of overfitting, we concluded that fitting on randomly generated patches via extensive data augmentation and random cropping from a variant database is highly efficient for limited imaging data.

The inference revealed a strong segmentation performance for lungs as well as COVID-19 infected regions. Overall, the cross-validation models achieved a DSC of around 0.956 for lung and 0.761 for COVID-19 infection segmentation.
Furthermore, the models achieved a sensitivity and specificity of 0.956 and 0.998 for lungs, and 0.730 and 0.999 for infection, respectively.

Nevertheless, our medical image segmentation pipeline allowed fitting a model which is able to segment COVID-19 infection with state-of-the-art accuracy that is comparable to models trained on large datasets.

Author

Dominik Müller
Email: [email protected]
IT-Infrastructure for Translational Medical Research
University of Augsburg
Bavaria, Germany

How to cite / More information

Dominik Müller, Iñaki Soto-Rey and Frank Kramer.
Robust chest CT image segmentation of COVID-19 lung infection based on limited data.
Informatics in Medicine Unlocked. Volume 25, 2021.
DOI: https://doi.org/10.1016/j.imu.2021.100681

@article{MULLER2021100681,
title = {Robust chest CT image segmentation of COVID-19 lung infection based on limited data},
journal = {Informatics in Medicine Unlocked},
volume = {25},
pages = {100681},
year = {2021},
issn = {2352-9148},
doi = {https://doi.org/10.1016/j.imu.2021.100681},
url = {https://www.sciencedirect.com/science/article/pii/S2352914821001660},
author = {Dominik Müller and Iñaki Soto-Rey and Frank Kramer},
keywords = {COVID-19, Segmentation, Limited data, Computed tomography, Deep learning, Artificial intelligence},
eprint={2007.04774},
archivePrefix={arXiv},
primaryClass={eess.IV}
}

Thank you for citing our work.

License

This project is licensed under the GNU GENERAL PUBLIC LICENSE Version 3.
See the LICENSE.md file for license rights and limitations.

covid19.miscnn's Issues

"Iterations" param in run_fold

From what I understand, the "iterations" parameter is the steps per epoch. In your script, you fixed that as 150. During my training, I tried to leave that as default, hence None, hoping that it will run through all the created patches. However, it showed 10 as the steps per epoch during the training. Does this mean the network only trains on 10*batch_size(=2) = 20 patches per epoch? This is way too few.

Do I misunderstand something here? Thanks a lot for your help!
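The arithmetic in this question can be written out explicitly. This is only a sketch of the calculation stated above (steps per epoch multiplied by batch size); the function name is hypothetical and it makes no claim about how MIScnn derives the default number of steps.

```python
def patches_per_epoch(iterations, batch_size):
    """Number of patches the model sees in one epoch."""
    return iterations * batch_size


print(patches_per_epoch(10, 2))   # steps per epoch observed in the question
print(patches_per_epoch(150, 2))  # value fixed in the script
```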

Score calculation in run_evaluation.py

To-do:

Calculate Dice Similarity Coefficient, Sensitivity and Specificity for each class.

Average (mean) Lung left and Lung Right to class "Lungs"
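The two to-do items can be sketched with NumPy as follows. This is an illustrative sketch, not the code of run_evaluation.py: the integer encoding in `LABELS`, the function names, and the convention of scoring empty denominators as 1.0 are all assumptions.

```python
import numpy as np

# Assumed integer encoding of the four classes named in the README.
LABELS = {"background": 0, "lung_left": 1, "lung_right": 2, "infection": 3}


def class_scores(truth, pred, label):
    """Return DSC, sensitivity and specificity for one label."""
    t = np.asarray(truth) == label
    p = np.asarray(pred) == label
    tp = int(np.sum(t & p))
    fp = int(np.sum(~t & p))
    fn = int(np.sum(t & ~p))
    tn = int(np.sum(~t & ~p))
    # Assumption: an empty denominator is scored as 1.0.
    dsc = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    sens = tp / (tp + fn) if (tp + fn) else 1.0
    spec = tn / (tn + fp) if (tn + fp) else 1.0
    return {"DSC": dsc, "Sens": sens, "Spec": spec}


def evaluate(truth, pred):
    """Score every class and average both lungs into a "lungs" entry."""
    scores = {name: class_scores(truth, pred, label)
              for name, label in LABELS.items()}
    scores["lungs"] = {
        key: (scores["lung_left"][key] + scores["lung_right"][key]) / 2
        for key in ("DSC", "Sens", "Spec")
    }
    return scores
```

Here `truth` and `pred` are label maps of identical shape, for example two loaded segmentation volumes.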

Time

Hello, I would like to ask how you ran the program. I ran it on a server and trained 5 times in 3 days. That is too slow given that you trained 1000 times, so could you tell me what setup you used to run the program?

About running errors

Hello, I'm sorry to disturb you, but I encountered some problems while running the program you wrote. I hope you can answer.
While running run_miscnn.py, it always reports that there is no imaging.nii.gz file. I have copied the file from other sources, but the error is still reported even though the file exists. So I want to ask, did you pass other files? We have processed the data and look forward to your reply. Thank you.

Get in email contact with authors of COVID-19 benchmark data set

https://gitee.com/junma11/COVID-19-CT-Seg-Benchmark

Some kind of rank list for more transparency?

Organization of challenge?
-> more data required

Add Mac M1 Apple Silicon Support

Issue:
arm64 architecture needs tensorflow-macos, which is currently not recognized by aucmedi
Solution:
aucmedi-macos that depends on tensorflow-macos

change license

patch shape

Hi,

I have a question regarding the patch shape in the script run_miscnn.py. It is set as (160, 160, 80). Does the z-axis value, 80, represent the number of slices in the patch? However, in the MIScnn source code, you wrote:
patch_shape (integer tuple): Size and shape of a patch. The variable has to be defined as a tuple.
For Example: (64,128,128) for 64x128x128 patch cubes.
Be aware that the x-axis represents the number of slices in 3D volumes.

Can you clarify this for me?

Best wishes,
LHJ

Getting Error in run_miscnn

File "scripts/run_miscnn.py", line 120, in
save_models=False)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/miscnn/evaluation/cross_validation.py", line 166, in run_fold
iterations=iterations, callbacks=cb_list)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/miscnn/neural_network/model.py", line 201, in evaluate
max_queue_size=self.batch_queue_size)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 108, in _method_wrapper
return method(self, *args, **kwargs)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1098, in fit
tmp_logs = train_function(iterator)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 780, in call
result = self._call(*args, **kwds)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 840, in _call
return self._stateless_fn(*args, **kwds)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2829, in call
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1848, in _filtered_call
cancellation_manager=cancellation_manager)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1924, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 550, in call
ctx=ctx)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node functional_1/conv3d/Conv3D (defined at /home/ubuntu/anaconda3/lib/python3.7/site-packages/miscnn/neural_network/model.py:201) ]] [Op:__inference_train_function_9673]

Function call stack:
train_function

2020-12-07 06:04:13.638644: W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated

Patchwise Overlap

Hi,

I noticed that the patchwise overlap is 80x80x40 in the paper but 80x80x30 in the code. Is it okay if the overlap shape is not half of the patch shape?
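How an overlap translates into patch positions can be illustrated along a single axis. This is a minimal sketch, not the MIScnn slicing code: the function name `patch_starts` and the rule of shifting the last patch back to the border are assumptions. It shows that any overlap smaller than the patch size yields a valid grid that covers the whole axis.

```python
def patch_starts(length, patch, overlap):
    """Start indices of patches of size `patch` covering an axis of `length`."""
    if not 0 <= overlap < patch:
        raise ValueError("overlap must satisfy 0 <= overlap < patch")
    if length <= patch:
        return [0]
    stride = patch - overlap
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] + patch < length:
        # Shift one extra patch so that it ends exactly at the border.
        starts.append(length - patch)
    return starts


# Axis of 200 slices, patch size 80, with the two overlaps from the question.
print(patch_starts(200, 80, 40))
print(patch_starts(200, 80, 30))
```

In this sketch a smaller overlap only changes the stride between neighbouring patches, so half the patch size is one possible choice rather than a requirement.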

Multi-GPU or how to choose a particular GPU

Hi,

I am confused about the use of GPUs. The function Neural_Network uses the variable gpu_number (version 0.34), but if GPU 0 is busy, I get an error. The latest version of MIScnn uses the boolean variable multi_gpu.

Could you explain to me how to select a particular GPU?

Thanks
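One framework-independent way to pick a GPU is to restrict which devices the process can see. The sketch below uses the CUDA environment variables `CUDA_DEVICE_ORDER` and `CUDA_VISIBLE_DEVICES`; the helper name `select_gpu` is hypothetical, and whether this interacts well with the `gpu_number` or `multi_gpu` options of a given MIScnn version is an assumption that should be checked.

```python
import os


def select_gpu(gpu_id):
    """Expose only one physical GPU to CUDA-based libraries in this process."""
    # Order devices by PCI bus id so the ids match the nvidia-smi listing.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return os.environ["CUDA_VISIBLE_DEVICES"]


# Must run before TensorFlow is imported for the setting to take effect.
select_gpu(1)
```

Inside the process the selected card then appears as the only device, so it is addressed as GPU 0 by the framework.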

Incompatible numpy & tensorflow & miscnn

The conflicting version requirements of numpy & miscnn and of numpy & tensorflow made it impossible to run the code. Furthermore, NIFTI_interface is not defined. I managed to get into the source code but can't debug it due to a lack of software coding knowledge.

How do I make predictions for a set of new images with the trained model and save them as nii files? What is the procedure?

Citation

Dear @muellerdo

Thanks for using our dataset. It would be highly appreciated if you could add the following citation to your paper.

@article{COVID-19-SegBenchmark,
  title={Towards Efficient COVID-19 CT Annotation: A Benchmark for Lung and Infection Segmentation},
  author={Ma Jun and Wang Yixin and An Xingle and Ge Cheng and Yu Ziqi and Chen Jianan and Zhu Qiongjie and Dong Guoqiang and He Jian and He Zhiqiang and Ni Ziwei and Yang Xiaoping},
  journal={arXiv preprint arXiv:2004.12537},
  year={2020}
}

Best regards,
Jun

Dataset HU range

Hi @frankkramer

When I checked the dataset, I found two parts: coronacases and radiopaedia. The radiopaedia images are scaled to 0-255, but the coronacases images have an HU range of -1250 to 250. I wonder how you overcame this problem.
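One way to bring both intensity scales onto a common range is to rescale each volume to [0, 1] according to its source. The following is a hedged sketch, not the preprocessing used in this repository: the function name, the heuristic of treating volumes without negative values as 0-255 grayscale, and the default window taken from the numbers in the question are all assumptions.

```python
import numpy as np


def normalize_volume(volume, hu_window=(-1250.0, 250.0)):
    """Map a CT volume to [0, 1] regardless of its source intensity scale.

    Hypothetical heuristic: volumes without negative values are treated
    as 0-255 grayscale, everything else as Hounsfield units.
    """
    volume = np.asarray(volume, dtype=np.float32)
    if volume.min() >= 0:
        return np.clip(volume, 0.0, 255.0) / 255.0
    low, high = hu_window
    # Clip to the HU window first, then scale the window to [0, 1].
    return (np.clip(volume, low, high) - low) / (high - low)
```

A more robust variant would decide the scale from the file name or dataset part instead of the minimum value, since a grayscale check on intensities alone can misclassify unusual scans.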

Visualizer function in run_evaluation.py

To-do:
Implement GIF animation of 3D volume via matplotlib

Data Set

Reference: https://zenodo.org/record/3757476#.XqhRp_lS-5D

Slicing Method

Sir, first of all, this is a great repository and I want to say thank you.
Could you give more information about how you sliced the 3D images into patches? (I believe a patch means a 2D slice extracted from a 3D image; am I wrong?)
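Given the 3D patch shape (160, 160, 80) mentioned elsewhere in these issues, a patch can be read as a 3D sub-volume rather than a 2D slice. The sketch below illustrates random cropping of such a sub-volume with NumPy. It is an illustration only and not the MIScnn cropping code; the function name `random_patch` and its arguments are hypothetical.

```python
import numpy as np


def random_patch(volume, patch_shape, rng=None):
    """Crop one random 3D sub-volume of `patch_shape` from `volume`."""
    rng = np.random.default_rng() if rng is None else rng
    if any(p > s for p, s in zip(patch_shape, volume.shape)):
        raise ValueError("patch does not fit into the volume")
    # Draw a valid start index independently for every axis.
    start = [int(rng.integers(0, s - p + 1))
             for s, p in zip(volume.shape, patch_shape)]
    window = tuple(slice(b, b + p) for b, p in zip(start, patch_shape))
    return volume[window]


volume = np.zeros((200, 200, 100), dtype=np.uint8)  # placeholder CT volume
patch = random_patch(volume, (160, 160, 80))
print(patch.shape)
```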

Add improved CV functionality to MIScnn

Add 2 functions to CV of MIScnn:

Split dataset into folds & save them to file
Run CV for fold X based on stored file

load_csv2fold

Hi @muellerdo

I am using your MIScnn framework installed directly from GitHub, not via pip. But in the cross-validation file I can't find "load_csv2fold", so line 32 of run_miscnn.py gives an error.

Prediction could not be found "predictions/coronacases_001.nii.gz"

Hello!

I would like to know what went wrong when I ran run_evaluation.py. It said: "Prediction could not be found "predictions/coronacases_001.nii.gz""

Thanks!

Error while executing download data script

Executing scripts/download_data.py gives the following error:

$ python3 scripts/download_data.py

INFO: Downloading Volumes
COVID-19-CT-Seg_20cases.zip?download=1:   0%|                   | 1.20k/1.11G [00:00<225:54:25, 1.36kB/s]
INFO: Downloading Segmentations
Lung_and_Infection_Mask.zip?download=1:   0%|                     | 1.20k/11.7M [00:00<2:24:22, 1.35kB/s]
INFO: Obtain sample list from the volumes ZIP file
Traceback (most recent call last):
  File "scripts/download_data.py", line 85, in <module>
    with zipfile.ZipFile(path_vol_zip, "r") as zip_vol:
  File "/home/cinnamon/Repositories/miniconda3/envs/covid/lib/python3.6/zipfile.py", line 1131, in __init__
    self._RealGetContents()
  File "/home/cinnamon/Repositories/miniconda3/envs/covid/lib/python3.6/zipfile.py", line 1198, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file

How can I overcome this issue?
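The progress bars in the log stop after about 1.20k bytes, so the files on disk are probably incomplete or not archives at all. A quick way to confirm this before rerunning the script is to inspect the downloaded files with the standard library. This is a diagnostic sketch only; the helper name `check_download` is hypothetical and the path passed to it must be replaced with the location used by your download script.

```python
import os
import zipfile


def check_download(path):
    """Return a short diagnosis string for a downloaded ZIP archive."""
    if not os.path.isfile(path):
        return "missing"
    if not zipfile.is_zipfile(path):
        # Typical for interrupted downloads or saved error pages.
        return "not a zip file ({} bytes on disk)".format(os.path.getsize(path))
    with zipfile.ZipFile(path, "r") as archive:
        bad = archive.testzip()  # name of the first corrupt member, or None
    return "ok" if bad is None else "corrupt member: " + bad
```

If the result is not "ok", deleting the partial file and downloading it again, for example manually from the Zenodo record referenced in the README, is a reasonable next step.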

Interesting Journals

IEEE Transactions on Medical Imaging:
Special Issue on "Imaging-based Diagnosis of COVID-19"

IEEE Journal of Biomedical and Health Informatics:
Special Issue on "AI-driven Informatics, Sensing, Imaging and Big Data Analytics for Fighting the COVID-19 Pandemic"

Medical Image Analysis:
Special Issue on "Intelligent Analysis of COVID-19 Imaging Data"

Cannot import split module

Hey guys,
Great work. I am trying to reproduce the results from your paper.
Here is the error I am getting


2020-07-22 15:59:09.423261: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
Traceback (most recent call last):
  File "scripts/run_preprocessing.py", line 25, in <module>
    from miscnn.evaluation.cross_validation import split_folds
ImportError: cannot import name 'split_folds

Please have a look and suggest any fix

Thanks
