
[NeurIPS 2023] AbdomenAtlas 1.0 (5,195 CT volumes + 9 annotated classes)

Home Page: https://www.cs.jhu.edu/~alanlab/Pubs23/qu2023abdomenatlas.pdf

License: Other



AbdomenAtlas 1.0

We are proud to introduce AbdomenAtlas-8K, a substantial multi-organ dataset in which the spleen, liver, kidneys, stomach, gallbladder, pancreas, aorta, and IVC are annotated in 8,448 CT volumes, totaling 3.2 million CT slices.

An endeavor of this magnitude would demand a staggering 1,600 weeks, or roughly 30.8 years, of an experienced annotator's time.

In contrast, our annotation method accomplished the task in three weeks (assuming an 8-hour workday, five days a week) while maintaining similar or even better annotation quality.

Data - AbdomenAtlas1.0Mini

Huggingface
  • All-in-one
      Mask-Only: Download (3.3GB) - a single compressed file containing all labels.
  • Zips
      Image & Mask: Download (328GB) - 11 compressed files in TAR.GZ format.
      Image-Only: Download - 11 compressed files in TAR.GZ format.
  • Folders
      Image & Mask: Download (340GB) - 5,195 folders, each containing CT, combined labels, and a segmentations folder of nine masks in NII.GZ format.
      Image-Only: Download (325GB) - 5,195 folders, each containing CT in NII.GZ format.
      Mask-Only: Download (15GB) - 5,195 folders, each containing combined labels and a segmentations folder of nine masks in NII.GZ format.

Dropbox
  • All-in-one
      Image & Mask: Download (306GB) - a single compressed file containing all images and labels.
      Image-Only: Download (303GB) - a single compressed file containing all images.
      Mask-Only: Download (3.3GB) - a single compressed file containing all labels.
  • Zips
      Image & Mask: Download (328GB) - 11 compressed files in TAR.GZ format.
      Image-Only: Download - 11 compressed files in TAR.GZ format.

Baidu Wangpan
  • All-in-one
      Image & Mask: Download (306GB) - a single compressed file containing all images and labels.
      Image-Only: Download (303GB) - a single compressed file containing all images.
      Mask-Only: Download (3.3GB) - a single compressed file containing all labels.
  • Zips
      Image & Mask: Download (328GB) - 11 compressed files in TAR.GZ format.
      Image-Only: Download - 11 compressed files in TAR.GZ format.
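
Once a case from the Folders version has been downloaded, it can be sanity-checked with standard NIfTI tooling. The sketch below is a minimal example, assuming nibabel is installed; the case folder path is a placeholder, and only ct.nii.gz and the segmentations/ sub-folder are taken from the layout described above.

    # Minimal sketch: inspect one downloaded AbdomenAtlas case with nibabel.
    # The case folder path is a placeholder; ct.nii.gz and segmentations/
    # follow the folder layout described above.
    import glob
    import os

    import nibabel as nib

    case_dir = "/path/to/AbdomenAtlas/case_0001"  # placeholder case folder

    ct = nib.load(os.path.join(case_dir, "ct.nii.gz"))
    print("CT shape:", ct.shape, "voxel spacing:", ct.header.get_zooms())

    for mask_path in sorted(glob.glob(os.path.join(case_dir, "segmentations", "*.nii.gz"))):
        organ = os.path.basename(mask_path).replace(".nii.gz", "")
        mask = nib.load(mask_path).get_fdata()
        print(f"{organ}: shape {mask.shape}, annotated voxels {int((mask > 0).sum())}")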

Paper

AbdomenAtlas: A Large-Scale, Detailed-Annotated, & Multi-Center Dataset for Efficient Transfer Learning and Open Algorithmic Benchmarking
Wenxuan Li, Chongyu Qu, Xiaoxi Chen, Pedro R. A. S. Bassi, Yijia Shi, Yuxiang Lai, Qian Yu, Huimin Xue, Yixiong Chen, Xiaorui Lin, Yutong Tang, Yining Cao, Haoqi Han, Zheyuan Zhang, Jiawei Liu, Tiezheng Zhang, Yujiu Ma, Jincheng Wang, Guang Zhang, Alan Yuille, Zongwei Zhou*
Johns Hopkins University
Medical Image Analysis, 2024

AbdomenAtlas-8K: Annotating 8,000 CT Volumes for Multi-Organ Segmentation in Three Weeks
Chongyu Qu1, Tiezheng Zhang1, Hualin Qiao2, Jie Liu3, Yucheng Tang4, Alan L. Yuille1, and Zongwei Zhou1,*
1 Johns Hopkins University,
2 Rutgers University,
3 City University of Hong Kong,
4 NVIDIA
NeurIPS 2023
paper | code | dataset | poster

AbdomenAtlas-8K: Human-in-the-Loop Annotating Eight Anatomical Structures for 8,448 Three-Dimensional Computed Tomography Volumes in Three Weeks
Chongyu Qu1, Tiezheng Zhang1, Hualin Qiao2, Jie Liu3, Yucheng Tang4, Alan L. Yuille1, and Zongwei Zhou1,*
1 Johns Hopkins University,
2 Rutgers University,
3 City University of Hong Kong,
4 NVIDIA
RSNA 2023 (Oral Presentation)
paper | code | slides

★ An improved version, AbdomenAtlas 1.1, can be found in the SuPreM repository.

★ Touchstone - Let's benchmark!

★ We maintain a document of Frequently Asked Questions.

0. Installation

git clone https://github.com/MrGiovanni/AbdomenAtlas

See the installation instructions to create an environment and install the requirements.

1. Download AI models

We offer pre-trained checkpoints of Swin UNETR and U-Net. The models were trained on a combination of 14 publicly available CT datasets comprising 3,410 CT volumes (see details in CLIP-Driven Universal Model). Download the trained models and save them into ./pretrained_checkpoints/.

Architecture Param Download
U-Net 19.08M link
Swin UNETR 62.19M link
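
Before running inference, it can help to confirm that a downloaded checkpoint loads cleanly and has roughly the expected number of parameters. Below is a hedged sketch using plain PyTorch; the 'net' key is taken from the test.py excerpt quoted in the Issues section below and should be treated as an assumption.

    # Sanity-check a downloaded checkpoint with plain PyTorch.
    # The 'net' key follows the test.py excerpt quoted in the Issues section;
    # fall back to treating the file as a raw state_dict if it is absent.
    import torch

    ckpt = torch.load("pretrained_checkpoints/unet.pth", map_location="cpu")
    state_dict = ckpt.get("net", ckpt) if isinstance(ckpt, dict) else ckpt

    n_params = sum(v.numel() for v in state_dict.values() if hasattr(v, "numel"))
    print(f"{len(state_dict)} tensors, ~{n_params / 1e6:.2f}M parameters")
    print("example keys:", list(state_dict)[:3])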

2. Prepare your datasets

These can be publicly available datasets (e.g., BTCV) or your own private datasets. Currently, we only accept data in nii.gz format. This repository helps you assign annotations to these datasets, covering 25 organs and six types of tumors (the annotations of eight organs are quite accurate).
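
If your CT volumes are stored as DICOM series, they must first be converted to nii.gz before they can be used here (native DICOM support is still on the TODO list below). One possible conversion sketch uses SimpleITK, which is not a dependency of this repository; the paths are placeholders.

    # Convert a DICOM series to nii.gz so it can be used with this repository.
    # SimpleITK is not a dependency of this repo; install it separately.
    # Paths are placeholders.
    import SimpleITK as sitk

    dicom_dir = "/path/to/dicom_series"      # folder holding one DICOM series
    output_path = "/path/to/case0001.nii.gz"

    reader = sitk.ImageSeriesReader()
    reader.SetFileNames(reader.GetGDCMSeriesFileNames(dicom_dir))
    image = reader.Execute()

    sitk.WriteImage(image, output_path)      # the .nii.gz extension selects gzipped NIfTI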

2.1 Download

Taking the BTCV dataset as an example, download the dataset and save it to the $datapath directory.

cd $datapath
wget https://www.dropbox.com/s/jnv74utwh99ikus/01_Multi-Atlas_Labeling.tar.gz
tar -xzvf 01_Multi-Atlas_Labeling.tar.gz

2.2 Preprocessing

Generate a list for this dataset.

cd AbdomenAtlas/
python -W ignore generate_datalist.py --data_path $datapath --dataset_name $dataname --folder img --out ./dataset/dataset_list --save_file $dataname.txt
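
generate_datalist.py produces a plain-text list of the CT volumes found under the dataset folder. If you need to build such a list for a custom layout, a rough, hypothetical equivalent is sketched below; the assumed directory structure ($datapath/$dataname/img) and the one-relative-path-per-line output format are guesses, not the script's documented behavior.

    # Rough, hypothetical stand-in for generate_datalist.py: list every nii.gz
    # under <data_path>/<dataset_name>/<folder> and write one relative path per
    # line. The real script's layout and output format may differ.
    import argparse
    import glob
    import os

    parser = argparse.ArgumentParser()
    parser.add_argument("--data_path", required=True)
    parser.add_argument("--dataset_name", required=True)
    parser.add_argument("--folder", default="img")
    parser.add_argument("--out", default="./dataset/dataset_list")
    parser.add_argument("--save_file", required=True)
    args = parser.parse_args()

    image_dir = os.path.join(args.data_path, args.dataset_name, args.folder)
    volumes = sorted(glob.glob(os.path.join(image_dir, "*.nii.gz")))

    os.makedirs(args.out, exist_ok=True)
    list_path = os.path.join(args.out, args.save_file)
    with open(list_path, "w") as f:
        for path in volumes:
            f.write(os.path.relpath(path, args.data_path) + "\n")
    print(f"Wrote {len(volumes)} entries to {list_path}")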

3. Generate masks

U-Net
CUDA_VISIBLE_DEVICES=0 python -W ignore test.py --resume pretrained_checkpoints/unet.pth --backbone unet --save_dir $savepath --dataset_list $dataname --data_root_path $datapath --store_result >> logs/$dataname.unet.txt
Swin UNETR
CUDA_VISIBLE_DEVICES=0 python -W ignore test.py --resume pretrained_checkpoints/swinunetr.pth --backbone swinunetr --save_dir $savepath --dataset_list $dataname --data_root_path $datapath --store_result >> logs/$dataname.swinunetr.txt

To generate attention maps for the active learning process (Step 5 [optional]), remember to save the entropy and soft predictions by adding the options --store_entropy and --store_soft_pred.
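
The entropy maps saved by --store_entropy are what the attention maps in Step 5 are built from: voxels where the soft prediction is uncertain receive high entropy. The following is only an illustration of that idea, not the repository's implementation.

    # Illustration only: voxel-wise entropy from a soft prediction of shape
    # (num_classes, D, H, W). Mirrors the idea behind --store_entropy, but is
    # not the repository's implementation.
    import numpy as np

    def entropy_map(soft_pred, eps=1e-8):
        """soft_pred: per-class probabilities (C, D, H, W) summing to 1 over C."""
        return -np.sum(soft_pred * np.log(soft_pred + eps), axis=0)

    # toy example: a 2-class prediction on a tiny 4x4x4 volume
    probs = np.random.dirichlet([1.0, 1.0], size=(4, 4, 4)).transpose(3, 0, 1, 2)
    attention = entropy_map(probs)
    print("entropy range:", float(attention.min()), "to", float(attention.max()))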

4. Data Assembly

In the assembly process, the highest priority is given to the original annotations supplied by each public dataset, followed by the revised labels produced by our annotators; pseudo labels generated by AI models receive the lowest priority. The following command applies this priority when building the assembled dataset.

python -W ignore assemble.py --data_path $savepath --dataset_name $dataname --backbone swinunetr --save_dir SAVE_DIR --version V1
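
As a rough illustration of this priority rule (original labels first, then revised labels, then AI pseudo labels), a per-voxel merge could look like the sketch below. This is not the actual assemble.py implementation.

    # Illustration of the label priority: keep original annotations wherever
    # they exist, otherwise fall back to revised labels, then to pseudo labels.
    # Not the actual assemble.py implementation.
    import numpy as np

    def assemble_labels(original, revised, pseudo):
        """Integer label volumes of identical shape; 0 means unlabeled."""
        merged = pseudo.copy()
        merged[revised > 0] = revised[revised > 0]
        merged[original > 0] = original[original > 0]
        return merged

    # toy 2x2 example
    original = np.array([[1, 0], [0, 0]], dtype=np.int16)
    revised  = np.array([[0, 2], [0, 2]], dtype=np.int16)
    pseudo   = np.full((2, 2), 3, dtype=np.int16)
    print(assemble_labels(original, revised, pseudo))   # [[1 2] [3 2]]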

This is how our AbdomenAtlas-8K appears:

    $savepath/
    ├── $dataname_img0001
    ├── $dataname_img0002
    └── $dataname_img0003
        ├── ct.nii.gz
        ├── original_label.nii.gz
        ├── pseudo_label.nii.gz
        └── segmentations
            ├── spleen.nii.gz
            ├── liver.nii.gz
            ├── pancreas.nii.gz
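
For convenience, the nine binary masks in a case's segmentations/ folder can be merged into a single multi-class volume. A hedged sketch follows; the class indices assigned here are arbitrary (alphabetical order of file names), not an official AbdomenAtlas label convention, and the case path is a placeholder.

    # Sketch: merge the binary masks in segmentations/ into one multi-class
    # label volume. Class indices are arbitrary (alphabetical file order), not
    # an official AbdomenAtlas convention; the case path is a placeholder.
    import glob
    import os

    import nibabel as nib
    import numpy as np

    case_dir = "/path/to/assembled_case"
    mask_paths = sorted(glob.glob(os.path.join(case_dir, "segmentations", "*.nii.gz")))

    reference = nib.load(mask_paths[0])
    combined = np.zeros(reference.shape, dtype=np.uint8)

    for index, path in enumerate(mask_paths, start=1):
        combined[nib.load(path).get_fdata() > 0] = index  # later masks win on overlap

    nib.save(nib.Nifti1Image(combined, reference.affine),
             os.path.join(case_dir, "combined_from_masks.nii.gz"))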

5. [Optional] Active Learning

If you want to perform the active learning process, follow the active learning instructions to generate the attention maps for human annotators.

Figure. Illustration of an attention map.

TODO

  • Release pre-trained AI model checkpoints (U-Net and Swin UNETR)
  • Release the AbdomenAtlas-8K dataset (we commit to releasing 3,410 of the 8,448 CT volumes)
  • Support more data formats (e.g., dicom)

Citation

@article{li2024abdomenatlas,
  title={AbdomenAtlas: A large-scale, detailed-annotated, \& multi-center dataset for efficient transfer learning and open algorithmic benchmarking},
  author={Li, Wenxuan and Qu, Chongyu and Chen, Xiaoxi and Bassi, Pedro RAS and Shi, Yijia and Lai, Yuxiang and Yu, Qian and Xue, Huimin and Chen, Yixiong and Lin, Xiaorui and others},
  journal={Medical Image Analysis},
  pages={103285},
  year={2024},
  publisher={Elsevier},
  url={https://github.com/MrGiovanni/AbdomenAtlas}
}

@article{qu2023abdomenatlas,
  title={Abdomenatlas-8k: Annotating 8,000 CT volumes for multi-organ segmentation in three weeks},
  author={Qu, Chongyu and Zhang, Tiezheng and Qiao, Hualin and Tang, Yucheng and Yuille, Alan L and Zhou, Zongwei and others},
  journal={Advances in Neural Information Processing Systems},
  volume={36},
  year={2023}
}

@inproceedings{li2024well,
  title={How Well Do Supervised Models Transfer to 3D Image Segmentation?},
  author={Li, Wenxuan and Yuille, Alan and Zhou, Zongwei},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024}
}

Acknowledgements

This work was supported by the Lustgarten Foundation for Pancreatic Cancer Research and partially by the Patrick J. McGovern Foundation Award. We appreciate the effort of the MONAI Team to provide open-source code for the community.


abdomenatlas's Issues

Composition of the dataset

Hi! Thank you for your awesome work. Could you please disclose the composition of the dataset? It intersects with many public datasets, and users may need to consider the source of the data when using it.

Error while loading state_dict from swinunetr

I am getting the following error while loading the checkpoint:

RuntimeError: Error(s) in loading state_dict for Universal_model:
Unexpected key(s) in state_dict: "swinViT.patch_embed.proj.weight", "swinViT.patch_embed.proj.bias", "swinViT.layers1.0.blocks.0.norm1.weight", "swinViT.layers1.0.blocks.0.norm1.bias", "swinViT.layers1.0.blocks.0.attn.relative_position_bias_table", "swinViT.layers1.0.blocks.0.attn.relative_position_index", "swinViT.layers1.0.blocks.0.attn.qkv.weight", "swinViT.layers1.0.blocks.0.attn.qkv.bias", "swinViT.layers1.0.blocks.0.attn.proj.weight", "swinViT.layers1.0.blocks.0.attn.proj.bias", "swinViT.layers1.0.blocks.0.norm2.weight", "swinViT.layers1.0.blocks.0.norm2.bias", "swinViT.layers1.0.blocks.0.mlp.linear1.weight", "swinViT.layers1.0.blocks.0.mlp.linear1.bias", "swinViT.layers1.0.blocks.0.mlp.linear2.weight", "swinViT.layers1.0.blocks.0.mlp.linear2.bias", "swinViT.layers1.0.blocks.1.norm1.weight", "swinViT.layers1.0.blocks.1.norm1.bias", "swinViT.layers1.0.blocks.1.attn.relative_position_bias_table", "swinViT.layers1.0.blocks.1.attn.relative_position_index", "swinViT.layers1.0.blocks.1.attn.qkv.weight", "swinViT.layers1.0.blocks.1.attn.qkv.bias", "swinViT.layers1.0.blocks.1.attn.proj.weight", "swinViT.layers1.0.blocks.1.attn.proj.bias", "swinViT.layers1.0.blocks.1.norm2.weight", "swinViT.layers1.0.blocks.1.norm2.bias", "swinViT.layers1.0.blocks.1.mlp.linear1.weight", "swinViT.layers1.0.blocks.1.mlp.linear1.bias", "swinViT.layers1.0.blocks.1.mlp.linear2.weight", "swinViT.layers1.0.blocks.1.mlp.linear2.bias", "swinViT.layers1.0.downsample.reduction.weight", "swinViT.layers1.0.downsample.norm.weight", "swinViT.layers1.0.downsample.norm.bias", "swinViT.layers2.0.blocks.0.norm1.weight", "swinViT.layers2.0.blocks.0.norm1.bias", "swinViT.layers2.0.blocks.0.attn.relative_position_bias_table", "swinViT.layers2.0.blocks.0.attn.relative_position_index", "swinViT.layers2.0.blocks.0.attn.qkv.weight", "swinViT.layers2.0.blocks.0.attn.qkv.bias", "swinViT.layers2.0.blocks.0.attn.proj.weight", "swinViT.layers2.0.blocks.0.attn.proj.bias", "swinViT.layers2.0.blocks.0.norm2.weight", "swinViT.layers2.0.blocks.0.norm2.bias", "swinViT.layers2.0.blocks.0.mlp.linear1.weight", "swinViT.layers2.0.blocks.0.mlp.linear1.bias", "swinViT.layers2.0.blocks.0.mlp.linear2.weight", "swinViT.layers2.0.blocks.0.mlp.linear2.bias", "swinViT.layers2.0.blocks.1.norm1.weight", "swinViT.layers2.0.blocks.1.norm1.bias", "swinViT.layers2.0.blocks.1.attn.relative_position_bias_table", "swinViT.layers2.0.blocks.1.attn.relative_position_index", "swinViT.layers2.0.blocks.1.attn.qkv.weight", "swinViT.layers2.0.blocks.1.attn.qkv.bias", "swinViT.layers2.0.blocks.1.attn.proj.weight", "swinViT.layers2.0.blocks.1.attn.proj.bias", "swinViT.layers2.0.blocks.1.norm2.weight", "swinViT.layers2.0.blocks.1.norm2.bias", "swinViT.layers2.0.blocks.1.mlp.linear1.weight", "swinViT.layers2.0.blocks.1.mlp.linear1.bias", "swinViT.layers2.0.blocks.1.mlp.linear2.weight", "swinViT.layers2.0.blocks.1.mlp.linear2.bias", "swinViT.layers2.0.downsample.reduction.weight", "swinViT.layers2.0.downsample.norm.weight", "swinViT.layers2.0.downsample.norm.bias", "swinViT.layers3.0.blocks.0.norm1.weight", "swinViT.layers3.0.blocks.0.norm1.bias", "swinViT.layers3.0.blocks.0.attn.relative_position_bias_table", "swinViT.layers3.0.blocks.0.attn.relative_position_index", "swinViT.layers3.0.blocks.0.attn.qkv.weight", "swinViT.layers3.0.blocks.0.attn.qkv.bias", "swinViT.layers3.0.blocks.0.attn.proj.weight", "swinViT.layers3.0.blocks.0.attn.proj.bias", "swinViT.layers3.0.blocks.0.norm2.weight", "swinViT.layers3.0.blocks.0.norm2.bias", "swinViT.layers3.0.blocks.0.mlp.linear1.weight", 
"swinViT.layers3.0.blocks.0.mlp.linear1.bias", "swinViT.layers3.0.blocks.0.mlp.linear2.weight", "swinViT.layers3.0.blocks.0.mlp.linear2.bias", "swinViT.layers3.0.blocks.1.norm1.weight", "swinViT.layers3.0.blocks.1.norm1.bias", "swinViT.layers3.0.blocks.1.attn.relative_position_bias_table", "swinViT.layers3.0.blocks.1.attn.relative_position_index", "swinViT.layers3.0.blocks.1.attn.qkv.weight", "swinViT.layers3.0.blocks.1.attn.qkv.bias", "swinViT.layers3.0.blocks.1.attn.proj.weight", "swinViT.layers3.0.blocks.1.attn.proj.bias", "swinViT.layers3.0.blocks.1.norm2.weight", "swinViT.layers3.0.blocks.1.norm2.bias", "swinViT.layers3.0.blocks.1.mlp.linear1.weight", "swinViT.layers3.0.blocks.1.mlp.linear1.bias", "swinViT.layers3.0.blocks.1.mlp.linear2.weight", "swinViT.layers3.0.blocks.1.mlp.linear2.bias", "swinViT.layers3.0.downsample.reduction.weight", "swinViT.layers3.0.downsample.norm.weight", "swinViT.layers3.0.downsample.norm.bias", "swinViT.layers4.0.blocks.0.norm1.weight", "swinViT.layers4.0.blocks.0.norm1.bias", "swinViT.layers4.0.blocks.0.attn.relative_position_bias_table", "swinViT.layers4.0.blocks.0.attn.relative_position_index", "swinViT.layers4.0.blocks.0.attn.qkv.weight", "swinViT.layers4.0.blocks.0.attn.qkv.bias", "swinViT.layers4.0.blocks.0.attn.proj.weight", "swinViT.layers4.0.blocks.0.attn.proj.bias", "swinViT.layers4.0.blocks.0.norm2.weight", "swinViT.layers4.0.blocks.0.norm2.bias", "swinViT.layers4.0.blocks.0.mlp.linear1.weight", "swinViT.layers4.0.blocks.0.mlp.linear1.bias", "swinViT.layers4.0.blocks.0.mlp.linear2.weight", "swinViT.layers4.0.blocks.0.mlp.linear2.bias", "swinViT.layers4.0.blocks.1.norm1.weight", "swinViT.layers4.0.blocks.1.norm1.bias", "swinViT.layers4.0.blocks.1.attn.relative_position_bias_table", "swinViT.layers4.0.blocks.1.attn.relative_position_index", "swinViT.layers4.0.blocks.1.attn.qkv.weight", "swinViT.layers4.0.blocks.1.attn.qkv.bias", "swinViT.layers4.0.blocks.1.attn.proj.weight", "swinViT.layers4.0.blocks.1.attn.proj.bias", "swinViT.layers4.0.blocks.1.norm2.weight", "swinViT.layers4.0.blocks.1.norm2.bias", "swinViT.layers4.0.blocks.1.mlp.linear1.weight", "swinViT.layers4.0.blocks.1.mlp.linear1.bias", "swinViT.layers4.0.blocks.1.mlp.linear2.weight", "swinViT.layers4.0.blocks.1.mlp.linear2.bias", "swinViT.layers4.0.downsample.reduction.weight", "swinViT.layers4.0.downsample.norm.weight", "swinViT.layers4.0.downsample.norm.bias", "encoder1.layer.conv1.conv.weight", "encoder1.layer.conv2.conv.weight", "encoder1.layer.conv3.conv.weight", "encoder2.layer.conv1.conv.weight", "encoder2.layer.conv2.conv.weight", "encoder3.layer.conv1.conv.weight", "encoder3.layer.conv2.conv.weight", "encoder4.layer.conv1.conv.weight", "encoder4.layer.conv2.conv.weight", "encoder10.layer.conv1.conv.weight", "encoder10.layer.conv2.conv.weight", "decoder5.transp_conv.conv.weight", "decoder5.conv_block.conv1.conv.weight", "decoder5.conv_block.conv2.conv.weight", "decoder5.conv_block.conv3.conv.weight", "decoder4.transp_conv.conv.weight", "decoder4.conv_block.conv1.conv.weight", "decoder4.conv_block.conv2.conv.weight", "decoder4.conv_block.conv3.conv.weight", "decoder3.transp_conv.conv.weight", "decoder3.conv_block.conv1.conv.weight", "decoder3.conv_block.conv2.conv.weight", "decoder3.conv_block.conv3.conv.weight", "decoder2.transp_conv.conv.weight", "decoder2.conv_block.conv1.conv.weight", "decoder2.conv_block.conv2.conv.weight", "decoder2.conv_block.conv3.conv.weight", "decoder1.transp_conv.conv.weight", "decoder1.conv_block.conv1.conv.weight", 
"decoder1.conv_block.conv2.conv.weight", "decoder1.conv_block.conv3.conv.weight".

About nnUnet pre-trained model

Hi, thanks for sharing! I noticed that you provided two pre-trained models for download. Will a pre-trained nnUnet model also be provided? I saw in your article that you used three models, including nnUnet.

Dataset Release

Are you going to release the datasets, or do we have to run the code to obtain the annotations? I am a little confused. You mentioned in the FAQ that annotations for ~5,000 volumes will be released, but I don't see any link where I can download them.

Dataset

Nice work. When will you release the dataset?

The shape of the generated label does not match the original image.

I encountered a minor issue when running your code: the shape of the generated label does not match the original image. Here are the details:
I used the epoch_450.pth file provided by you (seems like a beta version :>), and tested it on a private dataset.

The shape of my original image is [64, 368, 576], while the shape of the generated label is [127, 171, 267].
They appear like this in 3D Slicer (screenshot omitted).

I'm wondering whether this is due to something I did or whether it's a bug.


In addition, regarding lines 233-241 in test.py:

    #Load pre-trained weights
    store_dict = model.state_dict()
    checkpoint = torch.load(args.resume)
    load_dict = checkpoint['net']
    # args.epoch = checkpoint['epoch']

    for key, value in load_dict.items():
        name = '.'.join(key.split('.')[1:])
        store_dict[name] = value

The keys in store_dict have an additional 'backbone.' prefix, while the keys in load_dict have an additional 'module.' prefix (due to the use of nn.DataParallel during training) and no 'backbone.' prefix.

In this case, the above code may not work properly.

You can directly use:

    # Load pre-trained weights by position: copy the i-th checkpoint tensor
    # into the i-th entry of the model's state_dict. This assumes both dicts
    # enumerate the parameters in the same order.
    store_dict = model.state_dict()
    store_dict_keys = [key for key, value in store_dict.items()]
    checkpoint = torch.load(args.resume)
    load_dict = checkpoint['net']
    load_dict_value = [value for key, value in load_dict.items()]
    # args.epoch = checkpoint['epoch']

    for i in range(len(store_dict)):
        store_dict[store_dict_keys[i]] = load_dict_value[i]
