
Introduction

This repository extends Benchmark Transformers to segmentation tasks. For example, to fine-tune an Ark pre-trained Swin-Base backbone with a UPerNet head on the JSRT Clavicle segmentation task, run:

python main_segmentation.py --data_set JSRTClavicle \
--data_dir [PATH_TO_DATASET]/JSRT/All247images/ \
--train_list dataset/jsrt/train.txt \
--val_list dataset/jsrt/val.txt \
--test_list dataset/jsrt/test.txt \
--learning_rate 0.05 --epochs 500  --batch_size 32 --patience 50 \
--arch upernet_swin  --init ark \
--pretrained_weights [PATH_TO_MODEL]/ark6_teacher_ep200_swinb_projector1376_mlp.pth.tar 
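Segmentation performance on targets such as JSRT Clavicle is commonly reported as the Dice coefficient. The snippet below is a minimal NumPy sketch of that metric for binary masks; it is illustrative only, and the function name and smoothing constant are not taken from the released code.

import numpy as np

def dice_coefficient(pred, target, smooth=1e-6):
    # Dice score between two binary masks of the same shape.
    # The smoothing constant avoids division by zero when both masks are empty.
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + smooth) / (pred.sum() + target.sum() + smooth)

# Example with random 512x512 masks (real masks come from the model and the dataset)
rng = np.random.default_rng(0)
print(dice_coefficient(rng.integers(0, 2, (512, 512)), rng.integers(0, 2, (512, 512))))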

Benchmarking and Boosting Transformers for Medical Image Classification

We benchmark how well existing transformer variants that use various (supervised and self-supervised) pre-training methods perform against CNNs on a variety of medical classification tasks. Furthermore, given the data-hungry nature of transformers and the annotation-deficiency challenge of medical imaging, we present a practical approach for bridging the domain gap between photographic and medical images by utilizing unlabeled large-scale in-domain data.


Publication

Benchmarking and Boosting Transformers for Medical Image Classification
DongAo Ma1, Mohammad Reza Hosseinzadeh Taher1, Jiaxuan Pang1, Nahid Ul Islam1, Fatemeh Haghighi1, Michael B. Gotway2, Jianming Liang1
1 Arizona State University, 2 Mayo Clinic

International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2022); Domain Adaptation and Representation Transfer (DART)

Paper (PDF, Supplementary material) | Code | Poster | Slides | Presentation (YouTube, BiliBili)

Major results from our work

  1. Pre-training is more vital for transformer-based models than for CNNs in medical imaging.

[Figure: Result 1]

In medical imaging, good initialization is more vital for transformer-based models than for CNNs. When trained from scratch, transformers perform significantly worse than CNNs on all target tasks. However, with supervised or self-supervised pre-training on ImageNet, transformers can offer results comparable to CNNs, highlighting the importance of pre-training when using transformers for medical imaging tasks. We conduct statistical analysis between the best of six pre-trained transformer models and the best of three pre-trained CNN models.


  2. Self-supervised learning based on masked image modeling is a preferable option to supervised baselines for medical imaging.

[Figure: Result 2]

The self-supervised SimMIM model with the Swin-B backbone outperforms fully-supervised baselines. The best methods are bolded while the second best are underlined. For every target task, we conduct statistical analysis between the best (bolded) method and the others. Green-highlighted boxes indicate no statistically significant difference at the p = 0.05 level.


  3. Self-supervised domain-adaptive pre-training on a larger-scale domain-specific dataset better bridges the domain gap between photographic and medical imaging.

[Figure: Result 3]

The domain-adapted pre-trained model, which utilizes a large amount of in-domain data (X-rays(926K)) in an SSL manner, achieves the best performance across all five target tasks. The best methods are bolded while the second best are underlined. For each target task, we conducted an independent two-sample t-test between the best (bolded) method and the others. Green-highlighted boxes indicate no statistically significant difference at the p = 0.05 level.

*X-rays(926K): To see which datasets are used for domain-adaptive pre-training, please refer to the Supplementary material.
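The statistical comparisons above are independent two-sample t-tests at the p = 0.05 level over per-run scores. The SciPy sketch below illustrates such a comparison; the AUC values are made up for illustration and are not results from the paper.

import numpy as np
from scipy import stats

# Hypothetical per-run mean AUCs for two pre-trained models (illustrative values only)
best_model_auc = np.array([0.820, 0.818, 0.821, 0.819, 0.822, 0.818, 0.820, 0.819, 0.821, 0.817])
other_model_auc = np.array([0.814, 0.812, 0.815, 0.813, 0.811, 0.812, 0.814, 0.810, 0.813, 0.812])

# Independent two-sample t-test; p >= 0.05 corresponds to the
# green-highlighted (no significant difference) boxes in the tables above.
t_stat, p_value = stats.ttest_ind(best_model_auc, other_model_auc)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")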


Requirements

Pre-trained models

You can download the pre-trained models used or developed in our paper as follows:

Category                Backbone   Training Dataset          Training Objective   Model
Domain-adapted models   Swin-Base  ImageNet → X-rays(926K)   SimMIM → SimMIM      download
                        Swin-Base  ImageNet → ChestX-ray14   SimMIM → SimMIM      download
In-domain models        Swin-Base  X-rays(926K)              SimMIM               download
                        Swin-Base  ChestX-ray14              SimMIM               download
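
After downloading, a checkpoint can be inspected as an ordinary PyTorch file before fine-tuning. The sketch below makes an assumption about the checkpoint layout (the weights may sit under a 'model' or 'state_dict' key); adjust the file name to the checkpoint you downloaded.

import torch

# Load a downloaded checkpoint on CPU and list its top-level keys.
ckpt = torch.load("simmim_swinb_ImageNet_Xray926k.pth", map_location="cpu")
print(list(ckpt.keys()))

# Assumption: weights are stored under a 'model' key; otherwise fall back
# to treating the whole checkpoint as the state dict.
state_dict = ckpt.get("model", ckpt)
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))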

Fine-tuning of pre-trained models on target tasks

  1. Download the desired pre-trained model.
  2. Download the desired dataset; you can simply add any other dataset that you wish.
  3. Run the following command with the desired parameters. For example, to fine-tune our pre-trained ImageNet → X-rays(926K) model on ChestX-ray14, run:
python main_classification.py --data_set ChestXray14  \
--model swin_base \
--init simmim \
--pretrained_weights [PATH_TO_MODEL]/simmim_swinb_ImageNet_Xray926k.pth \
--data_dir [PATH_TO_DATASET] \
--train_list dataset/Xray14_train_official.txt \
--val_list dataset/Xray14_val_official.txt \
--test_list dataset/Xray14_test_official.txt \
--lr 0.01 --opt sgd --epochs 200 --warmup-epochs 0 --batch_size 64

Or, to evaluate the officially released ImageNet models from timm on ChestX-ray14, run:

python main_classification.py --data_set ChestXray14  \
--model vit_base \
--init imagenet_21k \
--data_dir [PATH_TO_DATASET] \
--train_list dataset/Xray14_train_official.txt \
--val_list dataset/Xray14_val_official.txt \
--test_list dataset/Xray14_test_official.txt \
--lr 0.1 --opt sgd --epochs 200 --warmup-epochs 20 --batch_size 64
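
The ImageNet-21k baseline above obtains its pre-trained ViT-B/16 weights from timm. The sketch below shows how such a model could be instantiated for the 14-label ChestX-ray14 task; the timm model name varies across timm releases, so treat it as an assumption rather than the exact call used in this repository.

import timm
import torch.nn as nn

# ViT-B/16 with ImageNet-21k weights and a 14-way head for ChestX-ray14
# (model name is an assumption; newer timm releases use
# 'vit_base_patch16_224.augreg_in21k').
model = timm.create_model("vit_base_patch16_224_in21k", pretrained=True, num_classes=14)

# ChestX-ray14 is multi-label, so each of the 14 findings gets its own sigmoid.
criterion = nn.BCEWithLogitsLoss()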

Citation

If you use this code or use our pre-trained weights for your research, please cite our paper:

@inproceedings{Ma2022Benchmarking,
    title="Benchmarking and Boosting Transformers for Medical Image Classification",
    author="Ma, DongAo and Hosseinzadeh Taher, Mohammad Reza and Pang, Jiaxuan and Islam, Nahid UI and Haghighi, Fatemeh and Gotway, Michael B and Liang, Jianming",
    booktitle="Domain Adaptation and Representation Transfer",
    year="2022",
    publisher="Springer Nature Switzerland",
    address="Cham",
    pages="12--22",
    isbn="978-3-031-16852-9"
}

Acknowledgement

This research has been supported in part by ASU and Mayo Clinic through a Seed Grant and an Innovation Grant, and in part by the NIH under Award Number R01HL128785. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. This work has utilized the GPUs provided in part by the ASU Research Computing and in part by the Extreme Science and Engineering Discovery Environment (XSEDE) funded by the National Science Foundation (NSF) under grant numbers: ACI-1548562, ACI-1928147, and ACI-2005632. The content of this paper is covered by patents pending.

License

Released under the ASU GitHub Project License.
