rmaphoh / RETFound_MAE
RETFound - A foundation model for retinal images
License: Other
Hi, thank you very much for your meaningful work.
I don't want to load the pre-trained weights; I'd rather train the model directly from scratch.
How should I do this in the code?
Hi, this is an excellent piece of work. The paper demonstrates using RELPROP to visualize salient regions. I had issues with the visualization effect during my reproduction attempt and am unsure where the problem lies. Could you provide a demo?
Can you release your pre-trained best checkpoints of systemic diseases prediction? We want to test it on our own datasets.
Thank you
I couldn't find the file './RETFound_cfp.pth' and would like to know how to download the checkpoint.
(head): Linear(in_features=1024, out_features=5, bias=True)
(fc_norm): LayerNorm((1024,), eps=1e-06, elementwise_affine=True)
)
[23:45:32.000215] number of params (M): 303.31
[23:45:32.000232] base lr: 1.00e-03
[23:45:32.000243] actual lr: 2.50e-04
[23:45:32.000252] accumulate grad iterations: 1
[23:45:32.000260] effective batch size: 64
[23:45:32.003118] criterion = LabelSmoothingCrossEntropy()
Traceback (most recent call last):
File "main_finetune.py", line 377, in <module>
main(args)
File "main_finetune.py", line 315, in main
misc.load_model(args=args, model_without_ddp=model_without_ddp, optimizer=optimizer, loss_scaler=loss_scaler)
File "/home/ak/Desktop/test/RETFound_MAE/util/misc.py", line 316, in load_model
model_without_ddp.load_state_dict(checkpoint['model'])
File "/home/ak/anaconda3/envs/retfound/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1224, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for VisionTransformer:
Missing key(s) in state_dict: "head.weight", "head.bias", "fc_norm.weight", "fc_norm.bias".
Unexpected key(s) in state_dict: "mask_token", "decoder_pos_embed", "norm.weight", "norm.bias", "decoder_embed.weight", "decoder_embed.bias", "decoder_blocks.0.norm1.weight", "decoder_blocks.0.norm1.bias", "decoder_blocks.0.attn.qkv.weight", "decoder_blocks.0.attn.qkv.bias", "decoder_blocks.0.attn.proj.weight", "decoder_blocks.0.attn.proj.bias", "decoder_blocks.0.norm2.weight", "decoder_blocks.0.norm2.bias", "decoder_blocks.0.mlp.fc1.weight", "decoder_blocks.0.mlp.fc1.bias", "decoder_blocks.0.mlp.fc2.weight", "decoder_blocks.0.mlp.fc2.bias", "decoder_blocks.1.norm1.weight", "decoder_blocks.1.norm1.bias", "decoder_blocks.1.attn.qkv.weight", "decoder_blocks.1.attn.qkv.bias", "decoder_blocks.1.attn.proj.weight", "decoder_blocks.1.attn.proj.bias", "decoder_blocks.1.norm2.weight", "decoder_blocks.1.norm2.bias", "decoder_blocks.1.mlp.fc1.weight", "decoder_blocks.1.mlp.fc1.bias", "decoder_blocks.1.mlp.fc2.weight", "decoder_blocks.1.mlp.fc2.bias", "decoder_blocks.2.norm1.weight", "decoder_blocks.2.norm1.bias", "decoder_blocks.2.attn.qkv.weight", "decoder_blocks.2.attn.qkv.bias", "decoder_blocks.2.attn.proj.weight", "decoder_blocks.2.attn.proj.bias", "decoder_blocks.2.norm2.weight", "decoder_blocks.2.norm2.bias", "decoder_blocks.2.mlp.fc1.weight", "decoder_blocks.2.mlp.fc1.bias", "decoder_blocks.2.mlp.fc2.weight", "decoder_blocks.2.mlp.fc2.bias", "decoder_blocks.3.norm1.weight", "decoder_blocks.3.norm1.bias", "decoder_blocks.3.attn.qkv.weight", "decoder_blocks.3.attn.qkv.bias", "decoder_blocks.3.attn.proj.weight", "decoder_blocks.3.attn.proj.bias", "decoder_blocks.3.norm2.weight", "decoder_blocks.3.norm2.bias", "decoder_blocks.3.mlp.fc1.weight", "decoder_blocks.3.mlp.fc1.bias", "decoder_blocks.3.mlp.fc2.weight", "decoder_blocks.3.mlp.fc2.bias", "decoder_blocks.4.norm1.weight", "decoder_blocks.4.norm1.bias", "decoder_blocks.4.attn.qkv.weight", "decoder_blocks.4.attn.qkv.bias", "decoder_blocks.4.attn.proj.weight", "decoder_blocks.4.attn.proj.bias", 
"decoder_blocks.4.norm2.weight", "decoder_blocks.4.norm2.bias", "decoder_blocks.4.mlp.fc1.weight", "decoder_blocks.4.mlp.fc1.bias", "decoder_blocks.4.mlp.fc2.weight", "decoder_blocks.4.mlp.fc2.bias", "decoder_blocks.5.norm1.weight", "decoder_blocks.5.norm1.bias", "decoder_blocks.5.attn.qkv.weight", "decoder_blocks.5.attn.qkv.bias", "decoder_blocks.5.attn.proj.weight", "decoder_blocks.5.attn.proj.bias", "decoder_blocks.5.norm2.weight", "decoder_blocks.5.norm2.bias", "decoder_blocks.5.mlp.fc1.weight", "decoder_blocks.5.mlp.fc1.bias", "decoder_blocks.5.mlp.fc2.weight", "decoder_blocks.5.mlp.fc2.bias", "decoder_blocks.6.norm1.weight", "decoder_blocks.6.norm1.bias", "decoder_blocks.6.attn.qkv.weight", "decoder_blocks.6.attn.qkv.bias", "decoder_blocks.6.attn.proj.weight", "decoder_blocks.6.attn.proj.bias", "decoder_blocks.6.norm2.weight", "decoder_blocks.6.norm2.bias", "decoder_blocks.6.mlp.fc1.weight", "decoder_blocks.6.mlp.fc1.bias", "decoder_blocks.6.mlp.fc2.weight", "decoder_blocks.6.mlp.fc2.bias", "decoder_blocks.7.norm1.weight", "decoder_blocks.7.norm1.bias", "decoder_blocks.7.attn.qkv.weight", "decoder_blocks.7.attn.qkv.bias", "decoder_blocks.7.attn.proj.weight", "decoder_blocks.7.attn.proj.bias", "decoder_blocks.7.norm2.weight", "decoder_blocks.7.norm2.bias", "decoder_blocks.7.mlp.fc1.weight", "decoder_blocks.7.mlp.fc1.bias", "decoder_blocks.7.mlp.fc2.weight", "decoder_blocks.7.mlp.fc2.bias", "decoder_norm.weight", "decoder_norm.bias", "decoder_pred.weight", "decoder_pred.bias".
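For anyone hitting the same error: the missing/unexpected keys mean a raw MAE pre-training checkpoint is being loaded into the fine-tuning ViT, which has no decoder but does have a new head and fc_norm. A minimal sketch of a tolerant load (the tiny model and keys here are stand-ins, not the repo's exact code):

```python
import torch
import torch.nn as nn

# Sketch: drop decoder-only keys from an MAE checkpoint and load non-strictly,
# so the fine-tuning head/fc_norm stay randomly initialized.
class TinyViT(nn.Module):
    def __init__(self, dim=8, num_classes=5):
        super().__init__()
        self.blocks = nn.Linear(dim, dim)   # stands in for the encoder blocks
        self.fc_norm = nn.LayerNorm(dim)    # exists only in the fine-tune model
        self.head = nn.Linear(dim, num_classes)

model = TinyViT()
# Pretend this came from torch.load("RETFound_cfp_weights.pth")["model"]:
checkpoint = {
    "blocks.weight": torch.zeros(8, 8),
    "blocks.bias": torch.zeros(8),
    "mask_token": torch.zeros(1, 1, 8),            # MAE-only key
    "decoder_pred.weight": torch.zeros(64, 8),     # MAE-only key
}
own = model.state_dict()
# Keep only keys the fine-tuning model actually has, with matching shapes:
state = {k: v for k, v in checkpoint.items()
         if k in own and v.shape == own[k].shape}
msg = model.load_state_dict(state, strict=False)
print(sorted(msg.missing_keys))  # head/fc_norm remain randomly initialized
```

With `strict=False`, the decoder keys are silently ignored and only the encoder weights are transferred, which is the behavior fine-tuning expects.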
Hello, thank you for providing such a great work. I've noticed that the retinal fundus images used have 3 channels. How should the code be adjusted when using OCT images with only 1 channel as input?
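One common workaround (not necessarily the authors' recipe) is to leave the model unchanged and repeat the single OCT channel three times at load time, so the 3-channel pretrained patch embedding can be reused as-is:

```python
import torch

def to_three_channels(x: torch.Tensor) -> torch.Tensor:
    """x: (B, 1, H, W) grayscale batch -> (B, 3, H, W) by channel repetition."""
    return x.repeat(1, 3, 1, 1)

oct_batch = torch.randn(2, 1, 224, 224)   # illustrative OCT batch
rgb_like = to_three_channels(oct_batch)
print(rgb_like.shape)  # torch.Size([2, 3, 224, 224])
```

Many image loaders (e.g. PIL's `convert("RGB")`) do this repetition implicitly, so the code may need no change at all if images are loaded that way.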
I tried to run the visualization, but hit many errors in the "Load a pre-trained MAE model" part.
While fine-tuning the pretrained RETFound_cfp on our dataset, I noticed that the learning rate increases every batch. Why? Is there a practice behind this? Thanks.
My bash command was:
python -m torch.distributed.launch --nproc_per_node=1 --master_port=48798 main_finetune.py --batch_size 4 --world_size 1 --model vit_large_patch16 --epochs 100 --lr 5e-3 --blr 5e-3 --layer_decay 0.65 --weight_decay 0.05 --drop_path 0.1 --nb_classes 2 --data_path ../../chla_fundus/ --task ./finetune_chla/ --finetune ./RETFound_cfp_weights.pth
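For context, MAE-style fine-tuning updates the learning rate every iteration, not every epoch: it warms up linearly for the first warmup epochs before a half-cosine decay, which is why the rate appears to keep rising early in training (the "actual lr" in the log is also the base lr scaled by effective_batch_size/256). A stdlib sketch of such a schedule (parameter defaults are illustrative, not the repo's exact values):

```python
import math

def adjust_learning_rate(epoch: float, base_lr: float,
                         warmup_epochs: float = 10, total_epochs: float = 100,
                         min_lr: float = 1e-6) -> float:
    # `epoch` is fractional (epoch + iteration/len(loader)), so the returned
    # lr changes on every batch: linear warmup, then half-cosine decay.
    if epoch < warmup_epochs:
        return base_lr * epoch / warmup_epochs
    t = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return min_lr + (base_lr - min_lr) * 0.5 * (1.0 + math.cos(math.pi * t))
```

During warmup the rate grows each batch; after warmup it shrinks smoothly toward `min_lr`.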
Regarding the statement "RETFound has been validated in multiple disease detection tasks": how can I use this repo to detect disease?
Hello,
First of all thanks for sharing this great framework.
I was wondering what the best approach is to fine-tune the model from the pretrained weights for a binary classification task. Should we compute gradients for all parameters (requires_grad=True), or should we freeze some parts of the model?
Thanks in advance,
Regards,
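Both options from the question can be sketched: full fine-tuning leaves requires_grad=True everywhere (the repo pairs this with layer-wise lr decay), while linear probing freezes the backbone and trains only the new head. The helper and module names below are illustrative:

```python
import torch.nn as nn

def set_linear_probe(model: nn.Module,
                     trainable_prefixes=("head", "fc_norm")):
    # Freeze every parameter except those under the given prefixes.
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(trainable_prefixes)

# Tiny stand-in for the real ViT:
model = nn.Module()
model.blocks = nn.Linear(8, 8)     # pretend backbone
model.fc_norm = nn.LayerNorm(8)
model.head = nn.Linear(8, 2)       # new binary-classification head

set_linear_probe(model)
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # ['fc_norm.weight', 'fc_norm.bias', 'head.weight', 'head.bias']
```

With small datasets, linear probing is cheaper and less prone to overfitting; full fine-tuning usually wins when enough labeled data is available.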
Thanks for sharing the code/weights of this great work.
I want to compare another model with RETFound's results on the OCTID dataset. Could you share the train/val/test splits used for this dataset, i.e. the exact image filenames per split?
Is there code for DINO, SwAV, and MoCo-v3?
Thank you.
I'm trying to attach a CNN-based classifier/regressor for downstream tasks. To speed things up, I'm simply taking the output of RETFound's fc_norm layer as my input, which is the final hidden state of the transformer. However, I'm not able to obtain reasonable results for simple classifications like sex or diabetes with CNNs or with clustering (UMAP), and I have no idea how to verify whether the hidden-state output makes sense (maybe attention-map visualization?).
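A quick way to sanity-check those features is to capture the fc_norm output with a forward hook and inspect it directly (tiny stand-in model; names are illustrative):

```python
import torch
import torch.nn as nn

# Sketch: grab the fc_norm output with a forward hook to inspect the
# features feeding a downstream classifier.
class TinyViT(nn.Module):
    def __init__(self, dim=8, num_classes=2):
        super().__init__()
        self.blocks = nn.Linear(dim, dim)
        self.fc_norm = nn.LayerNorm(dim)
        self.head = nn.Linear(dim, num_classes)
    def forward(self, x):
        return self.head(self.fc_norm(self.blocks(x)))

model = TinyViT().eval()
features = {}
model.fc_norm.register_forward_hook(
    lambda mod, inp, out: features.update(fc_norm=out.detach()))
with torch.no_grad():
    model(torch.randn(4, 8))
print(features["fc_norm"].shape)  # torch.Size([4, 8])
```

If the captured features have near-zero variance or are identical across different inputs, the pretrained weights were likely not loaded correctly, which would explain poor downstream results.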
Hi there,
The default input size for the model is 224 × 224.
I'm wondering if I can fine-tune the model with an input image size of 384 × 384. How do I modify the code to implement this?
BW
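In case it helps: fine-tuning at 384 × 384 generally requires passing the larger input size and resizing the positional embeddings from the 14 × 14 patch grid to 24 × 24; the repo's interpolate_pos_embed utility does something along these lines. A sketch (shapes assume ViT-L/16):

```python
import torch
import torch.nn.functional as F

def resize_pos_embed(pos_embed: torch.Tensor, old_grid: int, new_grid: int):
    # pos_embed: (1, 1 + old_grid**2, dim). Keep the cls token, bicubically
    # resize the patch tokens to the new grid size.
    cls_token, patch_tokens = pos_embed[:, :1], pos_embed[:, 1:]
    dim = pos_embed.shape[-1]
    patch_tokens = patch_tokens.reshape(1, old_grid, old_grid, dim).permute(0, 3, 1, 2)
    patch_tokens = F.interpolate(patch_tokens, size=(new_grid, new_grid),
                                 mode="bicubic", align_corners=False)
    patch_tokens = patch_tokens.permute(0, 2, 3, 1).reshape(1, new_grid * new_grid, dim)
    return torch.cat([cls_token, patch_tokens], dim=1)

pe = torch.randn(1, 1 + 14 * 14, 1024)      # 224/16 = 14 patches per side
resized = resize_pos_embed(pe, 14, 24)      # 384/16 = 24 patches per side
print(resized.shape)  # torch.Size([1, 577, 1024])
```

Beyond the positional embeddings, the data transforms must also produce 384 × 384 crops (e.g. via an `--input_size 384`-style argument).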
I would really appreciate it if you could open-source the DINO weights; with a contrastive-learning-based approach like DINO, we could explore unsupervised segmentation.
Hi, thank you very much for your meaningful work.
I noticed that you mentioned the preprocessing of CFP in "Data processing and augmentation for SSL".
I would like to know whether this preprocessing is still applied during the fine-tuning stage. Could you share the relevant preprocessing scripts or the processed fine-tuning datasets?
Thank you very much for your help!
Why does running the code with this command fail?
python -m torch.distributed.launch --nproc_per_node=1 --master_port=48798 main_finetune.py --batch_size 16 --world_size 1 --model vit_large_patch16 --epochs 50 --blr 5e-3 --layer_decay 0.65 --weight_decay 0.05 --drop_path 0.2 --nb_classes 5 --data_path ./Task1/ --task ./finetune_IDRiD/ --finetune ./RETFound_cfp_weights.pth --input_size 224
It first prints lots of information, including the parameters and model architecture, and then raises an error:
raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/wuwentao/anaconda3/envs/retfound/bin/python', '-u', 'main_finetune.py', '--local_rank=0', '--batch_size', '16', '--world_size', '1', '--model', 'vit_large_patch16', '--epochs', '50', '--blr', '5e-3', '--layer_decay', '0.65', '--weight_decay', '0.05', '--drop_path', '0.2', '--nb_classes', '5', '--data_path', './Task1/', '--task', './finetune_IDRiD/', '--finetune', './RETFound_cfp_weights.pth', '--input_size', '224']' returned non-zero exit status 1.
I am a novice at coding and am wondering how to use multiple GPUs for fine-tuning.
I have 4 available GPUs and have set "--world_size" to 4, but it seems only 1 GPU is being used.
The Moorfields diabetic image dataset (MEH-MIDAS) and public data total 904,170 CFPs and 736,442 OCTs.
I believe this dataset of 1.6 million samples would have a significant impact if it were publicly available!
Hello, first of all, thank you so much for sharing this valuable and meaningful research.
The evaluation results differ when fine-tuning and evaluating from the "RETFound_cfp_weights.pth" file, compared to using the "checkpoint-best.pth" file saved during fine-tuning.
Is "checkpoint-best.pth" not the best checkpoint saved while fine-tuning from "RETFound_cfp_weights.pth"?
Thanks for your great work. May I ask a question: how do I get the MEH-MIDAS dataset?
Hello~Thank you very much for your meaningful work.
I wanted to reach out and seek your advice regarding an issue I am facing with overfitting in my model. Due to the limited size of my dataset, which consists of only a few hundred examples, I have opted to use transfer learning for modeling. However, after applying fine-tuning using the provided code, I have encountered overfitting with a test accuracy of only around 0.6.
I would greatly appreciate any suggestions or methods you may have to address overfitting in small datasets during the fine-tuning process.
Hi, I'm somewhat new to this field. Is there any possibility you could provide a demo example, e.g. a Colab notebook, that takes the OCT/CFP pretrained model, fine-tunes it on a specific dataset, and then uses it to predict images from that dataset?
Hi, thanks for your excellent work!
In the paper, you provide examples where the model is fine-tuned for downstream classification tasks (either binary diseases or multi-class labels). Is it possible to use the same framework but to fine-tune on continuous labels, for example predicting systolic blood pressure from fundus images? If so, could you please provide guidelines on how this should be done?
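One plausible adaptation (not an official recipe from the authors): keep the backbone, replace the nb_classes-way classification head with a single-output head, and train with a regression loss such as MSE. Dimensions and values below are illustrative:

```python
import torch
import torch.nn as nn

# Sketch: regress a continuous target (e.g. systolic blood pressure)
# from pooled fc_norm features instead of classifying.
head = nn.Linear(1024, 1)                  # replaces the classification head
criterion = nn.MSELoss()                   # replaces the classification loss

features = torch.randn(4, 1024)            # stand-in for pooled features
target = torch.tensor([[118.0], [131.0], [109.0], [126.0]])  # mmHg, illustrative
loss = criterion(head(features), target)
print(loss.shape)  # torch.Size([])
```

Normalizing the target (e.g. z-scoring blood pressure) usually stabilizes training, and metrics would shift from AUROC to MAE/R².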