retfound_mae's Issues

Increasing learning rate

While fine-tuning the pretrained RETFound_cfp on our dataset, I noticed that the learning rate was increasing every batch. Why? Is there an established practice behind this? Thanks

[screenshot: learning-rate curve increasing each batch]

My bash command was:
python -m torch.distributed.launch --nproc_per_node=1 --master_port=48798 main_finetune.py --batch_size 4 --world_size 1 --model vit_large_patch16 --epochs 100 --lr 5e-3 --blr 5e-3 --layer_decay 0.65 --weight_decay 0.05 --drop_path 0.1 --nb_classes 2 --data_path ../../chla_fundus/ --task ./finetune_chla/ --finetune ./RETFound_cfp_weights.pth
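
This is most likely the warmup phase: MAE-style fine-tuning code (which this repository derives from) raises the learning rate linearly every iteration for the first warmup epochs, then decays it with a half-cycle cosine. A minimal sketch of that schedule; names and the fractional-epoch calling convention are illustrative, not copied from the repository:

import math

# Called once per iteration with a fractional epoch,
# e.g. adjust_learning_rate(optimizer, epoch + i / len(data_loader), ...)
def adjust_learning_rate(optimizer, epoch, lr, min_lr, warmup_epochs, total_epochs):
    if epoch < warmup_epochs:
        # Linear warmup: this is the per-batch increase observed above.
        cur_lr = lr * epoch / warmup_epochs
    else:
        # Half-cycle cosine decay down to min_lr.
        progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
        cur_lr = min_lr + (lr - min_lr) * 0.5 * (1.0 + math.cos(math.pi * progress))
    for param_group in optimizer.param_groups:
        # Layer-wise lr decay stores a per-group scale factor.
        param_group["lr"] = cur_lr * param_group.get("lr_scale", 1.0)
    return cur_lr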

Splits for OCTID dataset

Thanks for sharing the code/weights of this great work.
I want to compare another model with RETFound's results on the OCTID dataset. Could you share the splits used for train/val/test on this dataset, i.e., the exact image filenames used per split?

While evaluating, I am getting this type of error

(head): Linear(in_features=1024, out_features=5, bias=True)
(fc_norm): LayerNorm((1024,), eps=1e-06, elementwise_affine=True)
)
[23:45:32.000215] number of params (M): 303.31
[23:45:32.000232] base lr: 1.00e-03
[23:45:32.000243] actual lr: 2.50e-04
[23:45:32.000252] accumulate grad iterations: 1
[23:45:32.000260] effective batch size: 64
[23:45:32.003118] criterion = LabelSmoothingCrossEntropy()
Traceback (most recent call last):
File "main_finetune.py", line 377, in
main(args)
File "main_finetune.py", line 315, in main
misc.load_model(args=args, model_without_ddp=model_without_ddp, optimizer=optimizer, loss_scaler=loss_scaler)
File "/home/ak/Desktop/test/RETFound_MAE/util/misc.py", line 316, in load_model
model_without_ddp.load_state_dict(checkpoint['model'])
File "/home/ak/anaconda3/envs/retfound/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1224, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for VisionTransformer:
Missing key(s) in state_dict: "head.weight", "head.bias", "fc_norm.weight", "fc_norm.bias".
Unexpected key(s) in state_dict: "mask_token", "decoder_pos_embed", "norm.weight", "norm.bias", "decoder_embed.weight", "decoder_embed.bias", "decoder_blocks.0.norm1.weight", "decoder_blocks.0.norm1.bias", "decoder_blocks.0.attn.qkv.weight", "decoder_blocks.0.attn.qkv.bias", "decoder_blocks.0.attn.proj.weight", "decoder_blocks.0.attn.proj.bias", "decoder_blocks.0.norm2.weight", "decoder_blocks.0.norm2.bias", "decoder_blocks.0.mlp.fc1.weight", "decoder_blocks.0.mlp.fc1.bias", "decoder_blocks.0.mlp.fc2.weight", "decoder_blocks.0.mlp.fc2.bias", "decoder_blocks.1.norm1.weight", "decoder_blocks.1.norm1.bias", "decoder_blocks.1.attn.qkv.weight", "decoder_blocks.1.attn.qkv.bias", "decoder_blocks.1.attn.proj.weight", "decoder_blocks.1.attn.proj.bias", "decoder_blocks.1.norm2.weight", "decoder_blocks.1.norm2.bias", "decoder_blocks.1.mlp.fc1.weight", "decoder_blocks.1.mlp.fc1.bias", "decoder_blocks.1.mlp.fc2.weight", "decoder_blocks.1.mlp.fc2.bias", "decoder_blocks.2.norm1.weight", "decoder_blocks.2.norm1.bias", "decoder_blocks.2.attn.qkv.weight", "decoder_blocks.2.attn.qkv.bias", "decoder_blocks.2.attn.proj.weight", "decoder_blocks.2.attn.proj.bias", "decoder_blocks.2.norm2.weight", "decoder_blocks.2.norm2.bias", "decoder_blocks.2.mlp.fc1.weight", "decoder_blocks.2.mlp.fc1.bias", "decoder_blocks.2.mlp.fc2.weight", "decoder_blocks.2.mlp.fc2.bias", "decoder_blocks.3.norm1.weight", "decoder_blocks.3.norm1.bias", "decoder_blocks.3.attn.qkv.weight", "decoder_blocks.3.attn.qkv.bias", "decoder_blocks.3.attn.proj.weight", "decoder_blocks.3.attn.proj.bias", "decoder_blocks.3.norm2.weight", "decoder_blocks.3.norm2.bias", "decoder_blocks.3.mlp.fc1.weight", "decoder_blocks.3.mlp.fc1.bias", "decoder_blocks.3.mlp.fc2.weight", "decoder_blocks.3.mlp.fc2.bias", "decoder_blocks.4.norm1.weight", "decoder_blocks.4.norm1.bias", "decoder_blocks.4.attn.qkv.weight", "decoder_blocks.4.attn.qkv.bias", "decoder_blocks.4.attn.proj.weight", "decoder_blocks.4.attn.proj.bias", "decoder_blocks.4.norm2.weight", "decoder_blocks.4.norm2.bias", "decoder_blocks.4.mlp.fc1.weight", "decoder_blocks.4.mlp.fc1.bias", "decoder_blocks.4.mlp.fc2.weight", "decoder_blocks.4.mlp.fc2.bias", "decoder_blocks.5.norm1.weight", "decoder_blocks.5.norm1.bias", "decoder_blocks.5.attn.qkv.weight", "decoder_blocks.5.attn.qkv.bias", "decoder_blocks.5.attn.proj.weight", "decoder_blocks.5.attn.proj.bias", "decoder_blocks.5.norm2.weight", "decoder_blocks.5.norm2.bias", "decoder_blocks.5.mlp.fc1.weight", "decoder_blocks.5.mlp.fc1.bias", "decoder_blocks.5.mlp.fc2.weight", "decoder_blocks.5.mlp.fc2.bias", "decoder_blocks.6.norm1.weight", "decoder_blocks.6.norm1.bias", "decoder_blocks.6.attn.qkv.weight", "decoder_blocks.6.attn.qkv.bias", "decoder_blocks.6.attn.proj.weight", "decoder_blocks.6.attn.proj.bias", "decoder_blocks.6.norm2.weight", "decoder_blocks.6.norm2.bias", "decoder_blocks.6.mlp.fc1.weight", "decoder_blocks.6.mlp.fc1.bias", "decoder_blocks.6.mlp.fc2.weight", "decoder_blocks.6.mlp.fc2.bias", "decoder_blocks.7.norm1.weight", "decoder_blocks.7.norm1.bias", "decoder_blocks.7.attn.qkv.weight", "decoder_blocks.7.attn.qkv.bias", "decoder_blocks.7.attn.proj.weight", "decoder_blocks.7.attn.proj.bias", "decoder_blocks.7.norm2.weight", "decoder_blocks.7.norm2.bias", "decoder_blocks.7.mlp.fc1.weight", "decoder_blocks.7.mlp.fc1.bias", "decoder_blocks.7.mlp.fc2.weight", "decoder_blocks.7.mlp.fc2.bias", "decoder_norm.weight", "decoder_norm.bias", "decoder_pred.weight", "decoder_pred.bias".
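
Judging by the unexpected decoder_* keys, the checkpoint passed to --resume here is the raw MAE pre-training checkpoint (encoder plus decoder), while misc.load_model performs a strict load into the classification ViT, hence the missing head/fc_norm keys. Pre-trained weights are normally passed via --finetune instead, which in MAE-style code filters mismatched keys and loads non-strictly; --resume is for checkpoints saved during fine-tuning. A hedged sketch of that non-strict loading step (the checkpoint path and the model variable are assumptions):

import torch

checkpoint = torch.load("RETFound_cfp_weights.pth", map_location="cpu")
checkpoint_model = checkpoint["model"]
state_dict = model.state_dict()
# Drop a classification head whose shape does not match nb_classes.
for k in ["head.weight", "head.bias"]:
    if k in checkpoint_model and checkpoint_model[k].shape != state_dict[k].shape:
        del checkpoint_model[k]
# strict=False ignores the decoder_* keys and leaves the new
# head/fc_norm randomly initialised.
msg = model.load_state_dict(checkpoint_model, strict=False)
print(msg.missing_keys)  # expect head.* and fc_norm.* here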

Fine-tuning with high-resolution input

Hi there,

The default input size for the model is 224 × 224.
I'm wondering if I can fine-tune the model with an input image size of 384 × 384. How do I modify the code to implement this?

BW
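
The usual MAE-style recipe is to build the ViT at the new resolution and bicubically resize the checkpoint's positional-embedding grid before loading; the fine-tuning script already exposes an --input_size flag. A hedged sketch, assuming the repository keeps MAE's util/pos_embed.interpolate_pos_embed helper:

import torch
import models_vit
from util.pos_embed import interpolate_pos_embed  # present in MAE-style repos

# Build the model at 384x384; the patch grid grows from 14x14 to 24x24.
model = models_vit.vit_large_patch16(img_size=384, num_classes=2,
                                     drop_path_rate=0.1, global_pool=True)
checkpoint = torch.load("RETFound_cfp_weights.pth", map_location="cpu")
checkpoint_model = checkpoint["model"]
# Resize the 224x224 positional embeddings to the new grid in place.
interpolate_pos_embed(model, checkpoint_model)
model.load_state_dict(checkpoint_model, strict=False)

On the command line this would correspond to adding --input_size 384 when calling main_finetune.py.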

Enquiry for a demo example

Hi, I'm kind of new to this field. Is there any possibility that you could provide a demo example, in a Colab notebook, that takes your OCT/CFP pretrained model, fine-tunes it on a specific dataset, and then uses it to predict images from that dataset?
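
Until an official notebook exists, here is a minimal hedged sketch of the predict step, after fine-tuning has produced a checkpoint-best.pth; the paths, class count and ImageNet normalisation statistics are assumptions:

import torch
from PIL import Image
from torchvision import transforms
import models_vit

# Rebuild the fine-tuned classifier and load its weights.
model = models_vit.vit_large_patch16(num_classes=2, global_pool=True)
ckpt = torch.load("checkpoint-best.pth", map_location="cpu")
model.load_state_dict(ckpt["model"])
model.eval()

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])
img = tfm(Image.open("example.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    probs = torch.softmax(model(img), dim=-1)
print(probs)  # per-class probabilities for the single image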

Pre-processing of fine-tuning datasets

Hi~Thank you very much for your meaningful work.

I noticed that you mentioned the preprocessing of CFP in "Data processing and augmentation for SSL".

I would like to know whether this preprocessing is still applied during the fine-tuning stage. Could you share the preprocessing scripts or the processed fine-tuning datasets?

Thank you very much for your help!
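
For what it's worth, MAE-style fine-tuning applies its augmentation on the fly through timm's create_transform rather than through offline pre-processing. A hedged sketch of what such a training transform typically looks like; the exact RandAugment and random-erasing settings are assumptions:

from timm.data import create_transform
from timm.data.constants import IMAGENET_DEFAULT_MEAN, IMAGENET_DEFAULT_STD

train_transform = create_transform(
    input_size=224,
    is_training=True,
    color_jitter=None,
    auto_augment="rand-m9-mstd0.5-inc1",  # RandAugment policy
    interpolation="bicubic",
    re_prob=0.25,                         # random-erasing probability
    mean=IMAGENET_DEFAULT_MEAN,
    std=IMAGENET_DEFAULT_STD,
)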

Fine-tune for downstream regression task (continuous labels)

Hi, thanks for your excellent work!

In the paper, you provide examples where the model is fine-tuned for downstream classification tasks (binary or multi-class disease labels). Is it possible to use the same framework to fine-tune on continuous labels, for example predicting systolic blood pressure from fundus images? If so, could you please provide guidelines on how this should be done?
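
The framework is not tied to a classification head. A hedged sketch of the minimal changes for regression: a single-output head, MSE loss instead of label-smoothing cross-entropy, and mixup disabled; all names and values below are illustrative:

import torch
import torch.nn as nn
import models_vit

# One continuous output instead of nb_classes logits.
model = models_vit.vit_large_patch16(num_classes=1, global_pool=True)
criterion = nn.MSELoss()

images = torch.randn(4, 3, 224, 224)                          # dummy batch
targets = torch.tensor([[120.0], [135.0], [110.0], [128.0]])  # e.g. systolic BP
loss = criterion(model(images), targets)

Evaluation metrics would change accordingly, e.g. MAE or R² instead of AUROC and accuracy.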

Fine-tuning with a small sample size

Hello~Thank you very much for your meaningful work.
I wanted to reach out and seek your advice regarding an issue I am facing with overfitting in my model. Due to the limited size of my dataset, which consists of only a few hundred examples, I have opted to use transfer learning for modeling. However, after applying fine-tuning using the provided code, I have encountered overfitting with a test accuracy of only around 0.6.

I would greatly appreciate any suggestions or methods you may have to address overfitting in small datasets during the fine-tuning process.
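
One commonly used remedy, sketched below under the assumption that the model is the MAE-style models_vit ViT with global_pool=True, is to freeze most of the encoder so that only the last blocks and the head are trained; combined with early stopping and stronger augmentation, this often helps on datasets of a few hundred images:

# Freeze everything, then re-enable only the last two blocks and the head.
for p in model.parameters():
    p.requires_grad = False
for blk in model.blocks[-2:]:
    for p in blk.parameters():
        p.requires_grad = True
for p in list(model.fc_norm.parameters()) + list(model.head.parameters()):
    p.requires_grad = True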

Fine-tuned result and evaluation result differ

Hello, first of all, thank you so much for sharing this valuable and meaningful research.
The evaluation results differ when fine-tuning and evaluating directly from the "RETFound_cfp_weights.pth" file, compared to evaluating with the "checkpoint-best.pth" file saved during fine-tuning.
Is the "checkpoint-best.pth" file not the best set of parameters saved during the fine-tuning from "RETFound_cfp_weights.pth"?

Input image channel problem

Hello, thank you for providing such great work. I've noticed that the retinal fundus images used have 3 channels. How should the code be adjusted when using OCT images with only 1 channel as input?
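
One low-effort option, sketched below, is to replicate the single OCT channel into three at load time so the pretrained 3-channel patch embedding can be reused unchanged; the filename is an assumption:

from PIL import Image
from torchvision import transforms

to_rgb = transforms.Grayscale(num_output_channels=3)  # 1 channel -> 3 copies
img = to_rgb(Image.open("oct_slice.png"))
# Equivalently: Image.open("oct_slice.png").convert("RGB")

The alternative, rebuilding the patch embedding with in_chans=1, would discard or average the pretrained first-layer weights.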

About an issue when running in the terminal

When I run the code with the command:
python -m torch.distributed.launch --nproc_per_node=1 --master_port=48798 main_finetune.py --batch_size 16 --world_size 1 --model vit_large_patch16 --epochs 50 --blr 5e-3 --layer_decay 0.65 --weight_decay 0.05 --drop_path 0.2 --nb_classes 5 --data_path ./Task1/ --task ./finetune_IDRiD/ --finetune ./RETFound_cfp_weights.pth --input_size 224
it first prints a lot of information, including the parameters and the model architecture.
After that, it raises an error:
raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/wuwentao/anaconda3/envs/retfound/bin/python', '-u', 'main_finetune.py', '--local_rank=0', '--batch_size', '16', '--world_size', '1', '--model', 'vit_large_patch16', '--epochs', '50', '--blr', '5e-3', '--layer_decay', '0.65', '--weight_decay', '0.05', '--drop_path', '0.2', '--nb_classes', '5', '--data_path', './Task1/', '--task', './finetune_IDRiD/', '--finetune', './RETFound_cfp_weights.pth', '--input_size', '224']' returned non-zero exit status 1.
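
This CalledProcessError from the launcher only reports that the worker process exited with status 1; the real exception is printed earlier in the output. One hedged debugging tip, assuming the MAE-style script falls back to non-distributed mode when run directly: launch it without torch.distributed.launch so the underlying traceback is not wrapped:
python main_finetune.py --batch_size 16 --model vit_large_patch16 --epochs 50 --blr 5e-3 --layer_decay 0.65 --weight_decay 0.05 --drop_path 0.2 --nb_classes 5 --data_path ./Task1/ --task ./finetune_IDRiD/ --finetune ./RETFound_cfp_weights.pth --input_size 224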

About the dataset

Thanks for your great work. May I ask a question: how do I get the MEH-MIDAS dataset?

Evaluation problem

code:

import os

import numpy as np
import torch
import torch.backends.cudnn as cudnn

import models_vit
import util.misc as misc
from util.datasets import build_dataset
from engine_finetune import evaluate

class Config:
    # Parameters based on the provided command line
    eval = True
    batch_size = 16
    model = 'vit_large_patch16'
    epochs = 50
    blr = 5e-3
    layer_decay = 0.65
    weight_decay = 0.05
    drop_path = 0.2
    nb_classes = 5
    data_path = './IDRiD_data/'
    task = './internal_IDRiD/'
    resume = './finetune_task/checkpoint-best.pth'
    input_size = 224
    output_dir = './output_dir'
    device = 'cuda'
    seed = 0
    num_workers = 10
    pin_mem = True

def main(cfg):
    print('Job directory:', os.path.dirname(os.path.realpath(__file__)))
    print("Configuration:", vars(cfg))

    device = torch.device(cfg.device)

    # Fix the seed for reproducibility
    seed = cfg.seed
    torch.manual_seed(seed)
    np.random.seed(seed)

    cudnn.benchmark = True

    dataset_test = build_dataset(is_train='test', args=cfg)

    data_loader_test = torch.utils.data.DataLoader(
        dataset_test,
        batch_size=cfg.batch_size,
        num_workers=cfg.num_workers,
        pin_memory=cfg.pin_mem,
        drop_last=False
    )

    model = models_vit.__dict__[cfg.model](
        img_size=cfg.input_size,
        num_classes=cfg.nb_classes,
        drop_path_rate=cfg.drop_path,
        global_pool=True,
    )
    model.to(device)

    # cfg.resume points at the fine-tuned checkpoint to evaluate
    misc.load_model(args=cfg, model_without_ddp=model, optimizer=None, loss_scaler=None)
    print(model)
    test_stats, auc_roc = evaluate(data_loader_test, model, device, cfg.task,
                                   epoch=0, mode='test', num_class=cfg.nb_classes)
    print(f"Test Stats: {test_stats}, AUC-ROC: {auc_roc}")

if __name__ == '__main__':
    main(Config)  # pass the class itself so vars(cfg) lists the settings

I want to test evaluation using the given checkpoint and the IDRiD data, but when I run the code I get:
[screenshot of the error message, 2024-06-12]

How to use multiple GPUs for fine-tuning?

I am a beginner at coding and am wondering how to use multiple GPUs for fine-tuning.
I have 4 available GPUs and have set "--world_size" to 4, but it seems only 1 GPU is being used.
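
With torch.distributed.launch, the number of worker processes (and therefore GPUs used) is controlled by --nproc_per_node; setting --world_size alone does not spawn extra processes. A hedged example for 4 GPUs, with the other flags purely illustrative:
python -m torch.distributed.launch --nproc_per_node=4 --master_port=48798 main_finetune.py --world_size 4 --batch_size 16 --model vit_large_patch16 --epochs 50 --blr 5e-3 --layer_decay 0.65 --weight_decay 0.05 --drop_path 0.2 --nb_classes 5 --data_path ./Task1/ --task ./finetune_task/ --finetune ./RETFound_cfp_weights.pth
Note that the effective batch size then scales with the number of processes, which in MAE-style code also rescales the actual learning rate.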

requires_grad parameters to fine-tune a downstream classification task

Hello,

First of all thanks for sharing this great framework.

I was wondering what the best approach is to fine-tune the model from the pretrained weights for a binary classification task. Should we compute gradients for all parameters (requires_grad=True), or should we freeze some parts of the model?

Thanks in advance,
Regards,
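
Both options are common. Full fine-tuning (all requires_grad=True, usually with layer-wise learning-rate decay, as the --layer_decay flag suggests) tends to give the best accuracy; linear probing, freezing everything except the head, is a cheap and strong baseline. A hedged sketch of the latter, assuming the MAE-style models_vit module with global_pool=True:

import torch

# Train only the head and its fc_norm; everything else stays frozen.
for name, p in model.named_parameters():
    p.requires_grad = name.startswith(("head.", "fc_norm."))
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-3, weight_decay=0.05)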

fine-tuning weights

Hi~Thank you very much for your meaningful work.
I don't want to load the pre-trained weights, but rather train the model directly from scratch.
How should I do this in the code? (●'◡'●)
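
In MAE-style main_finetune.py the pretrained checkpoint is only loaded when --finetune is given, so, assuming RETFound keeps that behaviour, simply omitting the flag should train from random initialisation:
python -m torch.distributed.launch --nproc_per_node=1 main_finetune.py --batch_size 16 --model vit_large_patch16 --epochs 50 --blr 5e-3 --layer_decay 0.65 --weight_decay 0.05 --drop_path 0.2 --nb_classes 5 --data_path ./Task1/ --task ./scratch_run/
Note there is no --finetune flag above; expect to need considerably more data and epochs than fine-tuning.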

Unable to get the model

I couldn't find the file './RETFound_cfp.pth' and would like to know how to download the checkpoint.

Can You Provide a Salient Region Visualization Demo?

Hi, this is an excellent piece of work. The paper demonstrates using RELPROP to visualize salient regions. I had issues with the visualization effect during my reproduction attempt and am unsure where the problem lies. Could you provide a demo?
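
While waiting for an official demo, a much simpler sanity check than RELPROP is to read out the last block's attention map with a forward hook; in the older timm versions these repositories pin, attn_drop receives the softmaxed attention matrix. This is explicitly not the paper's method, just a hedged sketch for eyeballing salient regions:

import torch

attn_maps = []
def grab(module, inputs, output):
    attn_maps.append(output.detach())   # (B, heads, tokens, tokens)

model.eval()  # attn_drop becomes the identity in eval mode
handle = model.blocks[-1].attn.attn_drop.register_forward_hook(grab)
with torch.no_grad():
    model(img)                          # img: (1, 3, 224, 224), normalised
handle.remove()

cls_attn = attn_maps[0][0].mean(0)[0, 1:]  # CLS-to-patch attention, heads averaged
heatmap = cls_attn.reshape(14, 14)         # 224 / 16 = 14x14 patch grid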

Finetuning for the downstream tasks

I'm trying to attach a CNN-based classifier/regressor for downstream tasks. To speed things up, I'm simply taking the output of the RETFound fc_norm layer as my input, which is the final hidden state of the transformer. However, I'm not able to obtain legitimate results for simple classifications like sex or diabetes, whether with CNNs or with clustering (UMAP). And I have no clue how to verify whether the hidden-state output makes sense (maybe attention-map visualization?).
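
As a quick check: with global_pool=True, the MAE-style forward_features should already return the fc_norm-ed, mean-pooled 1024-d vector, i.e. the same input the linear head sees. A hedged sketch, assuming a normalised image batch in images:

import torch

model.eval()
with torch.no_grad():
    feats = model.forward_features(images)  # shape (B, 1024)
# Sanity check: features from a *fine-tuned* checkpoint should cluster by class;
# raw SSL weights often do not separate labels linearly without fine-tuning
# or at least a linear probe, so that is worth ruling out first.

It is also worth confirming that your preprocessing (resize and normalisation statistics) matches what the checkpoint was trained with.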
