sarahESL / PubMedCLIP
Fine-tuning CLIP using the ROCO dataset, which contains image-caption pairs from PubMed articles.
License: MIT License
Hello, me again.
When I ran main.py in PubMedCLIP with vision encoder ViT-B/32, something went wrong:
-------Loading CLIP with vision encoder ViT-B/32 -------
-------Training started-------
Traceback (most recent call last):
File "C:/Users/fhdu/PycharmProjects/PubMedCLIP/main/main.py", line 51, in <module>
train(cfg, train_loader, val_loader, device)
File "C:\Users\fhdu\PycharmProjects\PubMedCLIP\main\train.py", line 65, in train
for i, (image, caption) in enumerate(train_loader):
File "E:\SoftwareFile\Anaconda3\envs\med_vqa\lib\site-packages\torch\utils\data\dataloader.py", line 521, in __next__
data = self._next_data()
File "E:\SoftwareFile\Anaconda3\envs\med_vqa\lib\site-packages\torch\utils\data\dataloader.py", line 561, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "E:\SoftwareFile\Anaconda3\envs\med_vqa\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "E:\SoftwareFile\Anaconda3\envs\med_vqa\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "C:\Users\fhdu\PycharmProjects\PubMedCLIP\lib\dataset\ROCOdataset.py", line 80, in __getitem__
image = self._load_image(index)
File "C:\Users\fhdu\PycharmProjects\PubMedCLIP\lib\dataset\ROCOdataset.py", line 70, in _load_image
image = read_image(path, mode=ImageReadMode.RGB)
File "E:\SoftwareFile\Anaconda3\envs\med_vqa\lib\site-packages\torchvision\io\image.py", line 222, in read_image
data = read_file(path)
File "E:\SoftwareFile\Anaconda3\envs\med_vqa\lib\site-packages\torchvision\io\image.py", line 42, in read_file
data = torch.ops.image.read_file(path)
File "E:\SoftwareFile\Anaconda3\envs\med_vqa\lib\site-packages\torch\_ops.py", line 63, in __getattr__
op = torch._C._jit_get_operation(qualified_op_name)
RuntimeError: No such operator image::read_file
Could the wrong version of torch be causing this issue? My versions are torch 1.10.0 and torchvision 0.11.1, as specified in your requirement.txt.
Looking forward to your answer; thanks in advance.
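The traceback points at torchvision's native image::read_file operator, which goes missing when the installed torch and torchvision builds don't match. One workaround is to have _load_image decode via PIL instead of torchvision.io. This is a sketch, not the repo's code; load_image_rgb is a hypothetical name:

```python
import numpy as np
import torch
from PIL import Image

def load_image_rgb(path):
    """Decode an image file to a (3, H, W) uint8 tensor, mimicking
    read_image(path, mode=ImageReadMode.RGB) without torchvision's
    native image ops."""
    with Image.open(path) as img:
        arr = np.asarray(img.convert("RGB"))  # (H, W, 3) uint8
    return torch.from_numpy(arr).permute(2, 0, 1).contiguous()
```

Alternatively, reinstalling torch and torchvision from the same build matrix (e.g. both CPU-only, or both built against the same CUDA version) usually restores the native operator.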
Hello, thank you so much for your work!
I tried to reproduce your results but so far haven't been able to. I followed the instructions for QCR on the SLAKE dataset, and my results show almost the same overall accuracy for CLIP and PubMedCLIP (using your checkpoints), even slightly lower for PubMedCLIP; both are around 79% overall accuracy. Any advice?
Also, I tried to train my own PubMedCLIP using your code (following the instructions in the PubMedCLIP subdirectory), and it seems to overfit within the first 5 epochs for every model. Was it similar in your experiments? These models also performed similarly to CLIP and to your PubMedCLIP checkpoints on QCR-SLAKE.
Thanks
Hi,
After following the instructions you provided in the 'QCR_PubMedCLIP' folder, we got the following error:
/content/PubMedCLIP/QCR_PubMedCLIP
loading dictionary from ./data/data_rad/dictionary.pkl
loading DAE image data from file: ./data/data_rad/images128x128.pkl
loading CLIP image data from file: ./data/data_rad/images250x250.pkl
loading DAE image data from file: ./data/data_rad/images128x128.pkl
loading CLIP image data from file: ./data/data_rad/images250x250.pkl
Traceback (most recent call last):
File "main/main.py", line 85, in <module>
question_classify.load_state_dict(pretrained_model)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1483, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for classify_model:
size mismatch for w_emb.emb.weight: copying a param with shape torch.Size([1178, 300]) from checkpoint, the shape in current model is torch.Size([1260, 300]).
We found that this bug is caused by the newly generated dictionary: its size differs from the one your saved_model expects. Could you help me with it?
Hello. Thank you for this research.
I have a question about this work. I tried to use PubMedCLIP as a pretrained model for extracting features from medical images and questions, but I couldn't get it to work. Can you help me solve this problem? How can I use this model as a pretrained model?
Thanks
Hi,
Very interesting work.
For some reason, GDrive does not give access to your trained models.
Can you please fix that?
Thanks,
Hi, I have the same problem as #8 (comment), and I cannot solve it by re-running the script.
Traceback (most recent call last):
File "main/main.py", line 85, in <module>
question_classify.load_state_dict(pretrained_model)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for classify_model:
size mismatch for w_emb.emb.weight: copying a param with shape torch.Size([1178, 300]) from checkpoint, the shape in current model is torch.Size([1260, 300]).
In https://github.com/sarahESL/PubMedCLIP/blob/main/QCR_PubMedCLIP/lib/utils/create_dictionary.py, the create_dictionary function uses both the train and test files to build the dictionary (nvocab = 1260). But in the training code, the tf-idf loading module uses only the train set (nvocab = 1178). I suspect the problem comes from the difference between the question set used to create the dictionary and the question set used for the tf-idf calculation. Could you please fix this?
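A toy illustration (not the repo's code) of why the sizes diverge: a vocabulary built from train+test questions is larger than one built from train alone, so an embedding table saved for one cannot be loaded into a model sized for the other.

```python
def build_vocab(questions):
    """Assign each new lowercase token the next integer id."""
    vocab = {}
    for q in questions:
        for tok in q.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab

# Hypothetical question sets, for illustration only
train_qs = ["is there a fracture", "what organ is shown"]
test_qs = ["is the lesion malignant"]

train_vocab = build_vocab(train_qs)           # built from train only
full_vocab = build_vocab(train_qs + test_qs)  # built from train + test
# An embedding of shape (len(train_vocab), 300) cannot be loaded into a
# model built with len(full_vocab) rows: the same size-mismatch error.
```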
This repository is great! I was able to reproduce the results for the SLAKE dataset and the model seems to work well. However, I was wondering about the validation data. It seems like in line 65 in main.py for QCR_PubMedCLIP, the test dataset is used for validation and best model selection. Also in the setup script, it seems like the img2idx dictionary is created using train and test data instead of validation. Is this supposed to be the case?
Hi, do you know how to generate the CSV file?
Hello,
I'm having trouble running the script that generates the input files (dictionaries and pickles). One problem: lib/utils/run.sh runs python create_dictionary.py "../../data/data_slake" instead of /data_rad/. However, after changing the directory to data_rad, more issues pop up, including df columns not matching, so I suspect the script was not fully adapted to generate inputs for both datasets.
It took a while to get everything working, but it would be nice to see a fix.
How do I save a model on my local machine?
Hi there, I previously trained MEVF on my local machine. As MEVF's README.md says:
All data should be downloaded via link. The downloaded file should be extracted to data_RAD/ directory.
So I'd really like to know how run.sh in your repository creates the dataset files and dictionary files (I'd expect it to contain a download URL at least?). This is probably a basic question :)
Thanks in advance for your answer.
How is imgid2idx.json generated?
Where can I find the two files pretrained_ae.pth and pretrained_maml.pth? Thank you for telling me.
Hello,
First off, thanks for this contribution and for making the code public.
I have some questions regarding the validation of your model on VQA-Rad:
First, I wasn’t able to find anything about a validation set. Is the current setup that the model is validated and tested on the same test set?
Second, as I understand it the problem is set up as classification of all the possible answers present in the dataset. However, I noticed that there are many answers in the test set that are not present in the train set. Wouldn’t this mean that it’s impossible (as long as we don’t include any textual embedding of the answers) for the model to predict these answers correctly?
Thanks!
Hi, Thank you for sharing the wonderful work in the medical field. I have one question.
In QCR_PubMedCLIP/lib/utils/create_label.py (line 200):
for answer in open_answers:
    if answer in ans2label:
        ans = answer + "#"
What is the reason for appending the "#" string to the answer for an open answer? (for the Slake dataset)
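One plausible reading (an assumption on my part, not confirmed by the authors) is that the suffix keeps an open-set answer distinct from a closed-set answer with identical text when both feed a single label map. A toy sketch, with build_label_map as a hypothetical name:

```python
def build_label_map(closed_answers, open_answers):
    """Toy sketch (an assumption about intent, not the repo's logic):
    append "#" to an open answer that collides with a closed answer,
    so both survive as distinct labels in a single map."""
    ans2label = {}
    for a in closed_answers:
        ans2label.setdefault(a, len(ans2label))
    for a in open_answers:
        key = a + "#" if a in ans2label else a
        ans2label.setdefault(key, len(ans2label))
    return ans2label

labels = build_label_map(["yes", "no"], ["yes", "left lung"])
# "yes" yields two distinct labels: "yes" (closed) and "yes#" (open)
```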
Where can I find the file embed_tfidf_weights? Thank you for telling me.