train-clip's People

Contributors

bob80333, vincentleemax, zasder3

train-clip's Issues

How to load a provided CLIP pre-trained model in your code?

Hi, thanks for sharing; the code is neat and easy to follow. I have one question regarding fine-tuning a pre-trained CLIP.
I notice that in your train_finetune.py, instead of directly loading a pre-trained CLIP model, you compose two separately defined image and text encoders. If I want to fine-tune a specific pre-trained CLIP model such as "ViT-B/32", how can I properly load its image encoder and text encoder? Thank you for your answer.
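For reference, a minimal hedged sketch of one way to obtain the two towers from OpenAI's released weights, using the openai/CLIP package rather than this repo's train_finetune.py path; whether these towers plug directly into the fine-tuning wrapper is an assumption to verify.

import clip

# Download/load OpenAI's pre-trained weights
model, preprocess = clip.load("ViT-B/32", device="cpu", jit=False)

image_encoder = model.visual   # the vision tower, a torch.nn.Module
# the text tower is exposed via model.encode_text(tokenized_text)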

About the accuracy computation.

Thanks for sharing your code. I am a little bit confused about the accuracy computation at L69-70 in wrapper.py:

acc_i = (torch.argmax(image_logits) == ground_truth).sum()
acc_t = (torch.argmax(image_logits.t()) == ground_truth).sum()

It seems that torch.argmax returns the index of the max value across all dimensions (i.e., over the flattened tensor), while ground_truth is per row or column. Should we change it to the following?

acc_i = (torch.argmax(image_logits, 0) == ground_truth).sum()
acc_t = (torch.argmax(image_logits.t(), 0) == ground_truth).sum()

Thanks.
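For reference, a quick self-contained illustration of the torch.argmax semantics in question (not repo code): without a dim argument it operates on the flattened tensor.

import torch

logits = torch.tensor([[0.9, 0.1],
                       [0.2, 0.8]])
print(torch.argmax(logits))         # tensor(0): index into the flattened tensor
print(torch.argmax(logits, dim=0))  # tensor([0, 1]): per-column argmax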

Fine-tune with a custom dataset

Hi,
I am trying to fine-tune CLIP with a very small custom dataset. I ran the command "python train_finetune.py --folder /home/ionur2/train-CLIP/data --batch_size 2" and I get an assertion error:
[Screenshot from 2021-11-23 12-54-24]
Is anyone else facing the same issue?
I would be glad if you could help me with this!

Type Error while training

File "/home/rishabh/Rishabclip/lib/python3.6/site-packages/transformers/tokenization_utils_base.py", line 2430, in call
"text input must of typestr(single example),List[str] (batch or single pretokenized example) "
ValueError: text input must of type str(single example),List[str](batch or single pretokenized example) orList[List[str]] (batch of pretokenized examples).

Could you tell us which versions of the tokenizer (transformers) and torch are used in your code?

MisconfigurationException: `train_dataloader` must be implemented to be used with the Lightning Trainer

I am trying to train a model using the following command:
python train.py --model_name RN50 --folder ArchDaily --batch_size 512 --accelerator cuda

I get the above error:
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/core/hooks.py", line 485, in train_dataloader
raise MisconfigurationException("train_dataloader must be implemented to be used with the Lightning Trainer")
pytorch_lightning.utilities.exceptions.MisconfigurationException: train_dataloader must be implemented to be used with the Lightning Trainer

I would be grateful for any assistance.

Here is the full message log:

Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/configuration_validator.py:119: PossibleUserWarning: You defined a validation_step but have no val_dataloader. Skipping val loop.
category=PossibleUserWarning,
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Traceback (most recent call last):
File "train.py", line 31, in
main(args)
File "train.py", line 20, in main
trainer.fit(model, dm)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 701, in fit
self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 654, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 741, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1147, in _run
self.strategy.setup(self)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/strategies/single_device.py", line 74, in setup
super().setup(trainer)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/strategies/strategy.py", line 153, in setup
self.setup_optimizers(trainer)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/strategies/strategy.py", line 142, in setup_optimizers
self.lightning_module
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/core/optimizer.py", line 179, in _init_optimizers_and_lr_schedulers
optim_conf = model.trainer._call_lightning_module_hook("configure_optimizers", pl_module=model)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1549, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "/content/train-CLIP/models/wrapper.py", line 146, in configure_optimizers
first_cycle_steps=self.num_training_steps,
File "/content/train-CLIP/models/wrapper.py", line 38, in num_training_steps
dataset = self.train_dataloader()
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/core/hooks.py", line 485, in train_dataloader
raise MisconfigurationException("train_dataloader must be implemented to be used with the Lightning Trainer")
pytorch_lightning.utilities.exceptions.MisconfigurationException: train_dataloader must be implemented to be used with the Lightning Trainer
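For context, the last frames show the cause: wrapper.py's num_training_steps calls self.train_dataloader(), a hook that is not defined on the LightningModule when the data comes from a separate LightningDataModule passed to trainer.fit. A hedged, untested sketch of one possible patch, assuming a Lightning version that attaches the datamodule to the trainer:

# models/wrapper.py, inside num_training_steps (sketch, not the repo's exact fix):
dataset = self.trainer.datamodule.train_dataloader()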

Looks like loss is wrong

Isn't the similarity matrix sharded? If so, you would gather after image_logits = torch.cat(ims) @ torch.cat(txt).t() (line 66 of wrapper.py), not before.

Sinkhorn motivation

Hi, thanks for the awesome implementation!

I had a question about the Sinkhorn objective used in CustomWrapper. Is it motivated by the work linked here? It would be great if you could say a bit more about it.

Cheers!

Validation dataloader

Hello

Is there a way to include data for validation while training?
The README suggests passing the --folder argument; do I have to make a validation folder within the data directory?

Your help would be highly appreciated

CUDA error: device-side assert triggered when training from scratch on multiple GPUs

Error message:
File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 2846, in cross_entropy
return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
RuntimeError: CUDA error: device-side assert triggered

Training script:
python train.py --folder data_dir --model_name ViT-B/32 --batch_size 1024 --gpus 4 --strategy ddp --num_workers 16

How can I solve this? (Single-GPU training works fine.)

How to write the inference code

Hi:
Thank you very much for sharing your nice code. I have trained the RN50 model with the CustomCLIPWrapper and I would like to know how to write the inference code.

Thank you very much for your reply!
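For reference, a minimal hedged inference sketch. It assumes the trained CLIP module is reachable as model.model on the wrapper (as snippets elsewhere in this thread suggest) and that the text tower still takes CLIP-tokenized input; with CustomCLIPWrapper's Hugging Face text encoder the matching Hugging Face tokenizer would be needed instead. The names device, model, and image are assumed to be set up beforehand.

import torch
import clip  # used here only for the tokenizer

with torch.no_grad():
    tokens = clip.tokenize(["a photo of a dog"]).to(device)
    text_features = model.model.encode_text(tokens)
    image_features = model.model.encode_image(image)  # image: preprocessed [1, 3, H, W]
    # cosine similarity between L2-normalized embeddings
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
    similarity = image_features @ text_features.t()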

Loading checkpoint

Hi there. Can you give some advice on how to load a checkpoint from a trained model with your PyTorch Lightning wrapper for inference? I used the common PyTorch Lightning method load_from_checkpoint but have had no luck so far. Thanks.
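For reference, a hedged sketch of the standard Lightning route. load_from_checkpoint re-runs the module's __init__, so constructor arguments that were not saved as hyperparameters have to be passed again; the keyword names below are illustrative guesses at the wrapper's signature, not verified against the repo.

from models.wrapper import CLIPWrapper

model = CLIPWrapper.load_from_checkpoint(
    "lightning_logs/version_0/checkpoints/last.ckpt",  # hypothetical path
    # plus whatever __init__ arguments were used at training time, e.g.:
    # model_name="RN50", config=config, minibatch_size=2,
)
model.eval()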

manual_backward + fp16 training doesn't converge

Hi, I borrowed some snippets from your codebase for the distributed GPU and minibatch-within-batch training in my own project. However, I found that training using manual_backward() + FP16 does not converge at all. If I switch to FP32, training works without any other code modifications. I'm using the latest pytorch-lightning v1.6.3. I wonder if you have observed similar issues?

Multi GPU training

Thanks for sharing the code.

I am not familiar with Lightning. It seems that the code supports multi-GPU training (# Multi-GPU support: https://github.com/MicPie/clasp), but I am not sure how to initiate the multi-GPU training.
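For reference, multi-GPU runs in PyTorch Lightning are typically launched via trainer flags on the command line; a later issue in this thread starts DDP training across four GPUs like this (exact flag names depend on the Lightning version):

python train.py --folder data_dir --model_name ViT-B/32 --batch_size 1024 --gpus 4 --strategy ddp --num_workers 16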

Besides, just to confirm: the code does not initialize the weights from a pretrained model, right?

COCO-style DataLoader

I would love to start training with this! I helped to write a DataLoader for the "COCO" format, i.e. images and text files containing line-separated captions. They are matched in the data loader via the unique basename of each file.

https://github.com/lucidrains/DALLE-pytorch/blob/main/dalle_pytorch/loader.py

Would it be possible to port that data loader to this project? It is perhaps of interest to some folks I know with spare compute. It would also be personally useful to me, because I have already converted a good deal of my collected datasets to this format.

Thanks!

Problem related to encoding text

I am trying to use a ResNet50 model that I trained with this repo, but I can't encode text.

with torch.no_grad():
    tmp = clip.tokenize("test")
    tmp = tmp.to(device)
    print(tmp)
    print(tmp.shape)
    text_encoded = model.model.encode_text(tmp)
tensor([[49406,  1628, 49407,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0]], device='cuda:0')
torch.Size([1, 77])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-18-68003eb3bebb> in <module>()
      9     print(tmp)
     10     print(tmp.shape)
---> 11     text_encoded = model.model.encode_text(tmp)
     12 

2 frames
/content/train-CLIP/models/model.py in encode_text(self, text)
    343         x = x + self.positional_embedding.type(self.dtype)
    344         x = x.permute(1, 0, 2)  # NLD -> LND
--> 345         x = self.transformer(x)
    346         x = x.permute(1, 0, 2)  # LND -> NLD
    347         x = self.ln_final(x).type(self.dtype)

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.7/dist-packages/transformers/models/bert/modeling_bert.py in forward(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict)
    937         elif input_ids is not None:
    938             input_shape = input_ids.size()
--> 939             batch_size, seq_length = input_shape
    940         elif inputs_embeds is not None:
    941             input_shape = inputs_embeds.size()[:-1]

ValueError: too many values to unpack (expected 2)

Printing x before self.transformer(x) results in torch.Size([77, 1, 512]).

The input shape torch.Size([1, 77]) matches the original CLIP code, and the model loaded with clip seems to work without major problems.

import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device, jit=False)

image = preprocess(Image.open("/test.png")).unsqueeze(0).to(device)
text = clip.tokenize(["test"]).to(device)
print(text)
print(text.shape)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    
    logits_per_image, logits_per_text = model(image, text)
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()
tensor([[49406,  1628, 49407,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0]], device='cuda:0')
torch.Size([1, 77])

Not sure what I am doing wrong, since encoding images does seem to work fine with this repo.

with torch.no_grad():
    photos_features = model.model.encode_image(image)
    photos_features /= photos_features.norm(dim=-1, keepdim=True)

print(photos_features.shape)
torch.Size([1, 768])
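For what it's worth, the traceback above gives a hint: the failing frame is in transformers' modeling_bert.py, which suggests this checkpoint's text tower was swapped for a Hugging Face BERT-style model (as CustomCLIPWrapper does), so it expects [batch, seq_len] token IDs from the matching Hugging Face tokenizer rather than CLIP tokens routed through model.py's encode_text. A hedged, untested sketch along those lines; the tokenizer name is a placeholder for whatever was used in training.

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")  # hypothetical tokenizer
inputs = tok(["test"], return_tensors="pt").to(device)
with torch.no_grad():
    out = model.model.transformer(**inputs)  # the swapped-in HF text model
    text_encoded = out.last_hidden_state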

Explanation

Could you please add a more detailed explanation of the code and of the self-distillation technique used to make it more efficient?
Various parts of the code are somewhat hard to understand, so I would appreciate some added explanation of the code as well.

Dataset structure

Hi, I'm having a little trouble understanding the dataset structure I should follow in order to train with this package. Is it one parent folder containing one folder of images and one folder of their text files?
If so, what should these subfolders be named?
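For reference, a hedged sketch of the flat layout implied by the COCO-style loader discussed earlier in this thread, where images and caption files are matched by basename; whether this repo's data module wants exactly this layout (or subfolders instead) should be verified against its data loading code.

data/
    dog_001.jpg
    dog_001.txt    # one or more line-separated captions for dog_001.jpg
    cat_042.png
    cat_042.txt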

model checkpointing

Hey, thank you for the Lightning implementation, just what I needed at the moment!
However, I'm a little confused about model checkpointing. I would assume it automatically saves checkpoints to lightning_logs/checkpoints/; however, after a full training run I didn't find anything saved in the checkpoints folder.
Taking a deeper look into the repo, at first glance I can see you didn't override that hook. I'm guessing the default checkpointing hook might not work since this is self-distillation (I'm using train_finetune.py, by the way).
Let me know if this is not expected behaviour.
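For reference, a minimal sketch of forcing explicit checkpointing with the stock Lightning callback (standard pytorch_lightning API, not something this repo is confirmed to configure; values are illustrative):

import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

# save a checkpoint after every epoch into ./checkpoints
checkpoint_cb = ModelCheckpoint(dirpath="checkpoints", save_top_k=-1)
trainer = pl.Trainer(callbacks=[checkpoint_cb], max_epochs=32)
trainer.fit(model, dm)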

Assertion error

Hi, can somebody please help me understand why this error occurs?

Using native 16bit precision.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
/home/ubuntu/.local/lib/python3.6/site-packages/pytorch_lightning/trainer/configuration_validator.py:101: UserWarning: you defined a validation_step but have no val_dataloader. Skipping val loop
rank_zero_warn(f"you defined a {step_name} but have no {loader_name}. Skipping {stage} loop")
Path FeatureStore
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Traceback (most recent call last):
File "train_finetune.py", line 33, in
main(args)
File "train_finetune.py", line 23, in main
trainer.fit(model, dm)
File "/home/ubuntu/.local/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 552, in fit
self._run(model)
File "/home/ubuntu/.local/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 873, in _run
self.accelerator.setup(self, model) # note: this sets up self.lightning_module
File "/home/ubuntu/.local/lib/python3.6/site-packages/pytorch_lightning/accelerators/gpu.py", line 42, in setup
return super().setup(trainer, model)
File "/home/ubuntu/.local/lib/python3.6/site-packages/pytorch_lightning/accelerators/accelerator.py", line 88, in setup
self.setup_optimizers(trainer)
File "/home/ubuntu/.local/lib/python3.6/site-packages/pytorch_lightning/accelerators/accelerator.py", line 331, in setup_optimizers
trainer=trainer, model=self.lightning_module
File "/home/ubuntu/.local/lib/python3.6/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 223, in init_optimizers
return trainer.init_optimizers(model)
File "/home/ubuntu/.local/lib/python3.6/site-packages/pytorch_lightning/trainer/optimizers.py", line 34, in init_optimizers
optim_conf = model.configure_optimizers()
File "/home/ubuntu/clip/train-CLIP/models/wrapper.py", line 343, in configure_optimizers
warmup_steps=2000
File "/home/ubuntu/.local/lib/python3.6/site-packages/cosine_annealing_warmup/scheduler.py", line 27, in init
assert warmup_steps < first_cycle_steps
AssertionError
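For context, the traceback already names the failing condition; a sketch of why it tends to fire on small datasets (warmup_steps=2000 is visible in the call site above, the rest is inferred from the stack trace):

# cosine_annealing_warmup/scheduler.py asserts:
#     warmup_steps < first_cycle_steps
# configure_optimizers passes warmup_steps=2000, while first_cycle_steps
# comes from the total number of training steps (roughly
# len(dataset) / batch_size * max_epochs). With a very small dataset that
# total drops below 2000 and the assertion fails.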

How many images and captions are required to train my own CLIP?

Hello, I am Yong, a computer vision researcher.

I was impressed by your code and wondered how to fine-tune CLIP.
I want to classify images with CLIP.
I only have at most 10 images per class; there are 4-6 classes in total.

In this situation, is fine-tuning possible?

Thank you.

Image encoder

Is it possible to use a pre-trained image model from Hugging Face when fine-tuning? The latest models are usually there, so it would be pretty cool if they were compatible.

About Training

Thanks for the awesome implementation!
After training on my own datasets, the model's classification acc@1 is bad, and I do not know how to analyze the problem. When fine-tuning CLIP, is there some criterion to measure whether the model has been trained well?
Thanks in advance for your answers.

Error occurs when using DeepSpeed

Hi @Zasder3, thank you for the great work!

I was wondering if you tried to use DeepSpeed because I saw this commit log (DeepSpeed Optimizer indexing).
When I tried DeepSpeed by adding --plugins deepspeed_stage_2, I got the errors below.

Traceback (most recent call last):
  File "/opt/conda/envs/clip_cuda11.1/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 871, in run_train
    self.train_loop.run_training_epoch()
  File "/opt/conda/envs/clip_cuda11.1/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 499, in run_training_epoch
    batch_output = self.run_training_batch(batch, batch_idx, dataloader_idx)
  File "/opt/conda/envs/clip_cuda11.1/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 743, in run_training_batch
    self._curr_step_result = self.training_step(
  File "/opt/conda/envs/clip_cuda11.1/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 290, in training_step
    training_step_output = self.trainer.accelerator.training_step(args)
  File "/opt/conda/envs/clip_cuda11.1/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 204, in training_step
    return self.training_type_plugin.training_step(*args)
  File "/opt/conda/envs/clip_cuda11.1/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/ddp.py", line 337, in training_step
    return self.model(*args, **kwargs)
  File "/opt/conda/envs/clip_cuda11.1/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/clip_cuda11.1/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1105, in forward
    loss = self.module(*inputs, **kwargs)
  File "/opt/conda/envs/clip_cuda11.1/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/clip_cuda11.1/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/deepspeed.py", line 62, in forward
    return super().forward(*inputs, **kwargs)
  File "/opt/conda/envs/clip_cuda11.1/lib/python3.8/site-packages/pytorch_lightning/overrides/base.py", line 46, in forward
    output = self.module.training_step(*inputs, **kwargs)
  File "/home/shared/workspace/multimodal-matching/multimodal-matching/train-CLIP/models/wrapper.py", line 106, in training_step
    self.manual_backward(loss)
  File "/opt/conda/envs/clip_cuda11.1/lib/python3.8/site-packages/pytorch_lightning/core/lightning.py", line 1252, in manual_backward
    self.trainer.train_loop.backward(loss, optimizer=None, opt_idx=None, *args, **kwargs)
  File "/opt/conda/envs/clip_cuda11.1/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 867, in backward
    self.trainer.accelerator.backward(result, optimizer, opt_idx, should_accumulate, *args, **kwargs)
  File "/opt/conda/envs/clip_cuda11.1/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 306, in backward
    self.training_type_plugin.pre_backward(closure_loss, should_accumulate, optimizer, optimizer_idx)
  File "/opt/conda/envs/clip_cuda11.1/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/ddp.py", line 311, in pre_backward
    if not self.lightning_module.automatic_optimization and self.model.require_backward_grad_sync:
  File "/opt/conda/envs/clip_cuda11.1/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'DeepSpeedEngine' object has no attribute 'require_backward_grad_sync'

The error occurs at the line below, where we use self.automatic_optimization = False.

self.manual_backward(loss)

I could use DeepSpeed with self.automatic_optimization = True and without self.manual_backward(loss). (But it still needs some debugging because the training pattern changes.)

My working environment is pytorch=1.9, cuda=11.1, pytorch-lightning=1.3.8.
Thanks in advance!

NotImplementedError: `train_dataloader` must be implemented to be used with the Lightning Trainer?

Traceback (most recent call last):
File "train_finetune.py", line 39, in
main(args)
File "train_finetune.py", line 29, in main
trainer.fit(model, dm)
File "/workspace/cpfs-data/miniconda3/envs/tensorflow/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 741, in fit
self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
File "/workspace/cpfs-data/miniconda3/envs/tensorflow/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/workspace/cpfs-data/miniconda3/envs/tensorflow/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 777, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/workspace/cpfs-data/miniconda3/envs/tensorflow/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1145, in _run
self.accelerator.setup(self)
File "/workspace/cpfs-data/miniconda3/envs/tensorflow/lib/python3.7/site-packages/pytorch_lightning/accelerators/gpu.py", line 46, in setup
return super().setup(trainer)
File "/workspace/cpfs-data/miniconda3/envs/tensorflow/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 93, in setup
self.setup_optimizers(trainer)
File "/workspace/cpfs-data/miniconda3/envs/tensorflow/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 355, in setup_optimizers
trainer=trainer, model=self.lightning_module
File "/workspace/cpfs-data/miniconda3/envs/tensorflow/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 245, in init_optimizers
return trainer.init_optimizers(model)
File "/workspace/cpfs-data/miniconda3/envs/tensorflow/lib/python3.7/site-packages/pytorch_lightning/trainer/optimizers.py", line 35, in init_optimizers
optim_conf = self.call_hook("configure_optimizers", pl_module=pl_module)
File "/workspace/cpfs-data/miniconda3/envs/tensorflow/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1501, in call_hook
output = model_fx(*args, **kwargs)
File "/workspace/cpfs-data/workspace_pytorch/hclip/train-CLIP/models/wrapper.py", line 337, in configure_optimizers
first_cycle_steps=self.num_training_steps,
File "/workspace/cpfs-data/workspace_pytorch/hclip/train-CLIP/models/wrapper.py", line 38, in num_training_steps
dataset = self.train_dataloader()
File "/workspace/cpfs-data/miniconda3/envs/tensorflow/lib/python3.7/site-packages/pytorch_lightning/core/hooks.py", line 477, in train_dataloader
raise NotImplementedError("train_dataloader must be implemented to be used with the Lightning Trainer")
NotImplementedError: train_dataloader must be implemented to be used with the Lightning Trainer

Custom tokenizer and text encoder

I want to use a custom tokenizer and text encoder trained with the Hugging Face tokenizers library.

After training the Hugging Face tokenizer, I got a JSON file containing the vocabulary.

However, I don't know how to feed this custom tokenizer to train_finetune.py.

Could you give some guidance on setting up and using a custom tokenizer?
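For reference, a hedged sketch of loading a tokenizer saved as a tokenizers-library JSON file; PreTrainedTokenizerFast and its tokenizer_file argument are standard transformers API, but how to hand the result to train_finetune.py is an open question this sketch does not answer.

from transformers import PreTrainedTokenizerFast

tokenizer = PreTrainedTokenizerFast(tokenizer_file="path/to/tokenizer.json")
tokenizer.pad_token = "[PAD]"  # hypothetical: align special tokens with your training setup

batch = tokenizer(["a photo of a dog"], padding=True, return_tensors="pt")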

No logging and validation?

Weirdly enough, there seems to be no TensorBoard logging and no validation.
Is this the case, or did I not run something correctly?

Loading a checkpoint with CustomCLIPWrapper for testing

I have some problems figuring out how to load a checkpoint with CustomCLIPWrapper. Could you give me a code example? I would be very grateful.

What's the meaning of minibatch_size?

Thank you for your CLIP training code! It's great!

Training with your new commit 8d454de, I get the following error:
RuntimeError: The expanded size of the tensor (0) must match the existing size (8) at non-singleton dimension 0. Target sizes: [0, 1024]. Tensor sizes: [8, 1024]

images_tmp[self.global_rank][j*self.minibatch_size:(j+1)*self.minibatch_size] = F.normalize(self.model.encode_image(mb), dim=1)
minibatch_size = 0
Would you please explain the meaning of minibatch_size? How should minibatch_size be used?
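For reference, a hedged sketch of the minibatch-within-batch pattern this flag appears to control, adapted from the quoted line (the surrounding names are illustrative, not the repo's exact code): the full batch is encoded in chunks of minibatch_size so activations fit in GPU memory, while the contrastive loss still sees the whole batch. A minibatch_size of 0 would make every slice empty, which matches the size-0 expansion error above.

import torch
import torch.nn.functional as F

minibatch_size = 8  # must be > 0
image_mbs = torch.split(images, minibatch_size)      # images: [batch, 3, H, W]
ims = [F.normalize(model.encode_image(mb), dim=1) for mb in image_mbs]
image_features = torch.cat(ims)                      # [batch, dim] for the full-batch loss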

Requirement file

While trying to replicate the training, I am running into problems that arise from PyTorch Lightning version issues.
Could you create a requirements file for the repository?

Thanks!
