robvanvolt / dalle-models
Here is a collection of checkpoints for DALLE-pytorch models, from where you can keep on training or start generating images.
License: MIT License
While trying to load the pre-trained DALLE weights, I hit a loading error: the current DALLE model appears to have a different structure and different weight keys. I suspect this is because I am using a newer version of dalle-pytorch (dalle-pytorch-1.5.2). Could you specify which version of dalle-pytorch is supported? Thank you so much!
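One way to confirm a key mismatch like the one described above is to compare the parameter names the model expects with the names stored in the checkpoint. The sketch below is pure Python and works on any two collections of key names; with PyTorch, the model side would come from `model.state_dict().keys()`. The key names in the example are hypothetical and only illustrate the idea.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists for a model/checkpoint pair."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    # Keys the model expects but the checkpoint does not provide.
    missing = sorted(model_keys - checkpoint_keys)
    # Keys the checkpoint provides but the model does not know about.
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Hypothetical key names, for illustration only.
model_keys = ["text_emb.weight", "transformer.layers.0.attn.to_qkv.weight"]
checkpoint_keys = ["text_emb.weight", "transformer.layers.blocks.0.f.to_qkv.weight"]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing from checkpoint:", missing)
print("unexpected in checkpoint:", unexpected)
```

If both lists are long and differ only by a naming pattern, that points to a library version difference rather than a corrupted file.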
generating images for - the grand canyon with snow on it. snow located on the grand canyon. a snowy grand canyon.: 0% 0/1 [00:00<?, ?it/s]
0it [00:00, ?it/s]
Traceback (most recent call last):
File "/content/dalle-pytorch-pretrained/DALLE-pytorch/generate.py", line 116, in <module>
output = dalle.generate_images(text_chunk, filter_thres = args.top_k)
File "/usr/local/lib/python3.7/dist-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/content/dalle-pytorch-pretrained/DALLE-pytorch/dalle_pytorch/dalle_pytorch.py", line 42, in inner
out = fn(model, *args, **kwargs)
File "/content/dalle-pytorch-pretrained/DALLE-pytorch/dalle_pytorch/dalle_pytorch.py", line 480, in generate_images
logits = self(text, image, mask = mask)[:, -1, :]
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/content/dalle-pytorch-pretrained/DALLE-pytorch/dalle_pytorch/dalle_pytorch.py", line 552, in forward
out = self.transformer(tokens)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/content/dalle-pytorch-pretrained/DALLE-pytorch/dalle_pytorch/transformer.py", line 142, in forward
return self.layers(x, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/content/dalle-pytorch-pretrained/DALLE-pytorch/dalle_pytorch/reversible.py", line 156, in forward
out = _ReversibleFunction.apply(x, blocks, args)
File "/content/dalle-pytorch-pretrained/DALLE-pytorch/dalle_pytorch/reversible.py", line 113, in forward
x = block(x, **kwarg)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/content/dalle-pytorch-pretrained/DALLE-pytorch/dalle_pytorch/reversible.py", line 65, in forward
y1 = x1 + self.f(x2, record_rng=self.training, **f_args)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/content/dalle-pytorch-pretrained/DALLE-pytorch/dalle_pytorch/reversible.py", line 40, in forward
return self.net(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/content/dalle-pytorch-pretrained/DALLE-pytorch/dalle_pytorch/transformer.py", line 53, in forward
return self.fn(x, **kwargs) * self.scale
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/content/dalle-pytorch-pretrained/DALLE-pytorch/dalle_pytorch/transformer.py", line 62, in forward
return self.fn(self.norm(x), **kwargs)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/content/dalle-pytorch-pretrained/DALLE-pytorch/dalle_pytorch/attention.py", line 362, in forward
out = self.attn_fn(q, k, v, attn_mask = attn_mask, key_padding_mask = key_pad_mask)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/deepspeed/ops/sparse_attention/sparse_self_attention.py", line 152, in forward
attn_output_weights = sparse_dot_sdd_nt(query, key)
File "/usr/local/lib/python3.7/dist-packages/deepspeed/ops/sparse_attention/matmul.py", line 745, in __call__
time_db)
File "/usr/local/lib/python3.7/dist-packages/deepspeed/ops/sparse_attention/matmul.py", line 549, in forward
c_time)
File "/usr/local/lib/python3.7/dist-packages/deepspeed/ops/sparse_attention/matmul.py", line 188, in _sdd_matmul
_sparse_matmul.sdd_cache[key] = triton.kernel(
AttributeError: module 'triton' has no attribute 'kernel'
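The final line of the traceback suggests that the installed `triton` package does not expose the `kernel` attribute that the DeepSpeed sparse attention code calls. A small check like the following sketch can confirm that before a long generation run starts. The helper name `module_exposes` is hypothetical, and the check only reports what is installed; it does not say which versions are compatible.

```python
import importlib


def module_exposes(module_name, attribute):
    """Return True or False if the module imports, or None if it is not installed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attribute)


# Demonstrated on a standard-library module so the snippet runs anywhere.
print(module_exposes("json", "loads"))
print(module_exposes("json", "kernel"))
```

In the Colab from this report, the equivalent call would be `module_exposes("triton", "kernel")`.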
Why not host the pretrained weights in a GitHub repository release? You can upload files that are gigabytes in size to a repository release, and the weights can then be downloaded locally with a simple GET request. That would free us from the unnecessary wandb dependency when downloading a pretrained weight.
I can show you an example if you want to go in this direction.
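A minimal sketch of what such a download helper could look like is shown here. It uses the standard library's `urllib` instead of `requests`, so it needs no extra dependency, and it streams the response to disk in chunks so a multi-gigabyte checkpoint is never held in memory. The function name `download_file` and the release URL in the comment are placeholders, not real assets of this repository.

```python
import shutil
import urllib.request
from pathlib import Path


def download_file(url, dest, chunk_size=1 << 20):
    """Stream `url` to `dest` in chunks and return the destination path."""
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out_file:
        shutil.copyfileobj(response, out_file, chunk_size)
    return dest


# Hypothetical release asset URL; substitute a real one before running.
# download_file(
#     "https://github.com/<owner>/<repo>/releases/download/<tag>/<checkpoint>.pt",
#     "checkpoints/<checkpoint>.pt",
# )
```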
edit: Well, this thing trained for a full 5 epochs and never made a single coherent generation. @rom1504 and I discussed this and I guess it's just really, really hard to train this one. I don't think a single RTX 2070 is going to cut it.
Hey - I'm trying to force myself to take a few days off and the holidays are coming up here in the states anyways so I'll probably be away from the discord/github for a bit.
pip install wandb requests
import wandb, requests
run = wandb.init()
artifact = run.use_artifact('dalle-pytorch-replicate/oi_gumbel_imgloss7/trained-dalle:v0', type='model')
artifact_dir = artifact.download()
import zipfile
# path_to_zip_file and directory_to_extract_to are placeholders; set them before running.
with zipfile.ZipFile(path_to_zip_file, 'r') as zip_ref:
    zip_ref.extractall(directory_to_extract_to)
openaiblog_openimages_bpe_url = "https://github.com/robvanvolt/DALLE-models/files/6735615/blogoimixer_4096.bpe.zip"
downloaded_obj = requests.get(openaiblog_openimages_bpe_url)
# Write to a local file named after the last component of the URL.
with open(openaiblog_openimages_bpe_url.split("/")[-1], "wb") as file:
    file.write(downloaded_obj.content)
pip3 install youtokentome # will install the cli tool `yttm`
yttm bpe --vocab_size=4096 --coverage=1.0 --model=blogoimixer_4096.bpe --data=blogoi_allcaps.txt
deepspeed.runtime.ops.Adam(3e-4, betas=(0.9, 0.96), eps=1e-8)
# Warmup the learning rate from 1e-6 to 4e-3 for 2% of all steps then
# do Cosine decay back down to 1e-6 and train there until finished.
# Requires knowledge of the global/total step count. So you need to calculate it.
# Also requires deepspeed if you aren't into that.
# Total optimizer steps: image-text pairs per epoch times the number of epochs, divided by the effective batch size.
total_num_steps = int(len(ds) * EPOCHS / (BATCH_SIZE * args.ga_steps))
warmup_num_steps = int(total_num_steps * 0.02)
deepspeed.runtime.ops.schedulers.WarmupLRDecay(
    total_num_steps=total_num_steps,
    warmup_steps=warmup_num_steps,
    # CogView did 0 here? Anyway I stuck with @janEbert 1e-6 because a 0 learning rate doesn't improve anything.
    min_lr=1e-6,
    max_lr=LEARNING_RATE,
)
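For readers who are not using deepspeed, the schedule described in the comments above (linear warmup for the first 2% of steps, cosine decay back to the minimum, then holding there) can be sketched as a plain function of the step number. This is an illustration under assumptions, not the deepspeed implementation: the function names are hypothetical, the warmup is assumed to be linear, and the dataset size and batch size in the example are made up.

```python
import math


def total_steps(num_pairs, epochs, batch_size, grad_accum_steps):
    """Optimizer steps across all epochs, given the effective batch size."""
    return int(num_pairs * epochs / (batch_size * grad_accum_steps))


def warmup_cosine_lr(step, total_num_steps, warmup_num_steps, min_lr, max_lr):
    """Learning rate at `step`: linear warmup, cosine decay, then hold at min_lr."""
    if step < warmup_num_steps:
        return min_lr + (max_lr - min_lr) * step / warmup_num_steps
    if step >= total_num_steps:
        return min_lr
    progress = (step - warmup_num_steps) / (total_num_steps - warmup_num_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Made-up dataset and batch sizes; min_lr and max_lr follow the values quoted above.
steps = total_steps(num_pairs=1_000_000, epochs=5, batch_size=64, grad_accum_steps=1)
warmup = int(steps * 0.02)
for step in (0, warmup, steps // 2, steps):
    print(step, warmup_cosine_lr(step, steps, warmup, min_lr=1e-6, max_lr=3e-4))
```

Because the schedule depends on the total step count, that count has to be computed before training starts, which is the point made in the comments above.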
Hello, I used the colab notebook and chose the 16L_64HD_8H_512I_128T_cc12m_cc3m_3E checkpoint, and this error happened.
script:
!python /content/dalle-pytorch-pretrained/DALLE-pytorch/generate.py --dalle_path=$checkpoint_path --taming --text="$text" --num_images=$num_images --batch_size=$batch_size --outputs_dir="$_folder"; wait;
Using the simplified colab, the code runs properly (it generates an image with the default parameters in 34s, though I don't have a reference for whether that is normal), but the generated image does not appear. I'm not sure if "attempting to display results" means the image should appear in the colab output, but it does not in this case. No "outputs" folder is generated, and locating the directory through the code gives "Could not fetch /content/outputs/ from the backend". Creating an outputs folder manually still results in no image being saved there after refreshing.
Apologies if this is a rookie mistake, I'm fairly new to using Colabs.
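One way to check whether any image was written at all, possibly to a different directory than expected, is to search a root folder recursively for image files. The sketch below assumes the outputs are saved as `.jpg`, `.jpeg`, or `.png` files; the helper name `find_images` is hypothetical.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_images(root):
    """Return all image files below `root`, or an empty list if `root` does not exist."""
    root = Path(root)
    if not root.exists():
        return []
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


# Prints nothing if the folder is missing or contains no images.
for image_path in find_images("/content/outputs"):
    print(image_path)
```

Running the same search on a broader root such as `/content` would show whether the images were written somewhere else.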
Hi! I have seen your results from the trained model that generates layouts, posted on lucidrains' repository, and they are amazing! I would like to continue training it or do some fine-tuning to see if it can generate better results. Could you tell me how you trained that model: for example, where you found the 150k layouts dataset, and how you arranged the different areas into a row in different colours so that the model learns them? Could you also provide the checkpoint of the model or tell me how to access the trained model?
The message follows (I added some print statements to debug and removed clear_output). Please advise.
chosen_model: https://www.dropbox.com/s/8mmgnromwoilpfm/16L_64HD_8H_512I_128T_cc12m_cc3m_3E.pt?dl=1
folder_ /content/outputs/Cucumber_on_a_brown_wooden_chair/
Traceback (most recent call last):
File "/content/dalle-pytorch-pretrained/DALLE-pytorch/generate.py", line 18, in <module>
from dalle_pytorch import DiscreteVAE, OpenAIDiscreteVAE, VQGanVAE, DALLE
File "/content/dalle-pytorch-pretrained/DALLE-pytorch/dalle_pytorch/__init__.py", line 1, in <module>
from dalle_pytorch.dalle_pytorch import DALLE, CLIP, DiscreteVAE
File "/content/dalle-pytorch-pretrained/DALLE-pytorch/dalle_pytorch/dalle_pytorch.py", line 11, in <module>
from dalle_pytorch.vae import OpenAIDiscreteVAE, VQGanVAE
File "/content/dalle-pytorch-pretrained/DALLE-pytorch/dalle_pytorch/vae.py", line 14, in <module>
from taming.models.vqgan import VQModel, GumbelVQ
ImportError: cannot import name 'GumbelVQ' from 'taming.models.vqgan' (/usr/local/lib/python3.7/dist-packages/taming/models/vqgan.py)
change
!wget "https://github.com/lucidrains/DALLE-pytorch/archive/refs/tags/0.14.3.zip" -O /content/
to
!wget "https://github.com/lucidrains/DALLE-pytorch/archive/refs/tags/0.14.3.zip" -O /content/0.14.3.zip