aelnouby / text-to-image-synthesis

PyTorch implementation of the Generative Adversarial Text-to-Image Synthesis paper

License: GNU General Public License v3.0

Python 100.00%

gans image-generation pytorch text-to-image zero-shot-learning

text-to-image-synthesis's Issues

A question about text and image encoding

1. When training the GAN, I did not see a text-encoding step. The text appears to be stored in the sample:

```
right_images = sample['right_images']
right_embed = sample['right_embed']
wrong_images = sample['wrong_images']

right_images = Variable(right_images.float()).cuda()
right_embed = Variable(right_embed.float()).cuda()
wrong_images = Variable(wrong_images.float()).cuda()
```

There is no text_embedding.
2. When we train the GAN, should we extract the text and image embeddings first, or do we simply feed raw images into the network?

How to add embedding path?

What is the embedding path for custom dataset preparation?

When I try to use the data file provided by the author, my IDE reports the error "RuntimeError: Unable to get group info (wrong B-tree signature)", as shown below.
import h5py
f = h5py.File('flowers.hdf5', 'r')
for k in f.keys():
    print(f["test"])

Traceback (most recent call last):
File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "F:\Anaconda\lib\site-packages\h5py\_hl\group.py", line 623, in __repr__
r = '<HDF5 group %s (%d members)>' % (namestr, len(self))
File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "F:\Anaconda\lib\site-packages\h5py\_hl\group.py", line 443, in __len__
return self.id.get_num_objs()
File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py\h5g.pyx", line 336, in h5py.h5g.GroupID.get_num_objs
RuntimeError: Unable to get group info (wrong B-tree signature)

Then I used ViTables to check whether there is any problem with this data file. The result is that I can only see the groups named train, test and valid, with no datasets inside them.

I am new to these tools. Could anyone tell me what I am doing wrong?
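
A "wrong B-tree signature" error is commonly reported when an HDF5 file is truncated or corrupted, for example by an interrupted download; that is an assumption here, not a confirmed diagnosis. A minimal sketch of a quick sanity check, using only the standard library: it looks for the 8-byte HDF5 format signature at the start of the file and reports the file size and SHA-256 digest so that two downloads can be compared. The helper name `describe_hdf5_file` is hypothetical.

```python
import hashlib
import os

# The HDF5 format signature: the first 8 bytes of a file with no user block.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def describe_hdf5_file(path):
    """Return basic facts about a file that is expected to be HDF5."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        header = fh.read(len(HDF5_SIGNATURE))
        digest.update(header)
        # Hash the rest in 1 MiB chunks so large datasets do not fill memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return {
        "size_bytes": os.path.getsize(path),
        "has_signature": header == HDF5_SIGNATURE,
        "sha256": digest.hexdigest(),
    }
```

The signature alone does not prove the file is intact. If the size or digest differs from a fresh download of the same file, the first copy was probably cut short.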

Question about data preprocessing

Hi @aelnouby,
Thank you for sharing this.
I want to convert the data myself, so I downloaded the dataset as described at https://github.com/reedscot/cvpr2016 . That gave me a link (for flowers, for example) https://drive.google.com/open?id=0B0ywwgffWnLLcms2WWJQRFNSWXM , from which I downloaded a file named cvpr2016_flowers.tar.gz . I extracted it, and the extracted folder looked like the following:

I can't understand how you organized your files so that the script 'convert_flowers_to_hd5_script.py' runs correctly. Did I download the wrong file?
Looking forward to your reply!

TypeError: a bytes-like object is required, not 'str' in "Text2ImageDataset.py"

I am getting the above error for the line below in that file:

The error:

What should I do about the value of special? I also don't understand what is being replaced using special, given that it is Unicode.
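
This error is typical of Python 3 code that calls a `bytes` method with `str` arguments; it worked in Python 2, where the two types were the same. A minimal sketch, assuming the failing line calls `.replace` on a byte string read from the HDF5 file. The variable names and sample text are hypothetical, since the original line is not shown above.

```python
# Hypothetical stand-ins for the values in the failing line.
raw_txt = b"this flower has\xef\xbf\xbd\xef\xbf\xbdpink petals"  # bytes, e.g. read from a file
special = u"\ufffd\ufffd"  # str: the Unicode replacement character, twice

# Mixing the two types is what raises the error in Python 3.
try:
    raw_txt.replace(special, " ")
except TypeError as err:
    message = str(err)  # "a bytes-like object is required, not 'str'"

# Option 1: decode the bytes first, then work with str everywhere.
txt = raw_txt.decode("utf-8").replace(special, " ")

# Option 2: stay in bytes by encoding both arguments.
txt_bytes = raw_txt.replace(special.encode("utf-8"), b" ")
```

Either option works as long as both sides of the call have the same type. Decoding early is usually simpler when the text is later printed or tokenized.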

How do we test the model?

Can you please tell me how to test it? I need to give an input and see the generated image.

Doubt about the generation of fake images in a mini-batch

Hi,
During training, you generate the fake images twice per mini-batch. According to the paper, however, it seems that the updates of D and G both use the same fake images, so I'm confused about this.
I think generating the fake images twice may increase the instability of training.

Hoping for your reply.

Mapping to dataset folders.

What is the file of text embeddings?

Do you mean that the text embeddings are the text encoders for birds and flowers? And how do I use these two .t7 files?

Thanks a lot.

torch model

ModuleNotFoundError: No module named 'torch.utils.serialization'

cannot identify image file <_io.BytesIO object at 0x7f2d80b2e770>

Thanks for your contribution!
But when I run the code, an error occurs:

Traceback (most recent call last):
File "runtime.py", line 42, in <module>
trainer.train(args.cls)
File "/home/zzw/program/text2img/text-to-Image-Synthesis-pytorch/trainer.py", line 65, in train
self._train_wgan(cls)
File "/home/zzw/program/text2img/text-to-Image-Synthesis-pytorch/trainer.py", line 103, in _train_wgan
sample = next(data_iterator)
File "/home/zzw/.local/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 281, in next
return self._process_next_batch(batch)
File "/home/zzw/.local/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 301, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
IOError: Traceback (most recent call last):
File "/home/zzw/.local/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 55, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/zzw/program/text2img/text-to-Image-Synthesis-pytorch/txt2image_dataset.py", line 46, in __getitem__
right_image = Image.open(io.BytesIO(right_image)).resize((64, 64))
File "/home/zzw/.local/lib/python2.7/site-packages/PIL/Image.py", line 2590, in open
% (filename if filename else fp))
IOError: cannot identify image file <_io.BytesIO object at 0x7f1be1801770>

My Pillow version is 5.1.0, and the error seems to be related to the version.
Could anyone help me out?
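
PIL raises "cannot identify image file" when the bytes it receives do not begin with the header of any format it knows. A minimal debugging sketch, assuming the stored images are JPEG or PNG (an assumption; the issue does not say), that inspects the first bytes of a sample before it reaches `Image.open`. The function name `sniff_image_format` is hypothetical.

```python
# Leading "magic" bytes of the two formats assumed here.
JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def sniff_image_format(data):
    """Guess the image format from the leading bytes, or return None."""
    data = bytes(data)  # also accepts bytearray and memoryview
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    if data.startswith(PNG_MAGIC):
        return "png"
    return None
```

If this returns `None` for the value passed to `io.BytesIO`, and the images really are JPEG or PNG, the bytes were probably altered when they were stored or read back, which would point away from the Pillow version.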

Loss function

Hi,
Thanks for the nice and very helpful work! I am also trying to do text-to-image generation, but in TensorFlow.
My loss graphs are going totally wrong:

The loss functions I am using are:

d_loss_real = tf.reduce_mean(disc_real_image_logits)
d_loss_fake = tf.reduce_mean(disc_fake_image_logits)
d_loss_wrong = tf.reduce_mean(disc_wrong_image_logits)
d_w_loss = d_loss_fake + d_loss_wrong - d_loss_real
g_w_loss = -1*(d_loss_fake)

For optimisation I am using the following lines:

rms_d_optim = tf.train.RMSPropOptimizer(learning_rate=5e-5).minimize(loss['d_loss'], var_list=variables['d_vars'])
rms_g_optim = tf.train.RMSPropOptimizer(learning_rate=5e-5).minimize(loss['g_loss'], var_list=variables['g_vars'])
d_clip = [v.assign(tf.clip_by_value(v, -args.d_clip_limit, args.d_clip_limit)) for v in variables['d_vars']]
with tf.control_dependencies([rms_d_optim]):
    rms_d_optim = tf.tuple(d_clip)
for epoch in range(100):
    for diter in range(10):
        sess.run([rms_d_optim], feed_dict=feed)
        sess.run(d_clip)
    sess.run([rms_g_optim], feed_dict=feed)

Could you suggest some direction for fixing this?
I went through your code (I am not well versed in PyTorch as of now), and it seems that you are using the same losses. Please correct me if I am mistaken.
Thanks

Implement Minibatch discrimination

Can you run this without nvidia?

I've tried to run your code, but it requires NVIDIA hardware and I don't have an NVIDIA graphics card. Can this work with other types of GPU?
Thanks

Can't find the file "text_c10"

Hi, I'm learning text-to-image, and thanks a lot for your reimplementation in PyTorch!
However, I cannot find the text_c10 files for the birds and flowers datasets. Are they the descriptions of the images?
It looks like the author of icml2016 has not provided the original descriptions, only the text embeddings in Torch format.

Waiting for your reply.

TypeError: function takes exactly 5 arguments (1 given)

Hi!
I have run into a very weird problem: the model trains for several iterations but soon fails with the error below.
Hoping for your reply!
Epoch: 0, d_loss= 2.205419, g_loss= 31.352011, D(X)= 0.381197, D(G(X))= 0.543290
Epoch: 0, d_loss= 1.640341, g_loss= 30.674690, D(X)= 0.512067, D(G(X))= 0.349348
Epoch: 0, d_loss= 1.334773, g_loss= 34.140278, D(X)= 0.618809, D(G(X))= 0.341588
Traceback (most recent call last):
File "runtime.py", line 43, in <module>
trainer.train(args.cls)
File "/home/jurh/disk2/liupeng/text2image_3/Text-to-Image-Synthesis/trainer.py", line 67, in train
self._train_gan(cls)
File "/home/jurh/disk2/liupeng/text2image_3/Text-to-Image-Synthesis/trainer.py", line 177, in _train_gan
for sample in self.data_loader:
File "/usr/local/lib/python2.7/dist-packages/torch/utils/data/dataloader.py", line 187, in next
return self._process_next_batch(batch)
File "/usr/local/lib/python2.7/dist-packages/torch/utils/data/dataloader.py", line 221, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
TypeError: function takes exactly 5 arguments (1 given)

Generator diverges?

Hi,

I trained with gan_cls (the conditioned version, not the vanilla one) on flowers, using the shared hdf5 file, and I got the curves at https://drive.google.com/open?id=1pASanOh9YUdg__I5OPRi_srmu3T2JYx8.

The discriminator loss keeps going down to 0.483, where it almost converges, but the generator loss keeps going up to 18.58 without converging, with D(X) at 0.846 and D(G(X)) at 0.016. On prediction, I got image results similar to those reported.

I think the curves I got during training suggest divergence, correct? If training converged, we should see the generator loss going down as well, and D(G(X)) going up, correct?

Am I missing anything here?
And do you have any suggestions to make the training converge? (I see that you've implemented many tricks from https://github.com/soumith/ganhacks#13-add-noise-to-inputs-decay-over-time. I'm trying #2, flipping real_label and fake_label for the generator, but it doesn't seem to help.)

Look forward to your answer. Thanks!

New dataset

Can someone provide guidance on how to generate the hdf5 file for the COCO dataset?
How do I get the following?
birds_images_path: '/export/mlrg/aelnouby/projects/GANs/Birds dataset/CUB_200_2011/CUB_200_2011/images/'
birds_embedding_path: '/export/mlrg/aelnouby/projects/GANs/Birds dataset/cub_icml/'
birds_text_path: '/export/mlrg/aelnouby/projects/GANs/Birds dataset/cvpr2016_cub/text_c10/'

val_split_path: '/export/mlrg/aelnouby/projects/GANs/Birds dataset/cub_icml/valclasses.txt'
train_split_path: '/export/mlrg/aelnouby/projects/GANs/Birds dataset/cub_icml/trainclasses.txt'
test_split_path: '/export/mlrg/aelnouby/projects/GANs/Birds dataset/cub_icml/testclasses.txt'

There was a reference to https://github.com/reedscot/cvpr2016, but when I checked the code, it didn't provide much insight. Any help would be appreciated.

Duplicate embeddings in convert_cub_to_hd5_script.py

As I was reading convert_cub_to_hd5_script.py to get a better understanding of the .hdf5 structure, I came across this line:

txt_choice = np.random.choice(range(10), 5)

where five text embeddings are supposedly randomly sampled. However, by default, np.random.choice does sampling with replacement, which in this case could cause the same embedding to be picked twice. Would this be an issue?
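
To put a number on this: with replacement, the probability that five draws from ten items are all distinct is (10 × 9 × 8 × 7 × 6) / 10^5 = 0.3024, so roughly 70% of the draws contain at least one repeated index. A small sketch, using only the standard library, that checks this figure and shows sampling without replacement. In NumPy the corresponding call is `np.random.choice(range(10), 5, replace=False)`. The function name `prob_all_distinct` is hypothetical.

```python
import random


def prob_all_distinct(population, draws):
    """Probability that `draws` samples taken with replacement are all different."""
    p = 1.0
    for i in range(draws):
        # The i-th draw must avoid the i values already picked.
        p *= (population - i) / population
    return p


p_distinct = prob_all_distinct(10, 5)  # about 0.3024
p_duplicate = 1.0 - p_distinct         # about 0.6976

# Sampling without replacement never repeats an index.
txt_choice = random.sample(range(10), 5)
```

Whether duplicates matter depends on how the five embeddings are used downstream; a repeated embedding only means that the image contributes fewer distinct captions than intended.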

If you want to use gradient penalty, do you have to build from source?

Hi, I want to use gradient penalty because I want to try improved WGAN.
What does "build from source" mean?
Does it mean I should train the model from scratch, or build PyTorch from its source code?

thanks.

Release models for Flowers, Birds and COCO datasets

Hi,

Can you please release pre-trained models for all 3 datasets?

Training results on the birds dataset are not as good as on flowers.

Hi! When I trained on the birds dataset, I found that the results were not good. Have you trained on the birds dataset, and how were the results? Do you have any advice for training on birds?
Hoping for a reply!

Cannot download HDF5 file

Thank you for your code. I want to test it with the HDF5 files for birds and flowers, but the download link is invalid. Can you update the download link?

Thanks!

inception scores?

Can you report the inception scores of the modified model?

How to pass the text as input to the generator model and generate images?

Implement Feature matching

Need advice

Hi,

We trained our model for 200 epochs.
We ran prediction as follows:
python runtime.py --inference --split 2
Images were generated in the 'results' folder,
but all of the generated images were a little off. Please see below:

Can you please tell us what could have gone wrong?

text embeddings preparation

I want to train with my own data. To prepare my dataset, I'll have to convert the text files to .t7 format for the text embeddings. How do I do that?

The loss of generator

Hello, author. I want to reproduce the experimental results. When I trained the model, I found that the generator loss was always high, and it looked like the generator was much weaker than the discriminator. What should I do?

How long do I need to train?

Why don't I get any results after running the runtime script for a long time? There are no errors and no output.

Error on visualize.py

Hello, I really want to use this repository for my deep learning course. After downloading the datasets, I tried to run 'runtime.py' on Google Colab (I don't have a GPU on my own system), and an error occurred. The error is related to 'visualize.py'. I really need this code, so please help me solve it. The error is:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py", line 354, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.7/http/client.py", line 1281, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.7/http/client.py", line 1327, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.7/http/client.py", line 1276, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.7/http/client.py", line 1036, in _send_output
self.send(msg)
File "/usr/lib/python3.7/http/client.py", line 976, in send
self.connect()
File "/usr/local/lib/python3.7/dist-packages/urllib3/connection.py", line 181, in connect
conn = self._new_conn()
File "/usr/local/lib/python3.7/dist-packages/urllib3/connection.py", line 168, in _new_conn
self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fcf91fe6c50>: Failed to establish a new connection: [Errno 111] Connection refused
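
`[Errno 111] Connection refused` typically means that no process is listening at the target host and port. A minimal sketch for checking this before training starts, assuming the visualisation code talks to a server on `localhost`. The port `8097` in the usage comment is an assumption: it is the default port of the Visdom dashboard, and the traceback above does not confirm which server or port is in use. `port_is_open` is a hypothetical helper.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example (host and port are assumptions; adjust to the actual server):
#   port_is_open("localhost", 8097)
```

If the check returns `False`, the server that the plotting code expects has to be started, in the same Colab session, before 'runtime.py' is run.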