I'm Kamyar
I am a lifelong coder with expertise in C++, C#, Java, Python, JavaScript, and TypeScript, most recently focused on Cloud Architecture, Deep Learning in Computer Vision, and Generative Models.
EdgeConnect: Structure Guided Image Inpainting using Edge Prediction, ICCV 2019 https://arxiv.org/abs/1901.00212
License: Other
Hi Knazeri,
I am trying out your code, but no matter whether I use the Places2 or CelebA dataset from the examples folder, the training procedure ends after one epoch, as shown in the screenshot. And of course, no models were saved in the /checkpoints folder.
Have you run into this problem before?
I really appreciate your help!
Hi,
Thanks a lot for the brilliant work. :)
I just wonder why the default weight of the style loss is 1. I noticed that in the paper you use 250 as the style loss weight.
So if I want to train a model on my own dataset, should I set the value to 250?
I want to train the model on the ImageNet dataset. I organized the training image paths into ImageNet_train.flist and set MODEL=1, MASK=1. I get the following bug when loading the dataset, even though the sizes of the input images I checked are shown below. Is it necessary to define a collate_fn for the DataLoader? Could you help me? Thank you very much!
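For what it's worth, a minimal sketch of one way around the usual cause of this error (my own assumption about the cause, not repo code): the default collate function cannot stack images of different sizes, so a custom collate_fn can resize every tensor to a common shape before stacking.

```python
import torch
import torch.nn.functional as F

TARGET_SIZE = 256  # hypothetical training resolution, not a repo option

def resize_collate(batch):
    # Each dataset item is a tuple of tensors (e.g. image, gray, edge, mask);
    # resize all of them to a common size so torch.stack can batch them.
    resized = [
        tuple(F.interpolate(t.unsqueeze(0), size=(TARGET_SIZE, TARGET_SIZE),
                            mode='nearest').squeeze(0) for t in item)
        for item in batch
    ]
    return tuple(torch.stack(ts) for ts in zip(*resized))
```

Passing collate_fn=resize_collate to the DataLoader would then batch variable-sized ImageNet images; resizing inside the Dataset itself works just as well.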
Hi,
I cannot use multiple GPUs to train the model... there is no DataParallel call anywhere in the code.
Do you have plans to support multi-GPU training?
Thank you.
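For reference, a minimal sketch of what multi-GPU support could look like (generic PyTorch, not this repo's code; the Conv2d is a stand-in for a generator):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for one of the generators.
generator = nn.Conv2d(3, 64, kernel_size=3, padding=1)

# nn.DataParallel replicates the module on every visible GPU, splits the
# input batch along dim 0, and gathers the outputs back on GPU 0.
if torch.cuda.device_count() > 1:
    generator = nn.DataParallel(generator)
if torch.cuda.is_available():
    generator = generator.cuda()
```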
Thank you for your excellent work. I encountered two problems:
First, when I was training on the Places dataset and the training reached 1000 iterations, the model was not saved and the training did not continue any further.
The second problem is that when I use my own dataset, some images fail to load, even though my flist file is generated correctly and the image paths are correct. This error does not occur when using the datasets from your paper.
I hope you can help me.
Hi @knazeri ,
How can I reproduce the Figure 10 results from the paper?
I think I need to create an edge map and test with --model=2 and --edge=...
Could you give me an example? Thanks.
Hi Kamyar,
I am wondering if it is possible to use your trained model as a pre-trained model and continue training with my own data. If so, how do I update the pre-trained model? Is it the same as the training part of your instructions? I ask because I still can't get past that 1000-iterations problem (training just gets stuck there and will not go beyond the 1001st iteration). I'm thinking that starting from your existing model and then training on my data for 999 iterations may work better.
Training epoch: 1
Traceback (most recent call last):
  File "train.py", line 2, in <module>
    main(mode=1)
  File "/code/edge-connect-master/main.py", line 56, in main
    model.train()
  File "/code/edge-connect-master/src/edge_connect.py", line 179, in train
    self.edge_model.backward(e_gen_loss, e_dis_loss)
  File "/code/edge-connect-master/src/models.py", line 152, in backward
    gen_loss.backward()
  File "/usr/local/lib/python3.5/dist-packages/torch/tensor.py", line 102, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/usr/local/lib/python3.5/dist-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
Other models do not produce this error.
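For context, a minimal self-contained sketch of the usual cause of this error (tiny stand-in modules, not the repo's actual models): if the discriminator's loss is backpropagated through the generator's graph first, the shared buffers are freed and the generator's backward() then fails; detaching the fake sample keeps the two graphs separate.

```python
import torch
import torch.nn as nn

G, D = nn.Linear(8, 8), nn.Linear(8, 1)  # stand-ins for generator/discriminator
z = torch.randn(4, 8)

fake = G(z)
dis_loss = D(fake.detach()).mean()  # detach: D's loss owns its own graph
dis_loss.backward()                 # frees only D's buffers
gen_loss = (-D(fake)).mean()        # fresh pass through D; G's graph is intact
gen_loss.backward()                 # succeeds without retain_graph=True
```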
Hello, I don't understand how to generate the edge map that is fed to G1 when training G1.
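For reference, a minimal sketch of how such an edge map can be generated (hypothetical paths; the paper computes Canny edges from the grayscale image):

```python
import imageio
from skimage.color import rgb2gray
from skimage.feature import canny

img = imageio.imread('input.png')        # hypothetical input image
edges = canny(rgb2gray(img), sigma=2.0)  # boolean H x W edge map
imageio.imwrite('edge.png', (edges * 255).astype('uint8'))
```

A smaller sigma yields denser edges, a larger one sparser edges.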
Hi, I am new to deep learning and PyTorch, so I have some difficulty reading the code and a few questions about edge_connect.py.
After loading the dataset ("self.train_dataset = Dataset(config, config.TRAIN_FLIST, config.TRAIN_EDGE_FLIST, config.TRAIN_MASK_FLIST, augment=True, training=True)"), self.train_dataset holds the train, edge, and mask file paths.
But then training begins, and I see that inside "for items in train_loader" each item becomes four kinds of data (images, images_gray, edges, masks)?
Can you solve my problem? Thanks!
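For what it's worth, here is a minimal sketch of how a Dataset can store only file paths yet yield four tensors per item (hypothetical code, not the repo's Dataset):

```python
import imageio
import torch
from skimage.color import rgb2gray
from skimage.feature import canny
from torch.utils.data import Dataset

class InpaintSketchDataset(Dataset):
    """Stores file paths; __getitem__ derives all four tensors on the fly,
    which is why the train_loader yields (images, images_gray, edges, masks)."""

    def __init__(self, image_paths, mask_paths):
        self.image_paths, self.mask_paths = image_paths, mask_paths

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, i):
        img = imageio.imread(self.image_paths[i]) / 255.0   # H x W x 3
        gray = rgb2gray(img)                                # H x W
        edge = canny(gray, sigma=2.0).astype('float64')     # H x W
        mask = imageio.imread(self.mask_paths[i]) / 255.0   # H x W
        t = lambda a: torch.from_numpy(a).float()
        return (t(img).permute(2, 0, 1),  # 3 x H x W image
                t(gray)[None],            # 1 x H x W grayscale
                t(edge)[None],            # 1 x H x W edge map
                t(mask)[None])            # 1 x H x W binary mask
```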
Hello!
There are 'img_celeba.7z', 'img_align_celeba_png.7z', and 'img_align_celeba.zip'; which one did you use?
In the testing phase, I found a new cryptic bug: the edge output size of EdgeModel mismatches the input image size. For instance, the sizes of the input images and masks are both [256, 265], but after being handled by the EdgeModel, the edge output size turns into [256, 264]. I guess the odd number 265 is the key to the problem. Maybe we could add a resize() in the test() function.
images size is torch.Size([1, 3, 256, 265]),
edges size is torch.Size([1, 1, 256, 265]),
masks size is torch.Size([1, 1, 256, 265])
images_masked size torch.Size([1, 3, 256, 265])
edges size after `EdgeModel` is torch.Size([1, 1, 256, 264])
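If it helps, here is a minimal sketch of one possible workaround (my guess at the cause, not a repo fix): the generator downsamples twice, so spatial sizes that are not multiples of 4 shrink on the way back up; padding the input and cropping the output preserves odd sizes like 265.

```python
import torch
import torch.nn.functional as F

def pad_to_multiple(x, m=4):
    # Pad H and W up to the next multiple of m so that two stride-2
    # downsamplings followed by two upsamplings restore the input size.
    h, w = x.shape[-2:]
    pad_h, pad_w = (m - h % m) % m, (m - w % m) % m
    return F.pad(x, (0, pad_w, 0, pad_h)), (h, w)

images = torch.randn(1, 3, 256, 265)
padded, (h, w) = pad_to_multiple(images)  # 256x265 -> 256x268
# output = edge_model(padded, ...)        # run the model on the padded input
# output = output[..., :h, :w]            # crop back to 256x265
```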
Setting GPU ids in "main.py" has no effect on choosing the GPU, while setting them in "edge_connect.py" is effective.
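For what it's worth, the usual explanation (generic PyTorch/CUDA behaviour, my assumption about what happens in main.py): the environment variable only takes effect if it is set before CUDA is initialized.

```python
import os

# CUDA_VISIBLE_DEVICES is read when CUDA is first initialized, so it must
# be set before torch makes its first CUDA call; setting it afterwards
# silently does nothing.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

import torch  # import (and any .cuda() call) only after the line above

print(torch.cuda.device_count())  # now reports only the selected device
```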
Thank you for the great work and the inspiration for inpainting research! 👍
I've been experimenting with your model on ImageNet and I can basically confirm that your model is perfect for inpainting images in which edges are crucial (e.g. buildings) but sometimes does poorly on general scenes such as the examples provided in the paper (page 8, Figure 9).
Q1: Do you think that training/finetuning on ImageNet could resolve this issue to some extent? Have you tried other approaches to resolve it (e.g. a modified adversarial loss, an additional discriminator, modified edge detection, etc.)?
Since your model is applicable to higher resolutions, I tried to continue training at 512x512 using your pretrained model.
Q2: Do you think this approach is valid, or is it better to train the whole model from scratch?
Q3: I'm curious about your thoughts on using edge-connect at higher resolutions and with bigger masks. Do you have any suggestions for architectural modifications, or can you think of any other steps that should probably be taken beyond the provided code?
Thanks!
Line 31 in 552d260
Since you use InstanceNorm, initializing BatchNorm is meaningless.
Line 36 in 698509d
Hello, I am wondering if there should be a minus sign in the formula of the discriminator's hinge loss?
According to https://arxiv.org/pdf/1802.05957.pdf, equation (17),
V_D = E[min(0, -1+D(real_data))] + E[min(0, -1-D(G(z)))]
But here, it's
relu(1 + outputs).mean() = max(0, 1 + outputs).mean() ~= E[max(0, 1-D(real_data))] + E[max(0, 1+D(G(z)))]
I guess a minus sign should be added to achieve the same behaviour?
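For reference, one algebraic identity that bears on this (my own derivation, not from the repo): max(0, 1 - D(x)) = -min(0, -1 + D(x)) and max(0, 1 + D(x)) = -min(0, -1 - D(x)), so minimizing E[max(0, 1-D(real_data))] + E[max(0, 1+D(G(z)))] is the same as maximizing V_D as written above; the two forms differ only in whether the objective is stated as a maximization for D or as a loss to minimize.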
Hi @knazeri ,
I have a question: have you ever tried any learning-rate policy when training image inpainting models, such as cosine decay, a step policy, etc.?
Do you think these policies can help such models converge better?
Besides, thank you for your contributions! I have benefited a lot from this repo!
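For anyone curious, a minimal sketch of one such policy in PyTorch (hypothetical model and optimizer, not the repo's training loop):

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

model = torch.nn.Linear(8, 8)  # stand-in for a generator
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# Cosine decay anneals the LR from 1e-4 down to eta_min over T_max steps.
scheduler = CosineAnnealingLR(optimizer, T_max=100000, eta_min=1e-6)

for step in range(100):  # stand-in for the training loop
    optimizer.step()      # ...after backward() on the real loss
    scheduler.step()      # decay once per iteration
```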
hello @knazeri
After finishing training the edge and inpaint models, I trained the inpaint-with-edge model as your paper describes. But the intermediate results are strange; what problem could cause the output to be blank?
Below is one of the sample images:
When I run it with python3 train.py --model 1 --checkpoints ./checkpoints/places2/
an error occurs: loading error: ./datasets/irregular_mask/disocclusion_img_mask/36173.png
I am sure the image exists at this path, so I checked the code and found it needs data, edge_data, and mask_data, but for Places2 we have only one of these? Where can I get the data? Thank you very much.
I have encountered a dataset loading error (screenshot: file:///home/skj/%E6%A1%8C%E9%9D%A2/2019-02-27%2012-32-52%E5%B1%8F%E5%B9%95%E6%88%AA%E5%9B%BE.png), can you help?
Thank you for your excellent work. In your paper, you introduced how to get 256x256 images from the original images for CelebA and Paris StreetView, but for Places2, how do you get the 256x256 images?
Hi @knazeri ,
I just tested with the Places2 pre-trained models and model 4, but the results are not very good.
Did I do something wrong? Or how can I improve the performance? Thanks!
(Attached: the input image, the mask, and the result.)
Hello, I ran into some trouble when training the edge model with the code's parameters ('LR': 0.0001, 'D2G_LR': 0.1), and I don't understand the difference between LR and D2G_LR.
The gen_loss has also been turbulent.
For the edge image, the top is the output of the edge model and the bottom is the ground truth.
I think the D2G_LR is too big; could you help me?
Thanks for your work! I have some questions about the code in edge_connect.py. In your code, the input = (images * (1 - masks)) + masks, which means the white pixels in the irregular mask are the holes; that is different from "Image Inpainting for Irregular Holes Using Partial Convolutions". I observed that most irregular masks have large white regions, so for most inputs only a small region remains, and the predicted edges are not perfect.
I think maybe changing input = (images * (1 - masks)) + masks to input = (images * masks) + masks would be better?
Sorry to trouble you, but could you give me the link to the Paris StreetView dataset?
The author did not respond to me.
My e-mail is [email protected]
Thank you!
I want to train the model on my own dataset. How can I make the mask file list and the edge file list? Could you help me? Thanks!
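In case it helps, a minimal sketch of building such a list (my assumption about the .flist format: one image path per line; the paths are hypothetical):

```python
import glob
import os

# Collect every mask image under a hypothetical directory and write one
# path per line, the format the TRAIN_*_FLIST config options consume.
mask_paths = sorted(glob.glob(os.path.join('/path/to/my/masks', '*.png')))
with open('masks_train.flist', 'w') as f:
    f.write('\n'.join(mask_paths))
```

The same pattern works for the edge file list, pointing at a directory of precomputed edge maps.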
In the training phase, the call to model.train() is missing after the calls to sample() and eval(); model.train() is only called at the start of each epoch.
edge-connect/src/edge_connect.py
Line 93 in 97338a2
edge-connect/src/edge_connect.py
Line 94 in 97338a2
Once sample() is called, all of the following training iterations run under eval() mode! Though there are no dropout or batchNorm layers in the network, the call to train() is still necessary to avoid some confusing bugs.
And I strongly recommend using with torch.no_grad() to wrap the validation or test phase, such as the sample(), eval(), or test() functions here, in order to save memory and speed things up. Refer to the discussion about no_grad().
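To illustrate the recommendation (generic PyTorch, not the repo's own test()):

```python
import torch

model = torch.nn.Linear(8, 8)       # stand-in module
model.eval()                        # inference behaviour for the layers
with torch.no_grad():               # no autograd graph is built inside
    out = model(torch.randn(4, 8))  # saves memory and runs faster
model.train()                       # restore training mode, per the issue
```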
Hi @knazeri ,
Thank you for your contributions to this field!
My concern is that it takes too long (2-3 days) to train a model on CelebA/PSV.
I wonder whether pretraining on a larger dataset (e.g., Places2) and then fine-tuning on CelebA/PSV can speed up training, as is common in other fields (e.g., classification)?
But the image distribution of CelebA (human faces) is very different from that of Places2 (natural scenes).
Thanks,
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location='cpu' to map your storages to the CPU.
In src/model.py, line 26:
data = torch.load(self.gen_weights_path)
change to:
data = torch.load(self.gen_weights_path, map_location=lambda storage, loc: storage)
Loading EdgeModel generator...
Loading InpaintingModel generator...
start testing...
1 places2_01.png
2 places2_02.png
3 places2_03.png
4 places2_04.png
5 places2_05.png
End test....
OK
When I run the command
python ./scripts/fid_score.py --path /userhome/inpaint_bord/data/places2_gt_1000/ /userhome/edge-connect-master/checkpoints/results/
(where /userhome/inpaint_bord/data/places2_gt_1000/ contains 1000 real images and /userhome/edge-connect-master/checkpoints/results/ contains 1000 inpainted images), the process seizes up and stops.
the log is like :
calculate path1 statistics...
calculate path2 statistics...
./scripts/fid_score.py:86: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  batch = Variable(batch, volatile=True)
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:2351: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:2423: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
Am I using it correctly to test FID? And if I want to test FID on CelebA faces, should I also use the Inception model trained on ImageNet, or retrain the model on CelebA faces?
Hello!
I have some questions. When I use your examples, I can achieve the same effect, but when I use my own images there are some problems. Can you tell me how I can get the corresponding mask images?
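In case it helps, a minimal sketch of creating such a mask (my assumption about the format: a grayscale image the same size as the input, with white marking the region to inpaint):

```python
import imageio
import numpy as np

# A 256x256 all-black mask with a white square marking the hole.
mask = np.zeros((256, 256), dtype=np.uint8)
mask[96:160, 96:160] = 255
imageio.imwrite('my_mask.png', mask)
```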
Hi Knazeri, the code for getting images_masked is as follows:
images_masked = (images * (1 - masks).float()) + masks
The hole areas in the image are filled with 1. I also read some other inpainting code and found that some of it fills the holes with 0. Is the inpainting result affected by filling the hole areas with 1 or 0?
Thanks for your interesting work on image inpainting! ⭐
I have a few questions about the discriminator:
What are the pros and cons of the three GAN loss functions (nsgan, lsgan, and hinge) for edge-connect? Is there any visual/numerical comparison or reason to choose among them?
The default GAN loss is nsgan. Are the pretrained models all trained with nsgan?
I like your implementation of the adversarial loss
Lines 31 to 43 in 698509d
If model 4 (the joint model) is used to fine-tune model 1 (the edge model) and model 2 (the inpaint model), what is model 3 (the inpainting model) for? If I don't want to train the edge model, should I train model 3 directly?
Hi @knazeri ,
Would you mind providing the dataset splits (train/val/test) for CelebA/PSV/Places2 used in the paper?
I am confused by section B of the supplemental material: do you use all the images for training? For example, the 14,900 PSV images include the test data.
Thanks,
yanhong.
Hi.
Thanks for your great work.
I wonder which GPU you used, and how much time you spent training on each dataset?
Do you have any suggestions on the training order of the datasets? Should I start with Places2 or the others, and why?
THX!
Why do I get the following error when I use the pre-trained model to test the image you provided?
RuntimeError: Error(s) in loading state_dict for EdgeGenerator:
Missing key(s) in state_dict: "encoder.1.weight", "encoder.4.weight", "encoder.7.weight", "middle.0.conv_block.1.weight", "middle.0.conv_block.5.weight", "middle.1.conv_block.1.weight", "middle.1.conv_block.5.weight", "middle.2.conv_block.1.weight", "middle.2.conv_block.5.weight", "middle.3.conv_block.1.weight", "middle.3.conv_block.5.weight", "middle.4.conv_block.1.weight", "middle.4.conv_block.5.weight", "middle.5.conv_block.1.weight", "middle.5.conv_block.5.weight", "middle.6.conv_block.1.weight", "middle.6.conv_block.5.weight", "middle.7.conv_block.1.weight", "middle.7.conv_block.5.weight", "decoder.0.weight", "decoder.3.weight".
Unexpected key(s) in state_dict: "encoder.1.weight_v", "encoder.4.weight_v", "encoder.7.weight_v", "middle.0.conv_block.1.weight_v", "middle.0.conv_block.5.weight_v", "middle.1.conv_block.1.weight_v", "middle.1.conv_block.5.weight_v", "middle.2.conv_block.1.weight_v", "middle.2.conv_block.5.weight_v", "middle.3.conv_block.1.weight_v", "middle.3.conv_block.5.weight_v", "middle.4.conv_block.1.weight_v", "middle.4.conv_block.5.weight_v", "middle.5.conv_block.1.weight_v", "middle.5.conv_block.5.weight_v", "middle.6.conv_block.1.weight_v", "middle.6.conv_block.5.weight_v", "middle.7.conv_block.1.weight_v", "middle.7.conv_block.5.weight_v", "decoder.0.weight_v", "decoder.3.weight_v".
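For context, a generic reading of this kind of key mismatch (my assumption, not a confirmed diagnosis of this checkpoint): a module wrapped in spectral_norm stores weight_orig/weight_u/weight_v in its state_dict instead of weight, so the model being loaded must wrap the same layers before load_state_dict.

```python
import torch.nn as nn
from torch.nn.utils import spectral_norm

ckpt = spectral_norm(nn.Conv2d(3, 64, 3)).state_dict()  # has 'weight_v' keys
wrapped = spectral_norm(nn.Conv2d(3, 64, 3))
wrapped.load_state_dict(ckpt)  # keys match; a plain Conv2d would raise this error
```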
Hi @knazeri ,
To train the model, do I have to train the edge model, inpaint model, and joint model in sequence?
Or can I train the models at the same time?
Thanks for your time!
Hello @knazeri , thanks for your code!
In your paper, some inpainting results are shown. They are really cool, but the results in the pictures have different styles. That is, some results are nearly the same as the original picture while others are not (but still visually perfect). For example, in Fig 15, the following results:
In all three images, the eyes (or mouth) are missing. For the first image, they are recovered as a synthesized person, while for rows 2-3 the recovered images are actually the same as the original images.
My questions are:
Q1: Are the images selected from a test (validation) set that is not used for training? Or is only the mask separated into training/val/test?
Q2: Is there any overlapping person between the train/val/test sets? Intuitively, the results in rows 2-3 above are due to the same person existing in both the train and test sets.
Thanks!
Could you also share the pretrained models with fixed bbox masks for the three datasets?
Hi @knazeri ,
Thanks for your great work!
I have two questions about Canny edge detection.
I have downloaded the irregular mask dataset from http://masc.cs.gmu.edu/wiki/partialconv, but I found that the mask region is flagged in black, while the mask color in the paper is white. Do I need to convert it? (A conversion sketch follows this issue.)
When I was trying the test example in Getting Started -> 2) Testing, I found that the command
python test.py \
  --checkpoints ./checkpoints/places2 \
  --input ./examples/places2/images \
  --mask ./examples/places2/mask \
  --output ./checkpoints/results
raises an error when reading the mask image. After checking the paths in this command, the --mask directory should be './examples/places2/masks'; everything runs well after that modification.
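On the first question above, a minimal sketch of the black-to-white conversion (my assumption, based on the paper's figures, that the model expects white to mark the holes):

```python
import imageio

# Hypothetical path into the downloaded irregular mask dataset: if the
# masks flag holes in black while the model expects white holes, a
# simple inversion converts between the two conventions.
mask = imageio.imread('disocclusion_img_mask/00001.png')
imageio.imwrite('mask_white_holes.png', 255 - mask)
```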
Hi, thank you for your amazing work and contribution to the field.
I'm very interested in your work and I'm currently trying to reproduce the results on the Places2 dataset.
I noticed that there are two parts to Places2: the standard one and the Places2 challenge one.
So I would like to know which part(s) you used for training your model. :)
Line 99 in 552d260
torch.nn.utils.spectral_norm(module, name='weight', n_power_iterations=1, eps=1e-12, dim=None)