leongatys / pytorchneuralstyletransfer
Implementation of Neural Style Transfer in PyTorch
License: MIT License
The imshow method is missing.
I could write one and submit it to you.
This model is no longer available:
http://bethgelab.org/media/uploads/pytorch_models/vgg_conv.pth
I have a local copy if it's useful to re-host somewhere.
MIT or Apache or even BSD. Or of course anything else, it's your code after all. :P Just anything to get a handle on this because as it stands, this code is "look don't touch", which is really a shame. :)
This is indeed a pytorch issue, not the issue of the original code by Leon.
I am trying to speed up the neural style transfer on Nvidia Tesla V100 by using FP16.
I modified the code to move the VGG to cuda().half(). In addition, all three images (the style image, the content image, and opt_img) are in FP16. I tried to keep the loss layers in FP32, because FP16 easily produces NaN and infinity.
The code is at https://gist.github.com/michaelhuang74/009e149a2002b84696731fb599408c90
When I ran the code, I encountered the following error.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
Traceback (most recent call last):
File "neural-style-Gatys-half.py", line 167, in
style_targets = [GramMatrix()(A).detach().cuda() for A in vgg(style_image, style_layers)]
File "/home/mqhuang2/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 319, in call
result = self.forward(input, **kwargs)
File "neural-style-Gatys-half.py", line 86, in forward
G.div_(hw)
RuntimeError: value cannot be converted to type Half without overflow: 960000
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
It seems that although I tried to keep the GramMatrix and loss functions in FP32, PyTorch somehow converts FP32 to FP16 in the GramMatrix forward() method.
Any idea how to resolve this error?
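One way around this, sketched below under the assumption that the overflow comes from dividing a Half tensor by the Python scalar h*w (960000 exceeds FP16's maximum of 65504): up-cast the features to FP32 inside GramMatrix before the division, so the Gram computation always runs in full precision even when the network itself is in FP16.

```python
import torch
import torch.nn as nn

class GramMatrix(nn.Module):
    """Gram matrix computed in FP32 even when the input features are FP16."""
    def forward(self, x):
        b, c, h, w = x.size()
        # Up-cast to float32 first: dividing a Half tensor by a scalar like
        # 960000 overflows the FP16 range (max ~65504) and raises an error.
        F = x.view(b, c, h * w).float()
        G = torch.bmm(F, F.transpose(1, 2))
        G.div_(h * w)
        return G

feats = torch.randn(1, 64, 32, 32).half()   # FP16 features from the network
gram = GramMatrix()(feats)
```
Since the output is FP32, the downstream MSE loss against FP32 targets also stays in full precision.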
I ran the same code, with the output initialized to the content image. When running the optimization with LBFGS, the loss value does not decrease.
Iteration: 50, loss: 479620896.000000
Iteration: 100, loss: 479620896.000000
....
There are no updates to opt_img at all. Is there any reason that this could be happening?
EDIT: There is an exploding gradient as well. I am wondering whether some clamping is required.
And what is the reason that the Gram matrix works so well for style transfer?
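On the intuition question: the Gram matrix measures which feature channels co-activate, summed over all spatial positions, so it captures texture statistics while discarding spatial arrangement. A minimal sketch demonstrating this spatial invariance, with made-up feature shapes:

```python
import torch

def gram(x):
    # x: (c, h, w) feature map; entry (i, j) of the Gram matrix is the
    # inner product of channels i and j over all spatial positions.
    c, h, w = x.size()
    F = x.view(c, h * w)
    return F @ F.t() / (h * w)

x = torch.randn(8, 16, 16)
# Randomly shuffling the spatial positions leaves the Gram matrix unchanged,
# which is why it encodes style/texture but not the content layout.
perm = torch.randperm(16 * 16)
x_shuffled = x.view(8, -1)[:, perm].view(8, 16, 16)
assert torch.allclose(gram(x), gram(x_shuffled), atol=1e-4)
```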
Hello, I am new to PyTorch and this scene in general, but when I run the code I get an error at torch.sum(): "Use tensor.detach().numpy() instead."
I am not familiar with how to detach, and I couldn't find an example of detaching each tensor in the list.
Is the following equivalent code?
for elem in layer_losses[1:]:
    loss += elem
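Assuming loss starts from the first element of the list, the loop should be equivalent to Python's built-in sum over the losses; a quick check with dummy tensors:

```python
import torch

layer_losses = [torch.tensor(0.5), torch.tensor(1.25), torch.tensor(2.0)]

# Loop form, as in the question:
loss = layer_losses[0]
for elem in layer_losses[1:]:
    loss = loss + elem   # out-of-place add is safer for autograd tensors

# Built-in sum form (starts from 0, then adds each tensor):
loss_sum = sum(layer_losses)

assert torch.equal(loss, loss_sum)
```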
The pytorch docs (link) say to normalize images via
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225])
However, the notebook in this repo normalizes via
transforms.Normalize(mean=[0.40760392, 0.45795686, 0.48501961],
                     std=[1, 1, 1])
I am trying to re-create the results from the original paper, so I am just curious about this. Is this method of normalization specific to this task, did the imagenet normalizations for pytorch change over time, or is there some other reason I may be missing?
I am a beginner in ANNs and I am working through your paper on style transfer.
I have a question about the VGG module definition: why is the layer 4 block repeated twice? (See the code below.)
Thanks in advance.
self.conv4_1 = nn.Conv2d(256, 512, kernel_size=3, padding=1)
self.conv4_2 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
self.conv4_3 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
self.conv4_4 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
self.conv4_1 = nn.Conv2d(256, 512, kernel_size=3, padding=1)
self.conv4_2 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
self.conv4_3 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
self.conv4_4 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
May I know how to obtain the style reconstruction and content reconstruction? Thanks!
I found that "vgg_conv.pth" only has the conv layers of the original VGG net, with the fc layers removed. May I ask how this model was generated?
In the GramMatrix class (3rd cell, 7th line), shouldn't we divide the Gram matrix by the product of channels, height, and width, i.e. G.div_(c*h*w), instead of G.div_(h*w)? Or am I missing something?
Edit: I do notice better results with your gram matrix implementation but not sure how this is a correct normalization. 🤔
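For what it's worth, the two normalizations differ only by the constant factor c, so switching between them is equivalent to rescaling the style loss weights; a minimal check:

```python
import torch

x = torch.randn(64, 32, 32)          # (c, h, w) dummy feature map
c, h, w = x.shape
F = x.view(c, h * w)
G = F @ F.t()

G_hw = G / (h * w)                   # the repo's normalization
G_chw = G / (c * h * w)              # the alternative from the question
# They differ only by the constant c, so the choice can be absorbed into
# the per-layer style weights (squared, after the MSE).
assert torch.allclose(G_hw, G_chw * c)
```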
Hi,
The download link https://bethgelab.org/media/uploads/pytorch_models/vgg_conv.pth is currently unavailable. In fact, I think the whole site http://bethgelab.org/ is down.
Could you kindly fix this, or is there another way to download the file?
Thanks!
Hi, I have been debugging the whole day, but I have yet to solve a problem that occurs only with some images.
Below is the error message:
Traceback (most recent call last):
File "NeuralStyleTransfer.py", line 279, in <module>
optimizer.step(closure)
File "/home/user/miniconda3/envs/gatys/lib/python3.6/site-packages/torch/optim/lbfgs.py", line 103, in step
orig_loss = closure()
File "NeuralStyleTransfer.py", line 268, in closure
layer_losses = [weights[a] * loss_fns[a](A, targets[a]) for a, A in enumerate(out)]
File "NeuralStyleTransfer.py", line 268, in <listcomp>
layer_losses = [weights[a] * loss_fns[a](A, targets[a]) for a, A in enumerate(out)]
File "/home/user/miniconda3/envs/gatys/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/user/miniconda3/envs/gatys/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 443, in forward
return F.mse_loss(input, target, reduction=self.reduction)
File "/home/user/miniconda3/envs/gatys/lib/python3.6/site-packages/torch/nn/functional.py", line 2256, in mse_loss
expanded_input, expanded_target = torch.broadcast_tensors(input, target)
File "/home/user/miniconda3/envs/gatys/lib/python3.6/site-packages/torch/functional.py", line 62, in broadcast_tensors
return torch._C._VariableFunctions.broadcast_tensors(tensors)
RuntimeError: The size of tensor a (159) must match the size of tensor b (160) at non-singleton dimension 3
I'd really appreciate any assistance.
ZeroDivisionError Traceback (most recent call last)
in ()
20 return loss
21
---> 22 optimizer.step(closure)
23
24 #display result
~/anaconda3/envs/abc/lib/python3.6/site-packages/torch/optim/lbfgs.py in step(self, closure)
151
152 # update scale of initial Hessian approximation
--> 153 H_diag = ys / y.dot(y) # (y*y)
154
155 # compute the approximate (L-BFGS) inverse Hessian
ZeroDivisionError: float division by zero
Hi,
The link https://bethgelab.org/media/uploads/pytorch_models/vgg_conv.pth from the shell script is currently unavailable. Is there any other way to download it?
Thank you in advance!