leongatys / pytorchneuralstyletransfer
Implementation of Neural Style Transfer in PyTorch
License: MIT License
The imshow method is missing.
I could write one and submit it to you.
This model is no longer available:
http://bethgelab.org/media/uploads/pytorch_models/vgg_conv.pth
I have a local copy if it's useful to re-host somewhere.
MIT or Apache or even BSD. Or of course anything else, it's your code after all. :P Just anything to get a handle on this because as it stands, this code is "look don't touch", which is really a shame. :)
This is indeed a pytorch issue, not the issue of the original code by Leon.
I am trying to speed up the neural style transfer on Nvidia Tesla V100 by using FP16.
I modified the code to move the VGG to cuda().half(). In addition, all three images (the style image, the content image, and opt_img) are in FP16. I tried to keep the loss layers in FP32, because FP16 easily produces NaN and infinity.
The code is at https://gist.github.com/michaelhuang74/009e149a2002b84696731fb599408c90
When I ran the code, I encountered the following error.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
Traceback (most recent call last):
File "neural-style-Gatys-half.py", line 167, in
style_targets = [GramMatrix()(A).detach().cuda() for A in vgg(style_image, style_layers)]
File "/home/mqhuang2/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 319, in call
result = self.forward(input, **kwargs)
File "neural-style-Gatys-half.py", line 86, in forward
G.div_(hw)
RuntimeError: value cannot be converted to type Half without overflow: 960000
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
It seems that although I tried to keep the GramMatrix and loss functions in FP32, PyTorch somehow converts FP32 to FP16 in the GramMatrix forward() method.
Any idea how to resolve this error?
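One way around this, sketched below under the assumption that the overflow comes from dividing a Half tensor by the Python scalar h*w (960000 exceeds FP16's maximum of 65504): up-cast the features to FP32 inside GramMatrix before the division, so the Gram computation always runs in full precision even when the network itself is in FP16.

```python
import torch
import torch.nn as nn

class GramMatrix(nn.Module):
    """Gram matrix computed in FP32 even when the input features are FP16."""
    def forward(self, x):
        b, c, h, w = x.size()
        # Up-cast to float32 first: dividing a Half tensor by a scalar like
        # 960000 overflows the FP16 range (max ~65504) and raises an error.
        F = x.view(b, c, h * w).float()
        G = torch.bmm(F, F.transpose(1, 2))
        G.div_(h * w)
        return G

feats = torch.randn(1, 64, 32, 32).half()   # FP16 features from the network
gram = GramMatrix()(feats)
```
Since the output is FP32, the downstream MSE loss against FP32 targets also stays in full precision.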
I ran the same code, with the output initialized to the content image. When running the optimization with LBFGS, the loss value does not decrease.
Iteration: 50, loss: 479620896.000000
Iteration: 100, loss: 479620896.000000
....
There are no updates to opt_img at all. Is there any reason that this could be happening?
EDIT: There is an exploding gradient as well. I am wondering whether some clamping is required.
And what is the reason that the Gram matrix works so well for style transfer?
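On the intuition question: the Gram matrix measures which feature channels co-activate, summed over all spatial positions, so it captures texture statistics while discarding spatial arrangement. A minimal sketch demonstrating this spatial invariance, with made-up feature shapes:

```python
import torch

def gram(x):
    # x: (c, h, w) feature map; entry (i, j) of the Gram matrix is the
    # inner product of channels i and j over all spatial positions.
    c, h, w = x.size()
    F = x.view(c, h * w)
    return F @ F.t() / (h * w)

x = torch.randn(8, 16, 16)
# Randomly shuffling the spatial positions leaves the Gram matrix unchanged,
# which is why it encodes style/texture but not the content layout.
perm = torch.randperm(16 * 16)
x_shuffled = x.view(8, -1)[:, perm].view(8, 16, 16)
assert torch.allclose(gram(x), gram(x_shuffled), atol=1e-4)
```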
Hello, I am new to PyTorch and this scene in general, but when I run the code I get an error at torch.sum(): "Use tensor.detach().numpy() instead."
I am not familiar with how to detach, and I couldn't find an example of detaching each tensor in the list.
Is the following equivalent code?
for elem in layer_losses[1:]:
    loss += elem
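Assuming loss starts from the first element of the list, the loop should be equivalent to Python's built-in sum over the losses; a quick check with dummy tensors:

```python
import torch

layer_losses = [torch.tensor(0.5), torch.tensor(1.25), torch.tensor(2.0)]

# Loop form, as in the question:
loss = layer_losses[0]
for elem in layer_losses[1:]:
    loss = loss + elem   # out-of-place add is safer for autograd tensors

# Built-in sum form (starts from 0, then adds each tensor):
loss_sum = sum(layer_losses)

assert torch.equal(loss, loss_sum)
```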
The pytorch docs (link) say to normalize images via
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225])
However, the notebook in this repo normalizes via
transforms.Normalize(mean=[0.40760392, 0.45795686, 0.48501961],
                     std=[1, 1, 1])
I am trying to re-create the results from the original paper, so I am just curious about this. Is this method of normalization specific to this task, did the imagenet normalizations for pytorch change over time, or is there some other reason I may be missing?
I am a beginner in ANNs and I am working through your paper on style transfer.
I have a question about the VGG module definition: why is the layer 4 block repeated twice? (See the code below.)
Thanks in advance.
self.conv4_1 = nn.Conv2d(256, 512, kernel_size=3, padding=1)
self.conv4_2 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
self.conv4_3 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
self.conv4_4 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
self.conv4_1 = nn.Conv2d(256, 512, kernel_size=3, padding=1)
self.conv4_2 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
self.conv4_3 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
self.conv4_4 = nn.Conv2d(512, 512, kernel_size=3, padding=1)
May I know how to obtain the style reconstruction and content reconstruction? Thanks!
I found that "vgg_conv.pth" only has the conv layers of the original VGG net, with the fc layers removed. May I ask how this model was generated?
In the GramMatrix class (3rd cell, 7th line), shouldn't we divide the Gram matrix by the product of channels, height, and width, i.e. G.div_(c*h*w), instead of G.div_(h*w)? Or am I missing something?
Edit: I do notice better results with your gram matrix implementation but not sure how this is a correct normalization. 🤔
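For what it's worth, the two normalizations differ only by the constant factor c, so switching between them is equivalent to rescaling the style loss weights; a minimal check:

```python
import torch

x = torch.randn(64, 32, 32)          # (c, h, w) dummy feature map
c, h, w = x.shape
F = x.view(c, h * w)
G = F @ F.t()

G_hw = G / (h * w)                   # the repo's normalization
G_chw = G / (c * h * w)              # the alternative from the question
# They differ only by the constant c, so the choice can be absorbed into
# the per-layer style weights (squared, after the MSE).
assert torch.allclose(G_hw, G_chw * c)
```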
Hi,
The download link https://bethgelab.org/media/uploads/pytorch_models/vgg_conv.pth is currently unavailable. In fact, I think the whole site http://bethgelab.org/ is down.
Could you kindly fix this, or is there another way to download the file?
Thanks!
Hi, I have been debugging the whole day, but I have yet to solve a problem that occurs only with some images.
Below is the error message:
Traceback (most recent call last):
File "NeuralStyleTransfer.py", line 279, in <module>
optimizer.step(closure)
File "/home/user/miniconda3/envs/gatys/lib/python3.6/site-packages/torch/optim/lbfgs.py", line 103, in step
orig_loss = closure()
File "NeuralStyleTransfer.py", line 268, in closure
layer_losses = [weights[a] * loss_fns[a](A, targets[a]) for a, A in enumerate(out)]
File "NeuralStyleTransfer.py", line 268, in <listcomp>
layer_losses = [weights[a] * loss_fns[a](A, targets[a]) for a, A in enumerate(out)]
File "/home/user/miniconda3/envs/gatys/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/user/miniconda3/envs/gatys/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 443, in forward
return F.mse_loss(input, target, reduction=self.reduction)
File "/home/user/miniconda3/envs/gatys/lib/python3.6/site-packages/torch/nn/functional.py", line 2256, in mse_loss
expanded_input, expanded_target = torch.broadcast_tensors(input, target)
File "/home/user/miniconda3/envs/gatys/lib/python3.6/site-packages/torch/functional.py", line 62, in broadcast_tensors
return torch._C._VariableFunctions.broadcast_tensors(tensors)
RuntimeError: The size of tensor a (159) must match the size of tensor b (160) at non-singleton dimension 3
I'd really appreciate any assistance.
ZeroDivisionError Traceback (most recent call last)
in ()
20 return loss
21
---> 22 optimizer.step(closure)
23
24 #display result
~/anaconda3/envs/abc/lib/python3.6/site-packages/torch/optim/lbfgs.py in step(self, closure)
151
152 # update scale of initial Hessian approximation
--> 153 H_diag = ys / y.dot(y) # (y*y)
154
155 # compute the approximate (L-BFGS) inverse Hessian
ZeroDivisionError: float division by zero
Hi,
The link https://bethgelab.org/media/uploads/pytorch_models/vgg_conv.pth from the shell script is currently unavailable. Is there any other way to download it?
Thank you in advance!