progamergov / neural-style-pt

PyTorch implementation of neural style transfer algorithm

License: MIT License

neural-style pytorch style-transfer deep-learning neural-style-pt neural-style-transfer nst styletransfer pytorch-style-transfer deep-style

neural-style-pt's Introduction

neural-style-pt


This is a PyTorch implementation of the paper A Neural Algorithm of Artistic Style by Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge. The code is based on Justin Johnson's Neural-Style.

The paper presents an algorithm for combining the content of one image with the style of another image using convolutional neural networks. Here's an example that maps the artistic style of The Starry Night onto a night-time photograph of the Stanford campus:

Applying the style of different images to the same content image gives interesting results. Here we reproduce Figure 2 from the paper, which renders a photograph of Tübingen, Germany in a variety of styles:

Here are the results of applying the style of various pieces of artwork to this photograph of the Golden Gate Bridge:

Content / Style Tradeoff

The algorithm allows the user to trade-off the relative weight of the style and content reconstruction terms, as shown in this example where we port the style of Picasso's 1907 self-portrait onto Brad Pitt:

Style Scale

By resizing the style image before extracting style features, you can control the types of artistic features that are transferred from the style image; use the -style_scale flag to set this. Below we see three examples of rendering the Golden Gate Bridge in the style of The Starry Night. From left to right, -style_scale is 2.0, 1.0, and 0.5.
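For example, the rightmost rendering described above would correspond to a command roughly like the following (the image paths are assumed for illustration and may differ in your checkout):

python neural_style.py -content_image examples/inputs/golden_gate.jpg -style_image examples/inputs/starry_night.jpg -style_scale 0.5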

Multiple Style Images

You can use more than one style image to blend multiple artistic styles.

Clockwise from upper left: "The Starry Night" + "The Scream", "The Scream" + "Composition VII", "Seated Nude" + "Composition VII", and "Seated Nude" + "The Starry Night"

Style Interpolation

When using multiple style images, you can control the degree to which they are blended:
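For example, the blend is set with the -style_blend_weights flag described under Options below; this hypothetical command weights The Starry Night more heavily than The Scream (file names are illustrative):

python neural_style.py -content_image <image.jpg> -style_image starry_night.jpg,the_scream.jpg -style_blend_weights 7,3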

Transfer style but not color

If you add the flag -original_colors 1 then the output image will retain the colors of the original image.
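For example, a hypothetical invocation that keeps the content image's colors (image names are placeholders):

python neural_style.py -content_image <image.jpg> -style_image <image.jpg> -original_colors 1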

Setup:

Dependencies:

  • PyTorch

Optional dependencies:

  • For CUDA backend:
    • CUDA 7.5 or above
  • For cuDNN backend:
    • cuDNN v6 or above
  • For ROCm backend:
    • ROCm 2.1 or above
  • For MKL backend:
    • MKL 2019 or above
  • For OpenMP backend:
    • OpenMP 5.0 or above

After installing the dependencies, you'll need to run the following script to download the VGG model:

python models/download_models.py

This will download the original VGG-19 and VGG-16 models; the VGG-19 model is used by default.

If you have a GPU with less memory, using the NIN ImageNet model is a better choice; it gives slightly worse yet comparable results. You can get the details on the model from the BVLC Caffe ModelZoo. The NIN model is downloaded when you run the download_models.py script.

You can find detailed installation instructions for Ubuntu and Windows in the installation guide.

Usage

Basic usage:

python neural_style.py -style_image <image.jpg> -content_image <image.jpg>

cuDNN usage with NIN Model:

python neural_style.py -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/nin_imagenet.pth -gpu 0 -backend cudnn -num_iterations 1000 -seed 123 -content_layers relu0,relu3,relu7,relu12 -style_layers relu0,relu3,relu7,relu12 -content_weight 10 -style_weight 500 -image_size 512 -optimizer adam

cuDNN NIN Model Picasso Brad Pitt

To use multiple style images, pass a comma-separated list like this:

-style_image starry_night.jpg,the_scream.jpg

Note that paths to images should not contain the ~ character to represent your home directory; you should instead use a relative path or a full absolute path.

Options:

  • -image_size: Maximum side length (in pixels) of the generated image. Default is 512.
  • -style_blend_weights: The weight for blending the style of multiple style images, as a comma-separated list, such as -style_blend_weights 3,7. By default all style images are equally weighted.
  • -gpu: Zero-indexed ID of the GPU to use; for CPU mode set -gpu to c.

Optimization options:

  • -content_weight: How much to weight the content reconstruction term. Default is 5e0.
  • -style_weight: How much to weight the style reconstruction term. Default is 1e2.
  • -tv_weight: Weight of total-variation (TV) regularization; this helps to smooth the image. Default is 1e-3. Set to 0 to disable TV regularization.
  • -num_iterations: Default is 1000.
  • -init: Method used to initialize the generated image; one of random or image. Default is random, which uses a noise initialization as in the paper; image initializes with the content image.
  • -init_image: Replaces the initialization image with a user specified image.
  • -optimizer: The optimization algorithm to use; either lbfgs or adam; default is lbfgs. L-BFGS tends to give better results, but uses more memory. Switching to ADAM will reduce memory usage; when using ADAM you will probably need to play with other parameters to get good results, especially the style weight, content weight, and learning rate.
  • -learning_rate: Learning rate to use with the ADAM optimizer. Default is 1e1.
  • -normalize_gradients: If this flag is present, style and content gradients from each layer will be L1 normalized.
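For example, a hypothetical ADAM run with an explicit learning rate and re-tuned weights (the specific values are illustrative starting points, not recommendations):

python neural_style.py -content_image <image.jpg> -style_image <image.jpg> -optimizer adam -learning_rate 1e1 -content_weight 5e0 -style_weight 1e3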

Output options:

  • -output_image: Name of the output image. Default is out.png.
  • -print_iter: Print progress every print_iter iterations. Set to 0 to disable printing.
  • -save_iter: Save the image every save_iter iterations. Set to 0 to disable saving intermediate results.

Layer options:

  • -content_layers: Comma-separated list of layer names to use for content reconstruction. Default is relu4_2.
  • -style_layers: Comma-separated list of layer names to use for style reconstruction. Default is relu1_1,relu2_1,relu3_1,relu4_1,relu5_1.

Other options:

  • -style_scale: Scale at which to extract features from the style image. Default is 1.0.
  • -original_colors: If you set this to 1, then the output image will keep the colors of the content image.
  • -model_file: Path to the .pth file for the VGG Caffe model. Default is the original VGG-19 model; you can also try the original VGG-16 model.
  • -pooling: The type of pooling layers to use; one of max or avg. Default is max. The VGG-19 model uses max pooling layers, but the paper mentions that replacing these layers with average pooling layers can improve the results. I haven't been able to get good results using average pooling, but the option is here.
  • -seed: An integer value that you can specify for repeatable results. By default this value is random for each run.
  • -multidevice_strategy: A comma-separated list of layer indices at which to split the network when using multiple devices. See Multi-GPU scaling for more details.
  • -backend: nn, cudnn, openmp, or mkl. Default is nn. mkl requires Intel's MKL backend.
  • -cudnn_autotune: When using the cuDNN backend, pass this flag to use the built-in cuDNN autotuner to select the best convolution algorithms for your architecture. This will make the first iteration a bit slower and can take a bit more memory, but may significantly speed up the cuDNN backend.

Frequently Asked Questions

Problem: The program runs out of memory and dies

Solution: Try reducing the image size: -image_size 256 (or lower). Note that different image sizes will likely require non-default values for -style_weight and -content_weight for optimal results. If you are running on a GPU, you can also try running with -backend cudnn to reduce memory usage.

Problem: -backend cudnn is slower than default NN backend

Solution: Add the flag -cudnn_autotune; this will use the built-in cuDNN autotuner to select the best convolution algorithms.

Problem: Get the following error message:

Missing key(s) in state_dict: "classifier.0.bias", "classifier.0.weight", "classifier.3.bias", "classifier.3.weight". Unexpected key(s) in state_dict: "classifier.1.weight", "classifier.1.bias", "classifier.4.weight", "classifier.4.bias".

Solution: Due to a mix-up with layer locations, older models require a fix to be compatible with newer versions of PyTorch. The included download_models.py script will automatically perform these fixes after downloading the models.

Memory Usage

By default, neural-style-pt uses the nn backend for convolutions and L-BFGS for optimization. These give good results, but can both use a lot of memory. You can reduce memory usage with the following:

  • Use cuDNN: Add the flag -backend cudnn to use the cuDNN backend. This will only work in GPU mode.
  • Use ADAM: Add the flag -optimizer adam to use ADAM instead of L-BFGS. This should significantly reduce memory usage, but may require tuning of other parameters for good results; in particular you should play with the learning rate, content weight, and style weight. This should work in both CPU and GPU modes.
  • Reduce image size: If the above tricks are not enough, you can reduce the size of the generated image; pass the flag -image_size 256 to generate an image at half the default size.
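For example, a hypothetical low-memory invocation combining all three suggestions (using the example images referenced earlier in this README):

python neural_style.py -content_image examples/inputs/brad_pitt.jpg -style_image examples/inputs/picasso_selfport1907.jpg -backend cudnn -optimizer adam -image_size 256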

With the default settings, neural-style-pt uses about 3.7 GB of GPU memory on my system; switching to ADAM and cuDNN reduces the GPU memory footprint to about 1GB.

Speed

Speed can vary a lot depending on the backend and the optimizer. Here are some times for running 500 iterations with -image_size=512 on a Tesla K80 with different settings:

  • -backend nn -optimizer lbfgs: 117 seconds
  • -backend nn -optimizer adam: 100 seconds
  • -backend cudnn -optimizer lbfgs: 124 seconds
  • -backend cudnn -optimizer adam: 107 seconds
  • -backend cudnn -cudnn_autotune -optimizer lbfgs: 109 seconds
  • -backend cudnn -cudnn_autotune -optimizer adam: 91 seconds

Here are the same benchmarks on a GTX 1080:

  • -backend nn -optimizer lbfgs: 56 seconds
  • -backend nn -optimizer adam: 38 seconds
  • -backend cudnn -optimizer lbfgs: 40 seconds
  • -backend cudnn -optimizer adam: 40 seconds
  • -backend cudnn -cudnn_autotune -optimizer lbfgs: 23 seconds
  • -backend cudnn -cudnn_autotune -optimizer adam: 24 seconds

Multi-GPU scaling

You can use multiple CPU and GPU devices to process images at higher resolutions; different layers of the network will be computed on different devices. You can control which GPU and CPU devices are used with the -gpu flag, and you can control how to split layers across devices using the -multidevice_strategy flag.

For example in a server with four GPUs, you can give the flag -gpu 0,1,2,3 to process on GPUs 0, 1, 2, and 3 in that order; by also giving the flag -multidevice_strategy 3,6,12 you indicate that the first two layers should be computed on GPU 0, layers 3 to 5 should be computed on GPU 1, layers 6 to 11 should be computed on GPU 2, and the remaining layers should be computed on GPU 3. You will need to tune the -multidevice_strategy for your setup in order to achieve maximal resolution.
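For example, the split described above corresponds to a command roughly like this (content and style paths are assumed for illustration):

python neural_style.py -content_image examples/inputs/tubingen.jpg -style_image examples/inputs/starry_night.jpg -gpu 0,1,2,3 -multidevice_strategy 3,6,12 -image_size 2048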

We can achieve very high quality results at high resolution by combining multi-GPU processing with multiscale generation as described in the paper Controlling Perceptual Factors in Neural Style Transfer by Leon A. Gatys, Alexander S. Ecker, Matthias Bethge, Aaron Hertzmann and Eli Shechtman.

Here is a 4016 x 2213 image generated on a server with eight Tesla K80 GPUs:

The script used to generate this image can be found here.

Implementation details

Images are initialized with white noise and optimized using L-BFGS.

We perform style reconstructions using the conv1_1, conv2_1, conv3_1, conv4_1, and conv5_1 layers and content reconstructions using the conv4_2 layer. As in the paper, the five style reconstruction losses have equal weights.
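As a rough illustration of the style term, the loss at each chosen layer compares Gram matrices of the feature maps. This is a minimal sketch under that description, not the repository's exact code; the function names are my own:

import torch
import torch.nn.functional as F

def gram_matrix(feat):
    # feat: a C x H x W feature map from one VGG layer (batch dimension removed)
    c, h, w = feat.size()
    flat = feat.view(c, h * w)
    return torch.mm(flat, flat.t())

def style_layer_loss(gen_feat, style_feat):
    # Mean-squared distance between Gram matrices, as in Gatys et al.;
    # the per-layer losses are then summed with equal weights.
    return F.mse_loss(gram_matrix(gen_feat), gram_matrix(style_feat))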

Citation

If you find this code useful for your research, please cite it using the provided citation.

neural-style-pt's People

Contributors

progamergov, rrmina


neural-style-pt's Issues

Memory

I have a 1080 ti with 11 GB of VRAM and I am running into this:

RuntimeError: CUDA out of memory. Tried to allocate 148.00 MiB (GPU 0; 11.00 GiB total capacity; 8.73 GiB already allocated; 44.32 MiB free; 77.81 MiB cached)

I am doing literally nothing else with my computer except styling an image, but it seems that the image size I can process is rather small given my "available" VRAM. I put available in quotes because I know it exists, but I am uncertain why so little of it is actually free.

-init_image doesn't work?

-init_image doesn't seem to work, or at least not how I expect it to. The second image here is 100 iterations into optimization after being called with the first image as the -init_image arg. The only change in arguments is a slightly higher -style_weight. It doesn't look any different from how it would if it were initialized randomly.


Tiling

How hard would it be to get tiling? I have been working on doing this, but I am struggling to figure out how to style the tiles that I have from a large image. I'm trying not to do them one at a time, but rather with some sort of for-loop.

Multiple Style Images

Is there some way to change what styles get used where in the final image? I thought that maybe reordering the style images would change the output (the same as it works for style layers), but I got the same output each time. I am aware of the different style weightings for multiple input styles, but that does not change the location of the style in the final image, only its size/intensity.

normalize_weights

Does normalize_weights work the same as normalize_gradients from jcjohnson? I have seen you posting about setting content_weight to 0 in order to achieve the same effect as normalize_gradients, though. So if this is not the case, what does normalize_weights do?

Error Message caffe2_nvrtc.dll

I keep getting an error message related to this caffe2_nvrtc.dll when I run the command:

neural-style -style_image C:\\Users\\crovn\\Desktop\\Brooklyn_Bridge_Manhattan.jpg -content_image C:\\Users\\crovn\\Desktop\\MonaLisaOriginal.jpg -output_image C:\\Users\\crovn\\Desktop\\profile.png

The error message that I get reads:

Traceback (most recent call last):
  File "c:\users\crovn\appdata\local\programs\python\python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\crovn\appdata\local\programs\python\python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\crovn\AppData\Roaming\Python\Python38\Scripts\neural-style.exe\__main__.py", line 4, in <module>
  File "C:\Users\crovn\AppData\Roaming\Python\Python38\site-packages\neural_style\neural_style.py", line 5, in <module>
    import torch
  File "c:\users\crovn\appdata\local\programs\python\python38\lib\site-packages\torch\__init__.py", line 81, in <module>
    ctypes.CDLL(dll)
  File "c:\users\crovn\appdata\local\programs\python\python38\lib\ctypes\__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
FileNotFoundError: Could not find module 'c:\users\crovn\appdata\local\programs\python\python38\lib\site-packages\torch\lib\caffe2_nvrtc.dll' (or one of its dependencies). Try using the full path with constructor syntax.

When I go to the file path c:\users\crovn\appdata\local\programs\python\python38\lib\site-packages\torch\lib in Windows 10's file explorer, it shows that the file caffe2_nvrtc.dll is in fact present, so I don't understand why this isn't working. I am using Python 3.8 with the Thonny IDE on Windows 10. To install PyTorch I used the command pip3.8 install torch===1.5.1 torchvision===0.6.1 -f https://download.pytorch.org/whl/torch_stable.html.

How do I fix this problem?

High resolution output?

Hey! Glad to see you're still working on neural-style. Very much intrigued by the PyTorch implementation, so I dropped a copy on my windows box to play around with.

I was curious if you've continued to experiment with scaling and producing high resolution output and how you may have gone about achieving that with the PyTorch implementation. Any differences to report? Any changes in the GPU memory ceilings / resolution limits? Any new scaling scripts to share?

Missing final A7 reference

Love the script and it runs smoothly. However I received an error which pointed to a file not being available. I did some digging and unless I'm mistaken, the last line of code should be pointing to A7.png instead of .png.

!python neural-style-pt/neural_style.py -style_image '/content/gdrive/MyDrive/NSPT/style/style1' -style_weight 1500 -style_scale 0.5 -content_image '/content/gdrive/MyDrive/NSPT/input/input.jpg' -content_weight 0 -init image -init_image '/content/gdrive/MyDrive/NSPT/output/.png' -learning_rate 1 -print_iter 50 -save_iter 0 -image_size 7200 -num_iterations 10 -model_file '/content/gdrive/MyDrive/NSPT/checkpoints/nin_imagenet.pth' -content_layers relu0,relu1 -style_layers relu0,relu1 -optimizer adam -output_image '/content/gdrive/MyDrive/NSPT/output/A8-FINAL.png' -tv_weight 0 -original_colors 0 -backend cudnn

from '/content/gdrive/MyDrive/NSPT/output/.png'

to '/content/gdrive/MyDrive/NSPT/output/A7.png'

2nd file not found when using a list of 2 style images

First of all, congratulations on your nice work!

I always have an error "file not found" on the 2nd file of the list, when using a list of 2 style images:

python -u ./neural_style.py -content_image "/c/Users/A455435/w/m/content/IN.jpg" -style_image "/c/Users/A455435/w/m/style/mix/A.jpg,/c/Users/A455435/w/m/style/mix/B.jpg" -style_blend_weights 5,5 -output_image "/c/Users/A455435/w/m/output/IN(Albena_04XX)px100,cw5e0,sw2e2,ss0.8,oc0,it2,i.jpg" -print_iter 1 -save_iter 0 -image_size 100 -num_iterations 2 -content_weight 5e0 -style_weight 2e2 -style_scale 0.8 -original_colors 0 -init image -seed 7 -gpu c
VGG-19 Architecture Detected
Successfully loaded models/vgg19-d01eb7cb.pth
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
Traceback (most recent call last):
  File "./neural_style.py", line 468, in <module>
    main()
  File "./neural_style.py", line 75, in main
    img_caffe = preprocess(image, style_size).type(dtype)
  File "./neural_style.py", line 336, in preprocess
    image = Image.open(image_name).convert('RGB')
  File "C:\Users\A455435\AppData\Local\Programs\Python\Python37\lib\site-packages\PIL\Image.py", line 2843, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/c/Users/A455435/w/m/style/mix/B.jpg'

While the file exists:
$ ls /c/Users/A455435/w/m/style/mix/B.jpg
/c/Users/A455435/w/m/style/mix/B.jpg

Any idea what the reason could be?

Regards
Jerome

Fork implementing multi-region spatial control

First of all, this is a great repo! It seems a bit faster and more memory efficient than the original lua-based neural-style.

I've made a fork of this repo trying to add masked style transfer as described by Gatys, and going off of the gist you wrote for the lua version.

I've almost got it working, but my implementation is suffering from two bugs. The first is that, testing with two style images and segmentations, my implementation seems only to get gradients for the first mask but not the second.

So for example, the following command:

python neural_style.py -backend cudnn -style_image examples/inputs/cubist.jpg,examples/inputs/starry_night.jpg -style_seg examples/segments/cubist.png,examples/segments/starry_night.png -content_seg examples/segments/monalisa.png -color_codes white,black

produces the following output:

out1

where the first style (cubist) and its corresponding segmentation get good gradients and work in the mask provided, but the second mask (starry night) has little or no gradient signal.

By simply swapping the order of the style images, as in:

python neural_style.py -backend cudnn -style_image examples/inputs/starry_night.jpg,examples/inputs/cubist.jpg -style_seg examples/segments/starry_night.png,examples/segments/cubist.png -content_seg examples/segments/monalisa.png -color_codes white,black

I get the opposite effect where only the starry night style works and the cubist style in its mask is not there.

out2

I have been trying to debug this, checking the masks, and everything seems right to me; I can't figure out the problem. This is almost a PyTorch mirror of what you made in your gist, which does appear to work fine. I'm not sure if there's some typo I'm missing or something deeper.

Additionally, loss.backward() without keeping the gradients with retain_graph=True produces a runtime error (RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.), which makes me think I set up the graph wrong.

If you are able to see what I'm doing wrong so that we can fix it, I'd love to see this implemented in PyTorch. I think it would be a really nice addition to the repo.

Artifacts compared to Lua version

Coming back to this repo after a long break, interested in developing this further...

I was just comparing this implementation to the original Lua version and noticed something I hadn't before. The two produce almost identical results, but the PyTorch version appears to produce very subtle artifacts.

The following is an example, using Hokusai as the style image. On the left side is neural-style (lua), on the right side is neural-style-pt.


Notice in scattered places the presence of high-frequency discolorations, often in almost checkerboard-like patterns. These do not appear in the Lua version. If you zoom in on a few parts of the neural-style-pt versions, you can see them clearly. Notice the pink and green checkers.


This generally happens consistently for any combination of content and style images, although for some style images the artifacts are more obvious. Sometimes obvious discolorations appear, other times they are smaller, giving the output an almost grainy appearance. The artifacts can be reduced by increasing -tv_weight but at the expense of content/style reconstruction, and even then it's still visible.

I tried fixing it a few ways. Clamping the image between iterations (not just at the end) didn't fix it. I tried playing with the TVLoss module. For example, changing

self.loss = self.strength * (torch.sum(torch.abs(self.x_diff)) + torch.sum(torch.abs(self.y_diff)))

to an L2-loss, i.e.

self.loss = self.strength * (torch.sum(torch.pow(self.x_diff, 2)) + torch.sum(torch.pow(self.y_diff, 2)))

also did not get rid of the artifact. (I tried this because my reading of the TV loss formula is that they use an L2 loss, not absolute values, but I'm not sure this makes a big difference.)

The artifact is very subtle, but I'm hoping to fix it, as I'd like to produce more print-quality images in the future, and multi-stage or multi-scaled techniques on top may amplify the artifact. I wonder if you have any idea what might be causing this or what could potentially fix it.

Running on older GPU

Great work! I am trying to install the application and I have been running into issues because my GPU is too old for the latest PyTorch version.

What version of python and pytorch are you using?

I am starting from this docker image built with the Dockerfile containing:

FROM nvidia/cuda:8.0-cudnn5-devel-ubuntu14.04
ENV MINICONDA /opt/miniconda
ENV PATH ${MINICONDA}/bin:$PATH
RUN apt-get update && apt-get install -y wget git
RUN wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -P /tmp
RUN bash /tmp/Miniconda3-latest-Linux-x86_64.sh -b -p $MINICONDA
RUN rm /tmp/Miniconda3-latest-Linux-x86_64.sh
RUN conda install -y pytorch=0.3.0.0 torchvision -c pytorch
RUN mkdir /app 
WORKDIR /app
RUN git clone https://github.com/ProGamerGov/neural-style-pt.git 
WORKDIR /app/neural-style-pt
RUN python models/download_models.py

but I get this error when I run: python neural_style.py -gpu 0 -backend cudnn -print_iter 1

VGG-19 Architecture Detected
Successfully loaded models/vgg19-d01eb7cb.pth
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
Setting up style layer 2: relu1_1
Setting up style layer 7: relu2_1
Setting up style layer 12: relu3_1
Setting up style layer 21: relu4_1
Setting up content layer 23: relu4_2
Setting up style layer 30: relu5_1
Capturing content targets
nn.Sequential (
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> (25) -> (26) -> (27) -> (28) -> (29) -> (30) -> (31) -> (32) -> (33) -> (34) -> (35) -> (36) -> (37) -> output]
  (1): nn.TVLoss
  (2): nn.Conv2d (3 -> 64, 3x3, 1,1, 1,1)
  (3): nn.ReLU
  (4): nn.StyleLoss
  (5): nn.Conv2d (64 -> 64, 3x3, 1,1, 1,1)
  (6): nn.ReLU
  (7): nn.MaxPool2d(2x2, 2,2)
  (8): nn.Conv2d (64 -> 128, 3x3, 1,1, 1,1)
  (9): nn.ReLU
  (10): nn.StyleLoss
  (11): nn.Conv2d (128 -> 128, 3x3, 1,1, 1,1)
  (12): nn.ReLU
  (13): nn.MaxPool2d(2x2, 2,2)
  (14): nn.Conv2d (128 -> 256, 3x3, 1,1, 1,1)
  (15): nn.ReLU
  (16): nn.StyleLoss
  (17): nn.Conv2d (256 -> 256, 3x3, 1,1, 1,1)
  (18): nn.ReLU
  (19): nn.Conv2d (256 -> 256, 3x3, 1,1, 1,1)
  (20): nn.ReLU
  (21): nn.Conv2d (256 -> 256, 3x3, 1,1, 1,1)
  (22): nn.ReLU
  (23): nn.MaxPool2d(2x2, 2,2)
  (24): nn.Conv2d (256 -> 512, 3x3, 1,1, 1,1)
  (25): nn.ReLU
  (26): nn.StyleLoss
  (27): nn.Conv2d (512 -> 512, 3x3, 1,1, 1,1)
  (28): nn.ReLU
  (29): nn.ContentLoss
  (30): nn.Conv2d (512 -> 512, 3x3, 1,1, 1,1)
  (31): nn.ReLU
  (32): nn.Conv2d (512 -> 512, 3x3, 1,1, 1,1)
  (33): nn.ReLU
  (34): nn.MaxPool2d(2x2, 2,2)
  (35): nn.Conv2d (512 -> 512, 3x3, 1,1, 1,1)
  (36): nn.ReLU
  (37): nn.StyleLoss
)
Traceback (most recent call last):
  File "neural_style.py", line 409, in <module>
    main()
  File "neural_style.py", line 149, in main
    net(content_image)
  File "/opt/miniconda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/miniconda/lib/python3.6/site-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/opt/miniconda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/miniconda/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 277, in forward
    self.padding, self.dilation, self.groups)
  File "/opt/miniconda/lib/python3.6/site-packages/torch/nn/functional.py", line 90, in conv2d
    return f(input, weight, bias)
TypeError: argument 0 is not a Variable

Any idea? Is there a hard dependency on cuda 9.1 cudnn 7.1?

Print seed used?

Could it be added to print the seed used? Useful for when you want to reproduce the result of a random seed after the fact.

I believe it is as simple as

print(f"Seed used: {torch.seed()}")

But I'm not familiar enough with Torch or ML libraries in general to know if this is the correct seed and/or if there are side effects to this.
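For reference, a minimal sketch of what explicit seeding and printing usually looks like in PyTorch; the variable names are hypothetical and this is not the repository's code:

import random
import torch

seed = random.randrange(2**32)  # hypothetical: draw a seed when the user did not pass -seed
torch.manual_seed(seed)         # seed torch's RNG so the run is reproducible
print("Seed used:", seed)       # log it so the run can be repeated later with -seed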

Image Size

Would it be possible to have the default output image size be the same size as the content image?

Various Tests could use settings recommendations

Hi, thanks again for providing this software.

I just did a bunch of various tests, mainly messing around with content weight and style weight to try to achieve various looks. I just posted a blog post about my findings, which will be a guide for me in the future on what settings I want to use to get the result I'm looking for. I figured I would share it here, if that's OK, and ask if you have any other suggestions on options I could try.

https://ibareitall.com/neural-style-transfer-tests/


Problem with channel_pruning and other models

Hello @ProGamerGov

Thanks for this great repo. VGG and NIN models work like a charm, but using Voltax3 from u/vic8760, I encountered problems with channel_pruning and nyud-fcn32s-color-heavy.
With channel_pruning, it returns me this log:

NIN Architecture Detected
Traceback (most recent call last):
  File "neural_style.py", line 409, in <module>
    main()
  File "neural_style.py", line 56, in main
    cnn, layerList = loadCaffemodel(params.model_file, params.pooling, params.gpu)  
  File "/home/GitHub/style_transfer/pytorch_style/CaffeLoader.py", line 136, in loadCaffemodel
    cnn.load_state_dict(torch.load(model_file))
  File "/home/anaconda3/envs/style/lib/python3.6/site-packages/torch/nn/modules/module.py", line 845, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for NIN:
	Missing key(s) in state_dict: "features.4.weight", "features.4.bias", "features.9.weight", "features.9.bias", "features.11.weight", "features.11.bias", "features.16.weight", "features.16.bias", "features.18.weight", "features.18.bias", "features.22.weight", "features.22.bias". 
	Unexpected key(s) in state_dict: "classifier.0.weight", "classifier.0.bias", "classifier.3.weight", "classifier.3.bias", "features.5.weight", "features.5.bias", "features.10.weight", "features.10.bias", "features.12.weight", "features.12.bias", "features.17.weight", "features.17.bias", "features.19.weight", "features.19.bias", "features.21.weight", "features.21.bias", "features.28.weight", "features.28.bias". 
	size mismatch for features.0.weight: copying a param with shape torch.Size([24, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([96, 3, 11, 11]).
	size mismatch for features.0.bias: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([96]).
	size mismatch for features.2.weight: copying a param with shape torch.Size([22, 24, 3, 3]) from checkpoint, the shape in current model is torch.Size([96, 96, 1, 1]).
	size mismatch for features.2.bias: copying a param with shape torch.Size([22]) from checkpoint, the shape in current model is torch.Size([96]).
	size mismatch for features.7.weight: copying a param with shape torch.Size([51, 41, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 96, 5, 5]).
	size mismatch for features.7.bias: copying a param with shape torch.Size([51]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for features.14.weight: copying a param with shape torch.Size([111, 89, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 256, 3, 3]).
	size mismatch for features.14.bias: copying a param with shape torch.Size([111]) from checkpoint, the shape in current model is torch.Size([384]).
	size mismatch for features.24.weight: copying a param with shape torch.Size([512, 228, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 1024, 1, 1]).
	size mismatch for features.24.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
	size mismatch for features.26.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([1000, 1024, 1, 1]).
	size mismatch for features.26.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1000]).

It looks like the saved model doesn't have the expected structure.
And with nyud-fcn32s-color-heavy, the message isn't clear to me:

Traceback (most recent call last):
  File "neural_style.py", line 409, in <module>
    main()
  File "neural_style.py", line 56, in main
    cnn, layerList = loadCaffemodel(params.model_file, params.pooling, params.gpu)  
  File "/home/GitHub/style_transfer/pytorch_style/CaffeLoader.py", line 135, in loadCaffemodel
    cnn, layerList = modelSelector(str(model_file).lower(), pooling)
  File "/home/GitHub/style_transfer/pytorch_style/CaffeLoader.py", line 119, in modelSelector
    raise ValueError("Model architecture not recognized.")
ValueError: Model architecture not recognized.

The wiki says that these models work. I tried different style and content layer names. Maybe I am doing something wrong?

Thanks in advance

Understanding Multidevice Strategy

I have been trying to figure out how to max out both of the GPUs in my system.

Tue Oct 13 15:15:00 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  TITAN RTX           Off  | 00000000:01:00.0 Off |                  N/A |
| 41%   41C    P8    15W / 280W |    292MiB / 24220MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    Off  | 00000000:02:00.0 Off |                  N/A |
| 21%   50C    P8     6W / 180W |      2MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2061      G   /usr/lib/xorg/Xorg                191MiB |
|    0   N/A  N/A      2745      G   ...mviewer/tv_bin/TeamViewer       13MiB |
|    0   N/A  N/A      2949      G   /usr/bin/gnome-shell               83MiB |
+-----------------------------------------------------------------------------+

GPU 0 has the most memory,

I'm trying to understand -multidevice_strategy: how many layers are there? It's not very clear to me what would be best for 2 GPUs, one with more memory than the other, or at least what a good starting point would be.

I have just tried the value of 20 and this was the result:


Sorry dumb question about running the script.

How do I run the scripts through Conda? When I try to run it through Conda, it just keeps trying to open Bash and run it through that.

I edited the starry_stanford.sh script and am trying to run it; could someone help me out, please?

Feature: Play with Covariance instead of Gram?

I watched a presentation from Artomatix and they have some arguments for using a covariance loss instead of a Gram loss. You can flip between the two (I think this is correct) by doing:

import torch
import torch.nn as nn

class GramMatrix(nn.Module):

    def forward(self, input):
        B, C, H, W = input.size()
        x_flat = input.view(C, H * W)

        # Add this line for the covariance loss: subtract the per-channel mean
        x_flat = x_flat - x_flat.mean(1).unsqueeze(1)

        return torch.mm(x_flat, x_flat.t())

I didn't experiment with it much, but using the default content/style at 1024, you get these:
Gram Loss:

Covariance Loss:

I wouldn't say it's better, but it is interesting that it adds more texture to the sky. Might it have some utility for more textured styles?

Thoughts?

P.S. Nice implementation! Happy there's a true-to-jcjohnson, CUDA 10, PyTorch implementation.

Can't use commands.

C:\Users\USER\Desktop\AI Workspace\DeepDream\neural-style-pt-master>-style_image C:\Users\USER\Desktop\AI Workspace\DeepDream\neural-style-pt-master.golden_gate.jpg
'-style_image' is not recognized as an internal or external command,
operable program or batch file.
Am I using this wrong, or is it broken?

Unpickling stack underflow

When I try to run models/download_models.py, I get the following error:
Downloading the VGG-19 model
Traceback (most recent call last):
  File "models/download_models.py", line 10, in <module>
    sd = load_url("https://web.eecs.umich.edu/~justincj/models/vgg19-d01eb7cb.pth")
  File "/home/wroge/anaconda3/envs/neural-style-env/lib/python3.8/site-packages/torch/hub.py", line 559, in load_state_dict_from_url
    return torch.load(cached_file, map_location=map_location)
  File "/home/wroge/anaconda3/envs/neural-style-env/lib/python3.8/site-packages/torch/serialization.py", line 595, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/wroge/anaconda3/envs/neural-style-env/lib/python3.8/site-packages/torch/serialization.py", line 764, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: unpickling stack underflow

The num_corrections default value is actually 100, not 0

In neural-style-pt (and the original neural-style.lua, actually), if params.lbfgs_num_correction is not set, the default is 0 (line 29). Then, the optimizer's history_size is only set if params.lbfgs_num_correction > 0:

       #line 56
        if params.lbfgs_num_correction > 0:
            optim_state['history_size'] = params.lbfgs_num_correction

However, the torch.optim.LBFGS class uses a default value of 100 if history_size is not set:

torch.optim.LBFGS(params, lr=1, max_iter=20, max_eval=None, tolerance_grad=1e-05, tolerance_change=1e-09, history_size=100, line_search_fn=None)

See the lbfgs section:
https://pytorch.org/docs/stable/optim.html

Thus, if somebody sets lbfgs_num_correction to 0 (the default) expecting to save memory, they will actually get a history size of 100.
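A minimal sketch of one way to make the value explicit (hypothetical function and parameter names, not the repository's code):

import torch

def make_lbfgs(img, num_iterations, lbfgs_num_correction):
    # Fall back explicitly to PyTorch's default of 100 when the CLI value is 0,
    # so the history size that is actually used is never ambiguous.
    history_size = lbfgs_num_correction if lbfgs_num_correction > 0 else 100
    return torch.optim.LBFGS([img], max_iter=num_iterations, history_size=history_size)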

init_image

What does it mean to replace the initialization image?

Error, But Only When Certain Image Combos Are Used

I get the following error, but only when using specific image combinations. I can use the content with other styles or style with other content. Very confusing.

Traceback (most recent call last):
  File "D:\Neural Style Python\style_transfer_GUI.py", line 513, in <module>
    main()
  File "D:\Neural Style Python\style_transfer_GUI.py", line 306, in main
    optimizer.step(feval)
  File "C:\Users\levic\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\optim\lbfgs.py", line 307, in step
    orig_loss = closure()
  File "D:\Neural Style Python\style_transfer_GUI.py", line 286, in feval
    net(img)
  File "C:\Users\levic\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\levic\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\container.py", line 92, in forward
    input = module(input)
  File "C:\Users\levic\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\levic\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\conv.py", line 345, in forward
    return self.conv2d_forward(input, self.weight)
  File "C:\Users\levic\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\conv.py", line 342, in conv2d_forward
    self.padding, self.dilation, self.groups)
RuntimeError: CUDA out of memory. Tried to allocate 8.16 GiB (GPU 0; 11.00 GiB total capacity; 439.38 MiB already allocated; 8.16 GiB free; 126.62 MiB cached)

Saving Models

So I am 99% certain that the answer is that we cannot save models, but I have to ask: is it possible? Don't get me wrong, I get way better results here than I do with something like the repo below, but it would help with styling larger images and videos. If it's not possible, no worries, still love it.

https://github.com/lengstrom/fast-style-transfer

Multigpu Capability

Fantastic work! Managed to get it to run on Win10.

Just curious: is it capable of the -multigpu argument like the version that runs on torch7?

AttributeError: 'list' object has no attribute 'double' when normalizing tensor

When running the script I run into the following error. All goes well until the normalize call on the tensor. Hope this is enough logging to be of help:

VGG-19 Architecture Detected
Successfully loaded models/vgg19-d01eb7cb.pth
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
Traceback (most recent call last):
  File "neural_style.py", line 409, in <module>
    main()
  File "neural_style.py", line 58, in main
    content_image = preprocess(params.content_image, params.image_size).type(dtype)
  File "neural_style.py", line 293, in preprocess
    tensor = Normalize(rgb2bgr(Loader(image) * 256)).unsqueeze(0)
  File "C:\Users\Rwar\Anaconda3\lib\site-packages\torchvision\transforms\transforms.py", line 61, in __call__
    img = t(img)
  File "C:\Users\Rwar\Anaconda3\lib\site-packages\torchvision\transforms\transforms.py", line 164, in __call__
    return F.normalize(tensor, self.mean.double(), self.std.double(), self.inplace)
AttributeError: 'list' object has no attribute 'double'

Image Size to VRAM Relationship

How large of an image can people make using the L-BFGS optimizer? And how much VRAM are you working with? I max out somewhere around 1664 when using -cudnn_autotune on a 1080ti with 11 gigs.

RuntimeError: CUDA error: invalid device ordinal with starry_stanford.sh

Hi, I'm trying to run the script linked below to see if my system can handle it and create larger images based upon your script.

I added -optimizer adam and am using the NIN model for lower-memory GPUs.

Here is my output that fails eventually...

RuntimeError: CUDA error: invalid device ordinal
NIN Architecture Detected
Successfully loaded models/nin_imagenet.pth
conv1: 96 3 11 11
cccp1: 96 96 1 1
cccp2: 96 96 1 1
conv2: 256 96 5 5
cccp3: 256 256 1 1
cccp4: 256 256 1 1
conv3: 384 256 3 3
cccp5: 384 384 1 1
cccp6: 384 384 1 1
conv4-1024: 1024 384 3 3
cccp7-1024: 1024 1024 1 1
cccp8-1024: 1000 1024 1 1
Traceback (most recent call last):
  File "/home/gateway/work/neural-style-software/neural-style-pt/neural_style.py", line 468, in <module>
    main()
  File "/home/gateway/work/neural-style-software/neural-style-pt/neural_style.py", line 157, in main
    net = setup_multi_device(net)
  File "/home/gateway/work/neural-style-software/neural-style-pt/neural_style.py", line 328, in setup_multi_device
    new_net = ModelParallel(net, params.gpu, params.multidevice_strategy)
  File "/home/gateway/work/neural-style-software/neural-style-pt/CaffeLoader.py", line 110, in __init__
    self.chunks = self.chunks_to_devices(self.split_net(net, device_splits.split(',')))
  File "/home/gateway/work/neural-style-software/neural-style-pt/CaffeLoader.py", line 134, in chunks_to_devices
    chunk.to(self.device_list[i])
  File "/home/gateway/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 426, in to
    return self._apply(convert)
  File "/home/gateway/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
    module._apply(fn)
  File "/home/gateway/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 224, in _apply
    param_applied = fn(param)
  File "/home/gateway/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 424, in convert
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: invalid device ordinal

https://github.com/ProGamerGov/neural-style-pt/blob/master/examples/scripts/starry_stanford.sh

Nvidia info

By the way, I'm using GPU 1 since it has the most memory and is not driving the primary display.

(base) gateway@gateway-media:~/work/neural-style-software/neural-style-pt$ nvidia-smi
Tue May  5 14:34:21 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   54C    P8     4W / 120W |    195MiB /  6078MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    Off  | 00000000:02:00.0 Off |                  N/A |
| 21%   56C    P8     6W / 180W |      2MiB /  8119MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2161      G   /usr/lib/xorg/Xorg                           101MiB |
|    0      2661      G                                                 11MiB |
|    0      2856      G   /usr/bin/gnome-shell                          77MiB |
+-----------------------------------------------------------------------------+
(base) gateway@gateway-media:~/work/neural-style-software/neural-style-pt$ 

thoughts?

Procedure for converting caffe models to .pth

I assume that the vgg16 and vgg19 models in the download scripts are the same models as in the initial Lua implementation, except that they were converted to PyTorch. Do you recall how you did that?

I ask partly because I'm interested in doing the same, but also because I'm curious whether the mean values used in the preprocess and deprocess Normalization steps correspond to the correct models -- I'm seeing artifacts that I hadn't seen in the Lua implementation and am trying to track them down.

(Nice work, btw!)

Where to run the scripts?

Sorry for noob question, the usage guide seems to assume that the user knows how to use Python. I have managed to install Python 3.7, Anaconda and Pytorch but I'm stuck at running "python models/download_models.py" because I don't know where to run it. If I run it in Python 3.7 I get a syntax error, and running it in Anaconda gives this error:

(base) C:\Users\User>python models/download_models.py
python: can't open file 'models/download_models.py': [Errno 2] No such file or directory

Also I don't understand the usage part, it just tells you that this is how you use it:

python neural_style.py -style_image <image.jpg> -content_image <image.jpg>

Where do you type this line?

Thanks

No images found after running script

The script was working fine for me for months. When I tried to generate some new images it ran as it usually does and went through all the iterations. But the output files are nowhere to be found. I even tried giving a full path in -output. Any ideas??

My script:

python neural_style.py -style_image /mnt/e/GFX\ Textures/Patterns/Photoshop\ Pattern\ Pictures/4.jpg -content_image /media/max/90E83424E8340ACC/Users/Ben\ Stiller/Documents/Photoshop\ Projects/penguins/penguin-29.png -output_image /home/max/Documents/github_downloads/neural-style-pt/penguin-45.png -model_file models/nin_imagenet.pth -gpu 0 -backend cudnn -num_iterations 1000 -seed 123 -content_layers relu0,relu3,relu7,relu12 -style_layers relu0,relu3,relu7,relu12 -content_weight 10 -style_weight 500 -image_size 600 -optimizer adam

How to reduce the batch size?

I want to reduce the batch size in order to get around the dreaded "CUDA out of memory" error. I want to do this to be able to generate much larger output images, at least ~10meg. I don't see an argument for this, so I thought perhaps it is hard-coded?

FileNotFound error, as well as a few other errors.


I'm getting these errors no matter the picture combo I used.

I tried lowering the resolutions of the pictures in the script because I thought it was failing to compute, and therefore failing to save which caused the next function to error, but lowering the resolution didn't fix it.

I'm running on CUDA with CUDnn, and I'm running it on an i7-7700k + RTX 2080 Super. I've run higher res non-script style transfers that haven't failed though so I'm not too sure what the problem may be.

I thought it may be because of edits I made to the starry_stanford.sh script, but I redownloaded and ran with default parameters, and it still failed with the exact same errors.

Using without CUDA?

It's been a while, but I used to be able to run https://github.com/jcjohnson/neural-style without CUDA support (MacBook Pro).

Is this project able to do the same?
I see AssertionError: Torch not compiled with CUDA enabled after a successful install and am unsure if CUDA is required or how to disable it.

Access forbidden to VGG-19 model

Issue

VGG-19 cannot be downloaded using the URL to the S3 bucket.

Code

$ python models/download_models.py

Downloading the VGG-19 model
Downloading: "https://s3-us-west-2.amazonaws.com/jcjohns-models/vgg19-d01eb7cb.pth" to /Users/louis/.cache/torch/checkpoints/vgg19-d01eb7cb.pth
Traceback (most recent call last):
  File "models/download_models.py", line 10, in <module>
    sd = load_url("https://s3-us-west-2.amazonaws.com/jcjohns-models/vgg19-d01eb7cb.pth")
  File ".../env/lib/python3.7/site-packages/torch/hub.py", line 492, in load_state_dict_from_url
    download_url_to_file(url, cached_file, hash_prefix, progress=progress)
  File ".../env/lib/python3.7/site-packages/torch/hub.py", line 391, in download_url_to_file
    u = urlopen(url)
  File ".../env/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File ".../env/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File ".../env/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File ".../env/lib/python3.7/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File ".../env/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File ".../env/lib/python3.7/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

Or using curl:

$ curl https://s3-us-west-2.amazonaws.com/jcjohns-models/vgg19-d01eb7cb.pth

<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>6A3FF8D1FF012D4B</RequestId><HostId>bVI6/cnmD0zRB91VGnfWNpATBNLysl/HTErEmZNPqnudNSPrRUC/dKbZ6KANayZ4P8oYYF1vCbo=</HostId></Error>%  

Cannot use conv layers in VGG-19

Hello, I'm using something like:

python neural_style.py -style_image myStyle.png -content_image myImage.jpg -output_image profile.png -gpu 0 -backend cudnn -num_iterations 5000 -image_size 1000 -style_weight 600 -style_scale 1.2 -style_layers conv1_1

or any convX_Y layer for style or content and I get the following error:


Running optimization with L-BFGS
Traceback (most recent call last):
  File "neural_style.py", line 455, in <module>
    main()
  File "neural_style.py", line 257, in main
    optimizer.step(feval)
  File "C:\Users\[username]\neural-style-pt-master\lib\site-packages\torch\optim\lbfgs.py", line 307, in step
    orig_loss = closure()
  File "neural_style.py", line 248, in feval
    loss.backward()
  File "C:\Users\[username]\neural-style-pt-master\lib\site-packages\torch\tensor.py", line 150, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "C:\Users\[username]\neural-style-pt-master\lib\site-packages\torch\autograd\__init__.py", line 99, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [64, 791000]], which is output 0 of ViewBackward, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).


When I use the layers in your example code (-content_layers relu0,relu3,relu7,relu12 -style_layers relu0,relu3,relu7,relu12), it just outputs a black image, but at least no errors are thrown.

When I try the default which uses (-style_layers relu1_1,relu2_1,relu3_1,relu4_1,relu5_1 -content_layers relu4_2) it works fine.

Calculator for maximum image size

Is there the potential to make a tool that calculates the maximum image size achievable given a certain amount of GPU memory? For example, it's important for VFX people to render at 1920x1080, but I have no clue how much memory I would need. I know with other implementations that the avg pooling type always requires the same amount of memory and would not fail once it starts, while max pooling requires about ~25% more and would fail on some combinations of images and styles at the point it overflows.

Advice: Ability to create video?

Hi, thanks for providing this amazing style transfer tool. In the past I was using cysmith's neural-style-tf implementation, which stopped working for the new Titan RTX card I put in. I see similarities in what you have done and have been testing your version with much joy (I'm planning on doing a blog post about some settings I found useful).

Anyhow, I was wondering if we could use this for video somehow, because some of the video style transfer projects out there are 4-5 years old and use optical flow and DeepMatching, which is mainly CPU intensive, and the GPU ones don't seem to work with the system.

Any thoughts on anyone using your code to create videos? Is the seed value what I would need to render out various frames, along with maybe NVIDIA's new optical flow system?

Anyhow, thank you again!


Styling Transparent Backgrounds

Is there a way to style an image that has a transparent background? I tried feeding it a .png and using a .png output, but the transparent part of the content got styled.
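One possible workaround, sketched under the assumption that the transparent region simply needs to be masked back out afterwards (this is not a feature of the repository, and the file names are hypothetical): stylize the RGB content as usual, then paste the original alpha channel onto the output.

from PIL import Image

content = Image.open("content.png").convert("RGBA")  # hypothetical input with transparency
styled = Image.open("out.png").convert("RGBA")       # output from neural_style.py
styled = styled.resize(content.size)                 # match the content resolution
styled.putalpha(content.split()[-1])                 # reuse the content image's alpha channel
styled.save("out_with_alpha.png")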
