
cartoon-gan's Introduction

Cartoon-GAN

Paper: https://arxiv.org/abs/2005.07702

Description

This project tackles the problem of transferring the style of cartoon images to real-life photographs by implementing previous work done by CartoonGAN. We trained a Generative Adversarial Network (GAN) on over 60,000 images from works by Hayao Miyazaki at Studio Ghibli.

To those asking for the dataset: I'm sorry, but the material is copyright protected, so I cannot share it.

Dependencies

  1. Install Anaconda from https://www.anaconda.com/

  2. Install PyTorch from https://pytorch.org/get-started/locally/

  3. Install dependencies:

    python -m pip install tqdm pillow numpy matplotlib opencv-python
    
  4. For predicting videos you will also need ffmpeg
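
Before predicting videos, it can help to confirm that ffmpeg is reachable from the environment in which predict.py runs. The following is a minimal sketch using only the Python standard library. The helper name `ffmpeg_available` is hypothetical and not part of this repository, and the sketch assumes that the video code looks up `ffmpeg` on the PATH.

```python
import shutil


def ffmpeg_available(executable: str = "ffmpeg") -> bool:
    """Return True if the given executable can be found on the PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found on PATH")
    else:
        print("ffmpeg not found; install it and add it to PATH before predicting videos")
```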

Weights

Weights for the presented models can be found here

Training

All training code can be found in experiment.ipynb

Predict

Predict by running predict.py.

Example:

python predict.py -i C:/folder/input_image.png -o ./output_folder/output_image.png

Predictions can be made on images, videos or a folder of images/videos.

Demonstration

Comparison of 20 test images (images not reproduced here). For each image, the table shows four columns: Original, CartoonGAN, GANILLA, and Our implementation.

Citation

@misc{andersson2020generative,
      title={Generative Adversarial Networks for photo to Hayao Miyazaki style cartoons}, 
      author={Filip Andersson and Simon Arvidsson},
      year={2020},
      eprint={2005.07702},
      archivePrefix={arXiv},
      primaryClass={cs.GR}
}

cartoon-gan's People

Contributors

filipandersson245, umonkey1975, zimonitrome


cartoon-gan's Issues

Upload weights for Discriminator

Hey, can you upload the "trained_netD.pth" file? I am training on my own dataset, but I cannot continue without that file. It would be really helpful if you could share it with me. Thanks.

Originally posted by @Yash619 in #4 (comment)

The paging file is too small for this operation to complete

Hello, I tried to train from the checkpoint left in the repository, but I got the error "The paging file is too small for this operation to complete".
I tried reducing the batch size and the image size, but it made no difference.
Command run: python train.py
I changed the following lines to point to my own dataset:
real_dataloader = get_dataloader("E:/Proiecte_Dizertatie/Dataset_ghibli/trainA/", size = image_size, bs = batch_size)
cartoon_dataloader = get_dataloader("E:/Proiecte_Dizertatie/Dataset_ghibli/trainB/", size = image_size, bs = batch_size, trfs=get_pair_transforms(image_size))

I also downloaded the pth files with the pretrained models.

OS: Windows 11
PyTorch version 1.10
CUDA toolkit version CUDA 10.1
NVIDIA driver version
GPU RTX 3060

Logs:
Starting Training Loop...
training epoch 0
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "E:\Anaconda\envs\CartoonGAN\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "E:\Anaconda\envs\CartoonGAN\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "E:\Anaconda\envs\CartoonGAN\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "E:\Proiecte_Dizertatie\cartoon-gan\train.py", line 1, in <module>
    import torch
  File "E:\Anaconda\envs\CartoonGAN\lib\site-packages\torch\__init__.py", line 124, in <module>
    raise err
OSError: [WinError 1455] The paging file is too small for this operation to complete. Error loading "E:\Anaconda\envs\CartoonGAN\lib\site-packages\torch\lib\caffe2_detectron_ops_gpu.dll" or one of its dependencies.
Traceback (most recent call last):
  File "train.py", line 158, in <module>
    train()
  File "train.py", line 71, in train
    for i, (cartoon_edge_data, real_data) in enumerate(zip(cartoon_dataloader, real_dataloader)):
  File "E:\Anaconda\envs\CartoonGAN\lib\site-packages\torch\utils\data\dataloader.py", line 359, in __iter__
    return self._get_iterator()
  File "E:\Anaconda\envs\CartoonGAN\lib\site-packages\torch\utils\data\dataloader.py", line 305, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "E:\Anaconda\envs\CartoonGAN\lib\site-packages\torch\utils\data\dataloader.py", line 918, in __init__
    w.start()
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
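
A note on the traceback above: both errors are raised while the DataLoader starts its worker processes. On Windows each worker is spawned as a new process that re-imports torch, and the first traceback fails exactly at `import torch` in such a child. Workarounds often tried for WinError 1455 are to use fewer workers (or none) or to enlarge the Windows paging file. The sketch below only picks a worker count. `choose_num_workers` is a hypothetical helper, and whether `get_dataloader` in this repository accepts a `num_workers` argument is an assumption that would need to be checked against the code.

```python
import sys


def choose_num_workers(requested: int, platform: str = sys.platform) -> int:
    """Pick a DataLoader worker count (hypothetical helper).

    On Windows, worker processes are spawned and each one re-imports torch,
    so this sketch falls back to 0 workers (data is loaded in the main process).
    """
    if requested < 0:
        raise ValueError("requested must be non-negative")
    return 0 if platform.startswith("win") else requested


if __name__ == "__main__":
    # The resulting value would be passed on to torch.utils.data.DataLoader
    # as num_workers=..., wherever the repository constructs its loaders.
    print(choose_num_workers(4))
```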

Memory and padding error

Hello. Thanks for the other answer. I made that change and now I have a memory problem. Could you help me further?

RuntimeError: CUDA out of memory. Tried to allocate 134.00 MiB (GPU 0; 6.00 GiB total capacity; 5.15 GiB already allocated; 0 bytes free; 5.34 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I saw that someone else had the same problem and you suggested lowering the batch size. I did that as well (I changed it to 16) and got:
RuntimeError: 3D or 4D (batch mode) tensor expected for input, but got: [ torch.cuda.FloatTensor{16,3,256,0} ]

I lowered the image size, but it still runs out of memory.

And when I do both, it gives:
RuntimeError: Padding size should be less than the corresponding input dimension, but got: padding (1, 1) at dimension 3 of input [16, 64, 128, 1]

Target size must be the same as input size

Hi, amazing work here.
I am trying to run the training loop to train a model in the style of another artist.
I have created the smoothed dataset for the artist's images (with every image containing the original and an edge-smoothed version side by side), and collected 30k images from Flickr.

When I run the training loop I get the following error:

ValueError: Target size (torch.Size([9, 1, 56, 56])) must be the same as input size (torch.Size([9, 1, 64, 72]))

Is there any processing of the Flickr images that I missed?

Here is the complete log:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In [4], line 49
     46 generated_pred = netD(generated_data)    #.view(-1)
     48 # Calculate discriminator loss on all batches.
---> 49 errD = adv_loss(cartoon_pred, generated_pred, edge_pred)
     51 # Calculate gradients for D in backward pass
     52 errD.backward()

File e:\sylvain-chomet-GAN\env\lib\site-packages\torch\nn\modules\module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []

File e:\sylvain-chomet-GAN\utils\loss.py:16, in AdversialLoss.forward(self, cartoon, generated_f, edge_f)
     14 def forward(self, cartoon, generated_f, edge_f):
     15     #print(cartoon.shape, self.cartoon_labels.shape)
---> 16     D_cartoon_loss = self.base_loss(cartoon, self.cartoon_labels)
     17     D_generated_fake_loss = self.base_loss(generated_f, self.fake_labels)
     18     D_edge_fake_loss = self.base_loss(edge_f, self.fake_labels)
...
   3147 if not (target.size() == input.size()):
-> 3148     raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
   3150 return torch.binary_cross_entropy_with_logits(input, target, weight, pos_weight, reduction_enum)

ValueError: Target size (torch.Size([9, 1, 56, 56])) must be the same as input size (torch.Size([9, 1, 64, 72]))

Thank you for this amazing work.
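
The two sizes in the error can be reproduced with plain convolution arithmetic. The sketch below assumes a discriminator laid out as in the CartoonGAN paper (seven 3x3 convolutions with padding 1 and strides 1, 2, 1, 2, 1, 1, 1). The layer list and the function names are assumptions for illustration and were not read from this repository. Under that assumption, 56x56 labels correspond to 224x224 inputs, while a 64x72 prediction corresponds to 256x288 inputs, which would suggest that the label tensors were created for a different image size than the batches being fed to netD.

```python
def conv_out(n: int, kernel: int = 3, stride: int = 1, padding: int = 1) -> int:
    """Output length of one convolution along a single spatial dimension."""
    return (n + 2 * padding - kernel) // stride + 1


def discriminator_grid(height: int, width: int,
                       strides=(1, 2, 1, 2, 1, 1, 1)) -> tuple:
    """Spatial size of the patch prediction for a given input size.

    The default strides are an assumption based on the CartoonGAN paper.
    """
    for stride in strides:
        height = conv_out(height, stride=stride)
        width = conv_out(width, stride=stride)
    return height, width


if __name__ == "__main__":
    for size in [(224, 224), (256, 288)]:
        print(size, "->", discriminator_grid(*size))
```

If that is the cause, resizing or cropping every input to the square size the labels were built for, or building the labels from the shape of the prediction itself, would make the two sizes agree.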

Not able to replicate results

Hello, I want to use your work for further research, so I am using your repository to replicate your results on my own dataset. During training, however, I do not get cartoon-like images; the output looks nearly identical to the real image. The problem seems to be that the discriminator is not training properly. It always gives outputs near 0 (a "real-like" patch prediction) for cartoon images (around 0.3~0.5 mean) and also for real images (~0.01), so the generator cannot learn to yield cartoon images.

I followed the procedure you described, which is also the one in the original CartoonGAN paper. I pretrained the generator to reproduce real images (6k images from the Flickr30k dataset) for 10 epochs. Then I pretrained the discriminator as a normal classifier (6k images from Flickr, 4.6k anime images from 3 movies by Hayao: PoppingHill, Princess Mononoke, and Spirited Away, plus the 4.6k corresponding smoothed images) for 50 epochs, and it already showed the problem described above. After this I trained the combination for 50 epochs on the same data.

Can you suggest what the problem might be? I have not changed your code, apart from writing new code for pretraining the generator and the discriminator. Maybe the problem lies in the dataset, or am I doing something else wrong? Please help me out.

It would be great if you could share your dataset with us. Thanks.

load video

Hi, nice work

predict.py -i ./input -o ./output_folder
Processing files:   0%|                                                                                                                                   | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "predict.py", line 113, in <module>
    predict_file(file_path, output_file_path)
  File "predict.py", line 55, in predict_file
    if mimetypes.guess_type(input_path)[0].startswith("image"):
AttributeError: 'NoneType' object has no attribute 'startswith'

I tried predicting videos (.mpeg files), but I get AttributeError: 'NoneType' object has no attribute 'startswith'.
Can you help, please?
And what do you mean by needing ffmpeg for predicting videos?
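
The AttributeError comes from `mimetypes.guess_type`, which returns `(None, None)` when it does not recognize a file's extension, so calling `.startswith` on the first element fails. Below is a minimal sketch of a guarded check, using only the standard library. `media_kind` is a hypothetical helper, not a function from predict.py.

```python
import mimetypes
from typing import Optional


def media_kind(path: str) -> Optional[str]:
    """Classify a path as "image" or "video", or None if it is neither.

    guess_type returns (None, None) for unknown extensions, so the MIME
    type is checked for None before any string method is called on it.
    """
    mime, _encoding = mimetypes.guess_type(path)
    if mime is None:
        return None
    if mime.startswith("image"):
        return "image"
    if mime.startswith("video"):
        return "video"
    return None


if __name__ == "__main__":
    for name in ["photo.png", "clip.mp4", "notes.unknownext"]:
        print(name, "->", media_kind(name))
```

With a check like this, files of an unrecognized type in the input folder could be skipped with a message instead of stopping the whole run.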

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

I get this error when running the Training Loop.

I loaded the state like so:
netG.load_state_dict(torch.load("./checkpoints/trained_netG_original.pth"))

And the error happens here:
errG = BCE_loss(generated_pred, cartoon_labels) + content_loss(generated_data, real_data)

Here is the full log

RuntimeError                              Traceback (most recent call last)
Cell In [7], line 79
     77 print(type(generated_pred))
     78 print(type(cartoon_labels))
---> 79 errG = BCE_loss(generated_pred, cartoon_labels.to(device)) + content_loss(generated_data, real_data)
     81 # Calculate gradients for G
     82 errG.backward()

File e:\sylvain-chomet-GAN\env\lib\site-packages\torch\nn\modules\module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []

File e:\sylvain-chomet-GAN\utils\loss.py:40, in ContentLoss.forward(self, x1, x2)
     39 def forward(self, x1, x2):
---> 40     x1 = self.perception(x1)
     41     x2 = self.perception(x2)
     43     return self.omega * self.base_loss(x1, x2)

File e:\sylvain-chomet-GAN\env\lib\site-packages\torch\nn\modules\module.py:1130, in Module._call_impl(self, *input, **kwargs)
...
    452                     _pair(0), self.dilation, self.groups)
--> 453 return F.conv2d(input, weight, bias, self.stride,
    454                 self.padding, self.dilation, self.groups)
