cartoon-gan's Issues

load video

Hi, nice work

predict.py -i ./input -o ./output_folder
Processing files:   0%|                                                                                                                                   | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "predict.py", line 113, in <module>
    predict_file(file_path, output_file_path)
  File "predict.py", line 55, in predict_file
    if mimetypes.guess_type(input_path)[0].startswith("image"):
AttributeError: 'NoneType' object has no attribute 'startswith'

I tried predicting videos (.mpeg files), but I get AttributeError: 'NoneType' object has no attribute 'startswith'.
Can you help, please?
Also, what do you mean by "predicting videos FFmpeg"?
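
The traceback above comes from calling `.startswith` on the first element of `mimetypes.guess_type(...)`, which is `None` whenever the standard library cannot map the file extension to a MIME type. A minimal sketch of a guard, assuming a hypothetical helper `media_kind` (it is not part of predict.py) and assuming that unrecognized files should simply be skipped:

```python
import mimetypes
from typing import Optional


def media_kind(path: str) -> Optional[str]:
    """Return "image", "video", or None when the type cannot be determined.

    Hypothetical helper for illustration; predict.py does not define it.
    """
    mime, _encoding = mimetypes.guess_type(path)
    if mime is None:
        # Unknown extension: guess_type returns (None, None).
        return None
    top_level = mime.split("/", 1)[0]
    return top_level if top_level in ("image", "video") else None


for name in ["photo.jpg", "notes.unknownext"]:
    print(name, "->", media_kind(name))
```

A caller could then skip or report files for which the helper returns `None` instead of crashing on them.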

Not able to replicate results

Hello, I want to use your work for further research, so I am using your repository to replicate your results on my dataset. During training, however, I do not get cartoon-like images; the output looks nearly the same as the real image. The problem is that the discriminator is not training properly. It always gives outputs near 0 (real-like patch prediction) for cartoon images (around 0.3~0.5 mean) and also for real images (~0.01), so the generator cannot learn to yield cartoon images.

I followed the procedure you described, which is also the one in the original CartoonGAN paper. I pretrained the generator to reproduce real images (6k images from the Flickr30k dataset) for 10 epochs. Then I pretrained the discriminator as a normal classifier (6k images from Flickr, 4.6k anime images from 3 Hayao movies: PoppingHill, Princess Mononoke, Spirited Away, and the 4.6k corresponding smoothed images) for 50 epochs, still with the same problem I described earlier. After this I trained the combination for 50 epochs on the same data.

Can you suggest what the problem is? I have not changed your code except for writing new code to pretrain the generator and the discriminator. Maybe the problem lies in the dataset, or am I doing something else wrong? Please help me out.

It would be great if you could share your dataset with us. Thanks.

Target size must be the same as input size

Hi, amazing work here.
I am trying to run the training loop to optimize a model to apply it to another artist.
I have created the smooth dataset for the artist's images (with every image containing the original and an edge smoothed version side by side), and got 30k images from Flickr.

When I run the training loop I get the following error:

ValueError: Target size (torch.Size([9, 1, 56, 56])) must be the same as input size (torch.Size([9, 1, 64, 72]))

Is there any preprocessing of the Flickr images that I missed?

Here is the complete log:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In [4], line 49
     46 generated_pred = netD(generated_data)    #.view(-1)
     48 # Calculate discriminator loss on all batches.
---> 49 errD = adv_loss(cartoon_pred, generated_pred, edge_pred)
     51 # Calculate gradients for D in backward pass
     52 errD.backward()

File e:\sylvain-chomet-GAN\env\lib\site-packages\torch\nn\modules\module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []

File e:\sylvain-chomet-GAN\utils\loss.py:16, in AdversialLoss.forward(self, cartoon, generated_f, edge_f)
     14 def forward(self, cartoon, generated_f, edge_f):
     15     #print(cartoon.shape, self.cartoon_labels.shape)
---> 16     D_cartoon_loss = self.base_loss(cartoon, self.cartoon_labels)
     17     D_generated_fake_loss = self.base_loss(generated_f, self.fake_labels)
     18     D_edge_fake_loss = self.base_loss(edge_f, self.fake_labels)
...
   3147 if not (target.size() == input.size()):
-> 3148     raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
   3150 return torch.binary_cross_entropy_with_logits(input, target, weight, pos_weight, reduction_enum)

ValueError: Target size (torch.Size([9, 1, 56, 56])) must be the same as input size (torch.Size([9, 1, 64, 72]))

Thank you for this amazing work.
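
The two shapes in the error are consistent with a patch discriminator that halves the spatial resolution twice: 56x56 labels would match 224x224 inputs, while a 64x72 prediction would match 256x288 images. That suggests the batch was not resized or cropped to the size the labels were built for. A small sketch of that arithmetic, assuming two stride-2, kernel-3, padding-1 convolutions as in the CartoonGAN paper's discriminator (the repository's layers may differ) and hypothetical helper names `conv_out` and `patch_map_size`:

```python
def conv_out(n: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Output length of one convolution along a single spatial dimension."""
    return (n + 2 * padding - kernel) // stride + 1


def patch_map_size(height: int, width: int, n_downsamples: int = 2):
    """Spatial size of the discriminator output under the assumptions above."""
    for _ in range(n_downsamples):
        height, width = conv_out(height), conv_out(width)
    return height, width


print(patch_map_size(224, 224))  # size the labels appear to assume
print(patch_map_size(256, 288))  # size matching the failing prediction
```

If that assumption holds, either resizing and cropping every image to one fixed square size, or creating the labels from the shape of the prediction itself, would keep the two tensors aligned.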

Memory and padding error

Hello. Thanks for the other answer. I made that change and now I have a problem with memory. Could you help me further?

RuntimeError: CUDA out of memory. Tried to allocate 134.00 MiB (GPU 0; 6.00 GiB total capacity; 5.15 GiB already allocated; 0 bytes free; 5.34 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I saw that someone else had the same problem and you suggested lowering the batch size. I did that too (I changed it to 16) and got:
RuntimeError: 3D or 4D (batch mode) tensor expected for input, but got: [ torch.cuda.FloatTensor{16,3,256,0} ]

I lowered the image size but it still goes out of memory.

And when I do both it gives:
RuntimeError: Padding size should be less than the corresponding input dimension, but got: padding (1, 1) at dimension 3 of input [16, 64, 128, 1]

Upload weights for Discriminator

Hey, can you upload the "trained_netD.pth" file? I am training on my own dataset, but I cannot continue without that file. It would be really helpful if you could share it with me. Thanks.

Originally posted by @Yash619 in #4 (comment)

The paging file is too small for this operation to complete

Hello, I tried to train from the checkpoint left in the repository but I got the "The paging file is too small for this operation to complete" Error.
I tried reducing the batch size or image size but it made no difference.
Command run: python train.py
I changed the following lines to point to my own dataset:
real_dataloader = get_dataloader("E:/Proiecte_Dizertatie/Dataset_ghibli/trainA/", size = image_size, bs = batch_size)
cartoon_dataloader = get_dataloader("E:/Proiecte_Dizertatie/Dataset_ghibli/trainB/", size = image_size, bs = batch_size, trfs=get_pair_transforms(image_size))

I also downloaded the pth files with the pretrained models.

OS: Windows 11
PyTorch version 1.10
CUDA toolkit version CUDA 10.1
NVIDIA driver version
GPU RTX 3060

Logs:
Starting Training Loop...
training epoch 0
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "E:\Anaconda\envs\CartoonGAN\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "E:\Anaconda\envs\CartoonGAN\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "E:\Anaconda\envs\CartoonGAN\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "E:\Proiecte_Dizertatie\cartoon-gan\train.py", line 1, in <module>
    import torch
  File "E:\Anaconda\envs\CartoonGAN\lib\site-packages\torch\__init__.py", line 124, in <module>
    raise err
OSError: [WinError 1455] The paging file is too small for this operation to complete. Error loading "E:\Anaconda\envs\CartoonGAN\lib\site-packages\torch\lib\caffe2_detectron_ops_gpu.dll" or one of its dependencies.
Traceback (most recent call last):
  File "train.py", line 158, in <module>
    train()
  File "train.py", line 71, in train
    for i, (cartoon_edge_data, real_data) in enumerate(zip(cartoon_dataloader, real_dataloader)):
  File "E:\Anaconda\envs\CartoonGAN\lib\site-packages\torch\utils\data\dataloader.py", line 359, in __iter__
    return self._get_iterator()
  File "E:\Anaconda\envs\CartoonGAN\lib\site-packages\torch\utils\data\dataloader.py", line 305, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "E:\Anaconda\envs\CartoonGAN\lib\site-packages\torch\utils\data\dataloader.py", line 918, in __init__
    w.start()
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "E:\Anaconda\envs\CartoonGAN\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
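
The first traceback in the log is raised inside a DataLoader worker: on Windows, worker processes are started with spawn, so each one re-imports train.py and loads the torch DLLs again, which is where the paging file runs out. One common workaround is to load data in the main process on Windows. A minimal sketch, assuming a hypothetical helper `pick_num_workers` whose result would be passed as the `num_workers` argument of `torch.utils.data.DataLoader` (whether the repository's `get_dataloader` exposes that argument is not shown in this thread):

```python
import sys


def pick_num_workers(requested: int, platform: str = sys.platform) -> int:
    """Return 0 on Windows so no worker processes are spawned.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    if platform.startswith("win"):
        return 0
    return max(0, requested)


print(pick_num_workers(4, platform="win32"))
print(pick_num_workers(4, platform="linux"))
```

With `num_workers=0` data loading is slower, but no extra processes import torch. Increasing the Windows paging file size is the other commonly reported workaround.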

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

I get this error when running the Training Loop.

I loaded the state like so:
netG.load_state_dict(torch.load("./checkpoints/trained_netG_original.pth"))

And the error happens here:
errG = BCE_loss(generated_pred, cartoon_labels) + content_loss(generated_data, real_data)

Here is the full log

RuntimeError                              Traceback (most recent call last)
Cell In [7], line 79
     77 print(type(generated_pred))
     78 print(type(cartoon_labels))
---> 79 errG = BCE_loss(generated_pred, cartoon_labels.to(device)) + content_loss(generated_data, real_data)
     81 # Calculate gradients for G
     82 errG.backward()

File e:\sylvain-chomet-GAN\env\lib\site-packages\torch\nn\modules\module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []

File e:\sylvain-chomet-GAN\utils\loss.py:40, in ContentLoss.forward(self, x1, x2)
     39 def forward(self, x1, x2):
---> 40     x1 = self.perception(x1)
     41     x2 = self.perception(x2)
     43     return self.omega * self.base_loss(x1, x2)

File e:\sylvain-chomet-GAN\env\lib\site-packages\torch\nn\modules\module.py:1130, in Module._call_impl(self, *input, **kwargs)
...
    452                     _pair(0), self.dilation, self.groups)
--> 453 return F.conv2d(input, weight, bias, self.stride,
    454                 self.padding, self.dilation, self.groups)
