
mocogan's People

Contributors

mingyuliutw, sergeytulyakov, stulyakovsc, vladyushchenko

mocogan's Issues

Code for Image-to-video Translation

Dear author,

You have mentioned that the MoCoGAN model can be adapted to the image-to-video translation task. Would you mind sharing your implementation?

Thank you so much

Inception Score on UCF101

Hi,

I am trying to reproduce the Inception Score results on the UCF101 dataset.
Could you please point out which model and parameters (number of generated videos, splits) were used for the stated result?
Did you use the implementation from the TGAN paper or another repository?

Thanks in advance!

EOFError: Ran out of input

I am trying to use it with Python 3.

However, the following error is reported:

python train.py --image_batch 32 --video_batch 32 --use_infogan --use_noise --noise_sigma 0.1 --image_discriminator PatchImageDiscriminator --video_discriminator CategoricalVideoDiscriminator --print_every 100 --every_nth 2 --dim_z_content 50 --dim_z_motion 10 --dim_z_category 4 /slow/junyan/VideoSynthesis/mocogan/data/actions logs/actions
{'--batches': '100000',
'--dim_z_category': '4',
'--dim_z_content': '50',
'--dim_z_motion': '10',
'--every_nth': '2',
'--image_batch': '32',
'--image_dataset': '',
'--image_discriminator': 'PatchImageDiscriminator',
'--image_size': '64',
'--n_channels': '3',
'--noise_sigma': '0.1',
'--print_every': '100',
'--use_categories': False,
'--use_infogan': True,
'--use_noise': True,
'--video_batch': '32',
'--video_discriminator': 'CategoricalVideoDiscriminator',
'--video_length': '16',
'<dataset>': '/slow/junyan/VideoSynthesis/mocogan/data/actions',
'<log_folder>': 'logs/actions'}
/root/anaconda3/lib/python3.6/site-packages/torchvision/transforms/transforms.py:188: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
"please use transforms.Resize instead.")
/slow/junyan/VideoSynthesis/mocogan/data/actions/local.db
Traceback (most recent call last):
File "train.py", line 104, in
dataset = data.VideoFolderDataset(args[''], cache=os.path.join(args[''], 'local.db'))
File "/slow/junyan/VideoSynthesis/mocogan/src/data.py", line 24, in init
print(pickle.load(f))
EOFError: Ran out of input

Here is the code

class VideoFolderDataset(torch.utils.data.Dataset):
    def __init__(self, folder, cache, min_len=32):
        dataset = ImageFolder(folder)
        self.total_frames = 0
        self.lengths = []
        self.images = []
        print(cache)
        if cache is not None and os.path.exists(cache):
            with open(cache, 'rb') as f:
                print(pickle.load(f))
        else:
            for idx, (im, categ) in enumerate(
                    tqdm.tqdm(dataset, desc="Counting total number of frames")):
                img_path, _ = dataset.imgs[idx]
                shorter, longer = min(im.width, im.height), max(im.width, im.height)
                length = longer // shorter
                if length >= min_len:
                    self.images.append((img_path, categ))
                    self.lengths.append(length)

            if cache is not None:
                with open(cache, 'wb') as f:
                    pickle.dump((self.images, self.lengths), f)

        self.cumsum = np.cumsum([0] + self.lengths)
        print("Total number of frames {}".format(np.sum(self.lengths)))

Any updates on the code?

I was wondering if this code will be released before the CVPR 2018 deadline.
We would like to do some comparisons.

Future frame prediction

Hi, according to the paper, you also ran experiments with a variant of MoCoGAN for future frame prediction, and I am interested in how that variant is constructed. Are the details available, or could they be released? Thank you!

ValueError: Expected target size (16, 16, 256, 256), got torch.Size([16])

Hi, I'm training my model with batch_size, image_batch, and video_batch all set to 16, but I ran into the problem below. It occurs at:
File "/home/ydj/MoCoGAN/trainers.py", line 268, in train
self.video_batch_size, use_categories=self.use_categories)
File "/home/ydj/MoCoGAN/trainers.py", line 180, in train_discriminator
l_discriminator += self.category_criterion(real_categorical.squeeze(), categories_gt.long())
Does anyone know how to solve this?

Dataset Formatter

Is there any code that converts an mp4 (or another video format) into the 2-dimensional JPGs used as training input that could be included or referenced?
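No converter appears to be bundled with the repo, but the VideoFolderDataset code quoted in the EOFError issue above implies the expected format: each video is a single image whose frames are concatenated along the longer dimension (hence length = longer // shorter). A rough, unofficial sketch of such a converter, assuming OpenCV and Pillow are installed and that frames are stacked vertically:

import cv2
from PIL import Image

def video_to_strip(video_path, out_path, size=64):
    # Read every frame, resize it to size x size, and stack the frames
    # vertically into one tall JPEG strip.
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        frames.append(Image.fromarray(frame).resize((size, size)))
    cap.release()
    strip = Image.new('RGB', (size, size * len(frames)))
    for i, f in enumerate(frames):
        strip.paste(f, (0, i * size))
    strip.save(out_path)

# Example: video_to_strip('clip.mp4', 'data/actions/category_a/clip.jpg')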

Video generation error

Hi, I am getting this error when executing the generate_videos.py script:

root@508aee39a995:/mocogan/src# python generate_videos.py ../logs/dances/generator_100000.pytorch ../output 
Traceback (most recent call last):
  File "generate_videos.py", line 61, in <module>
    v, _ = generator.sample_videos(1, int(args['--number_of_frames']))
  File "/mocogan/src/models.py", line 268, in sample_videos
    z, z_category_labels = self.sample_z_video(num_samples, video_len)
  File "/mocogan/src/models.py", line 259, in sample_z_video
    z_motion = self.sample_z_m(num_samples, video_len)
  File "/mocogan/src/models.py", line 224, in sample_z_m
    h_t.append(self.recurrent(e_t, h_t[-1]))
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/rnn.py", line 682, in forward
    self.bias_ih, self.bias_hh,
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/_functions/rnn.py", line 49, in GRUCell
    gi = F.linear(input, w_ih)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 555, in linear
    output = input.matmul(weight.t())
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 560, in matmul
    return torch.matmul(self, other)
  File "/usr/local/lib/python2.7/dist-packages/torch/functional.py", line 173, in matmul
    return torch.mm(tensor1, tensor2)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 579, in mm
    return Addmm.apply(output, self, matrix, 0, 1, True)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/_functions/blas.py", line 26, in forward
    matrix1, matrix2, out=output)
TypeError: torch.addmm received an invalid combination of arguments - got (int, torch.cuda.FloatTensor, int, torch.cuda.FloatTensor, torch.FloatTensor, out=torch.cuda.FloatTensor), but expected one of:
 * (torch.cuda.FloatTensor source, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
 * (torch.cuda.FloatTensor source, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
 * (float beta, torch.cuda.FloatTensor source, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
 * (torch.cuda.FloatTensor source, float alpha, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
 * (float beta, torch.cuda.FloatTensor source, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
 * (torch.cuda.FloatTensor source, float alpha, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
 * (float beta, torch.cuda.FloatTensor source, float alpha, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
      didn't match because some of the arguments have invalid types: (int, torch.cuda.FloatTensor, int, torch.cuda.FloatTensor, torch.FloatTensor, out=torch.cuda.FloatTensor)
 * (float beta, torch.cuda.FloatTensor source, float alpha, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
      didn't match because some of the arguments have invalid types: (int, torch.cuda.FloatTensor, int, torch.cuda.FloatTensor, torch.FloatTensor, out=torch.cuda.FloatTensor)

Any ideas?
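A hedged guess at the cause: the addmm call mixes torch.cuda.FloatTensor and torch.FloatTensor, i.e. the per-step noise built inside sample_z_m lives on the CPU while the GRU weights are on the GPU. A sketch of the usual remedy, using the old Variable-style API the traceback comes from (names are illustrative, not the repo's):

import torch
from torch.autograd import Variable

def make_motion_noise(num_samples, dim_z_motion, use_cuda=True):
    # Build the noise with the same tensor type as the generator's weights,
    # so F.linear inside the GRUCell sees matching device types.
    T = torch.cuda.FloatTensor if use_cuda else torch.FloatTensor
    return Variable(T(num_samples, dim_z_motion).normal_())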

Invariable Image Size

I have been working with the model, and I am trying to generate images of size 128x128. I changed the --image_size option to 128. For reference, here is the full command.

$ python3 train.py  \
      --image_batch 32 \
      --video_batch 32 \
      --use_noise \
      --noise_sigma 0.1 \
      --image_discriminator PatchImageDiscriminator \
      --video_discriminator PatchVideoDiscriminator \
      --print_every 100 \
      --every_nth 2 \
      --dim_z_content 50 \
      --dim_z_motion 10 --image_size 128 \
      ../data/fb-128 ../logs/fb-2

The initial output is the following, which verifies that the option was acknowledged by the program.

{'--batches': '100000',
 '--dim_z_category': '6',
 '--dim_z_content': '50',
 '--dim_z_motion': '10',
 '--every_nth': '2',
 '--image_batch': '32',
 '--image_dataset': '',
 '--image_discriminator': 'PatchImageDiscriminator',
 '--image_size': '128',
 '--n_channels': '3',
 '--noise_sigma': '0.1',
 '--print_every': '100',
 '--use_categories': False,
 '--use_infogan': False,
 '--use_noise': True,
 '--video_batch': '32',
 '--video_discriminator': 'PatchVideoDiscriminator',
 '--video_length': '16',
 '<dataset>': '../data/fb-128',
 '<log_folder>': '../logs/fb-2'}

The program then runs, but doesn't produce images of size 128x128 and continues to create images of size 64x64. Additionally, saved models show no increase in size, contrary to the expected increase in response to a larger output size. I have traced the bug to the model definitions, specifically the following lines.

self.main = nn.Sequential(
            nn.ConvTranspose2d(dim_z, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf, self.n_channels, 4, 2, 1, bias=False),
            nn.Tanh()
        )

Note: this is only the generator definition. I would expect that each of the discriminators would also need a change analogous to what might help here.

I have tried several different approaches, including changing n_channels, but to no avail. I just can't seem to find where the hard-coded size of 64x64 originates. I do note that 64 happens to be the product of 8, 4, 2, and 1, the coefficients of ngf in each layer's output, but I don't see how that would set the final output size of the n_channels layer.

Although I would love to see a fix, if anybody knows where 64x64 comes from, I can do more poking, and probably find a solution myself.
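For what it's worth, the 64x64 appears to come from the spatial progression of the stack above: the first ConvTranspose2d (kernel 4, stride 1, padding 0) turns the 1x1 latent into 4x4, and each subsequent ConvTranspose2d (kernel 4, stride 2, padding 1) doubles it: 4 -> 8 -> 16 -> 32 -> 64. A sketch of a 128x128 generator head (my own modification, not the authors') adds one more doubling layer; the image and video discriminators would need a matching extra stride-2 layer:

import torch.nn as nn

def make_generator_main_128(dim_z, ngf, n_channels):
    # Same DCGAN-style stack as above, with one extra doubling layer at the end.
    return nn.Sequential(
        nn.ConvTranspose2d(dim_z, ngf * 16, 4, 1, 0, bias=False),    # 1 -> 4
        nn.BatchNorm2d(ngf * 16), nn.ReLU(True),
        nn.ConvTranspose2d(ngf * 16, ngf * 8, 4, 2, 1, bias=False),  # 4 -> 8
        nn.BatchNorm2d(ngf * 8), nn.ReLU(True),
        nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),   # 8 -> 16
        nn.BatchNorm2d(ngf * 4), nn.ReLU(True),
        nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),   # 16 -> 32
        nn.BatchNorm2d(ngf * 2), nn.ReLU(True),
        nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),       # 32 -> 64
        nn.BatchNorm2d(ngf), nn.ReLU(True),
        nn.ConvTranspose2d(ngf, n_channels, 4, 2, 1, bias=False),    # 64 -> 128
        nn.Tanh()
    )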

Unwanted repetition of the action in the generated videos

First off, thanks for the awesome paper and also for the well-maintained code!

How can I avoid repetition in the generated videos and make sure the action is not squeezed (and repeated) over the whole time span of the original videos? FYI, my videos are 256 frames long, but with different settings of video_length I see repetition or other inconsistencies unless I set video_length to 256, which ends up being very costly.
Many thanks in advance.

Issue on executing the code (MoCoGAN paper)

Hi,
I am using Python 3.6.1 :: Anaconda custom (64-bit) on Ubuntu 14.04.5 LTS.
I am trying to execute the code at https://github.com/sergeytulyakov/mocogan
following the steps at https://github.com/sergeytulyakov/mocogan/wiki/Training-MoCoGAN

I am getting the error below while executing the code. Could you please help?

Training:

  • Executed the command below from the command line:
    python train.py
    --image_batch 32
    --video_batch 32
    --use_infogan
    --use_noise
    --noise_sigma 0.1
    --image_discriminator PatchImageDiscriminator
    --video_discriminator CategoricalVideoDiscriminator
    --print_every 100
    --every_nth 2
    --dim_z_content 50
    --dim_z_motion 10
    --dim_z_category 4
    ../data/actions ../logs/actions
    ###############################################################
    shiba@shiba:~/Downloads/mocogan-master/src$ python train.py \
--image_batch 32 \
--video_batch 32 \
--use_infogan \
--use_noise \
--noise_sigma 0.1 \
--image_discriminator PatchImageDiscriminator \
--video_discriminator CategoricalVideoDiscriminator \
--print_every 100 \
--every_nth 2 \
--dim_z_content 50 \
--dim_z_motion 10 \
--dim_z_category 4 \
../data/actions ../logs/actions

/home/shiba/anaconda3/lib/python3.6/importlib/_bootstrap.py:205: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6
return f(*args, **kwds)
{'--batches': '100000',
'--dim_z_category': '4',
'--dim_z_content': '50',
'--dim_z_motion': '10',
'--every_nth': '2',
'--image_batch': '32',
'--image_dataset': '',
'--image_discriminator': 'PatchImageDiscriminator',
'--image_size': '64',
'--n_channels': '3',
'--noise_sigma': '0.1',
'--print_every': '100',
'--use_categories': False,
'--use_infogan': True,
'--use_noise': True,
'--video_batch': '32',
'--video_discriminator': 'CategoricalVideoDiscriminator',
'--video_length': '16',
'<dataset>': '../data/actions',
'<log_folder>': '../logs/actions'}
Traceback (most recent call last):
File "train.py", line 104, in
dataset = data.VideoFolderDataset(args[''], cache=os.path.join(args[''], 'local.db'))
File "/home/shiba/Downloads/mocogan-master/src/data.py", line 24, in init
self.images, self.lengths = pickle.load(f)
TypeError: a bytes-like object is required, not 'str'
*** Error in `python': double free or corruption (!prev): 0x0000000000bfda20 ***
Aborted (core dumped)
shiba@shiba:~/Downloads/mocogan-master/src$
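A hedged guess, since the traceback points at the same pickle.load line as the EOFError issue above: data.py seems to open the cache file in text mode, which Python 2 tolerates but Python 3 does not, hence the bytes-vs-str TypeError. Opening both the read and write sides in binary mode should fix it; a self-contained sketch:

import pickle

def load_cache(path):
    # Python 3 requires binary mode for pickle; opening with 'r' raises
    # "a bytes-like object is required, not 'str'".
    with open(path, 'rb') as f:
        return pickle.load(f)

def save_cache(path, images, lengths):
    with open(path, 'wb') as f:
        pickle.dump((images, lengths), f)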


Action-conditional video generator inconsistent with the paper claim?

Hello,

I am reading through the paper and the code. In the paper, Section 3.1 claims that if video generation is conditioned on the action category, the input to the GRU should be a concatenation of the one-hot action vector and a random vector. However, in src/models.py line 261,

z = torch.cat([z_content, z_category, z_motion], dim=1)

it seems that you combine the one-hot action vector with the output of the GRU, instead of feeding the action vector to the GRU as an input.

I am very confused by this inconsistency. If my understanding is wrong, feel free to correct me.
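For reference, a minimal sketch of what Section 3.1 seems to describe, with the one-hot category vector concatenated with the per-step noise and fed into the GRU (this is my reading of the paper, not the repo's code; names and shapes are illustrative):

import torch
import torch.nn as nn

class CategoricalMotionSampler(nn.Module):
    def __init__(self, dim_z_category, dim_epsilon, dim_z_motion):
        super(CategoricalMotionSampler, self).__init__()
        self.recurrent = nn.GRUCell(dim_z_category + dim_epsilon, dim_z_motion)
        self.dim_epsilon = dim_epsilon
        self.dim_z_motion = dim_z_motion

    def forward(self, z_category_one_hot, video_len):
        num_samples = z_category_one_hot.size(0)
        h_t = torch.zeros(num_samples, self.dim_z_motion)
        z_motion = []
        for _ in range(video_len):
            e_t = torch.randn(num_samples, self.dim_epsilon)
            # The action code is part of the GRU input at every time step.
            h_t = self.recurrent(torch.cat([z_category_one_hot, e_t], dim=1), h_t)
            z_motion.append(h_t)
        return torch.stack(z_motion, dim=1)  # (num_samples, video_len, dim_z_motion)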

Discriminator structures not the same as in the paper?

Hi, I've read the MoCoGAN paper, and the supplementary materials say that the stride and padding of the 3D convolutions in Dis_v are "equal for all the dimensions in each layer", which is clearly different from your code. Which one did you actually use for the paper?

About evaluation code

Hi,

Thanks for your nice work!
I was wondering whether the code for the FID and AVD calculations will be released, or can be found somewhere else?

Best.

Video generation ffmpeg error

Hi,

I get the following ffmpeg error when trying to generate a video.

ffmpeg version 2.8.11-0ubuntu0.16.04.1 Copyright (c) 2000-2017 the FFmpeg developers
  built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.4) 20160609
  configuration: --prefix=/usr --extra-version=0ubuntu0.16.04.1 --build-suffix=-ffmpeg --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --cc=cc --cxx=g++ --enable-gpl --enable-shared --disable-stripping --disable-decoder=libopenjpeg --disable-decoder=libschroedinger --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmodplug --enable-libmp3lame --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librtmp --enable-libschroedinger --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxvid --enable-libzvbi --enable-openal --enable-opengl --enable-x11grab --enable-libdc1394 --enable-libiec61883 --enable-libzmq --enable-frei0r --enable-libx264 --enable-libopencv
  libavutil      54. 31.100 / 54. 31.100
  libavcodec     56. 60.100 / 56. 60.100
  libavformat    56. 40.101 / 56. 40.101
  libavdevice    56.  4.100 / 56.  4.100
  libavfilter     5. 40.101 /  5. 40.101
  libavresample   2.  1.  0 /  2.  1.  0
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  2.101 /  1.  2.101
  libpostproc    53.  3.100 / 53.  3.100
Input #0, rawvideo, from 'pipe:':
  Duration: N/A, start: 0.000000, bitrate: 786 kb/s
    Stream #0:0: Video: rawvideo (RGB[24] / 0x18424752), rgb24, 64x64, 786 kb/s, 8 tbr, 8 tbn, 8 tbc
[swscaler @ 0x751940] deprecated pixel format used, make sure you did set range correctly
[gif @ 0x740420] GIF muxer supports only a single video GIF stream.
Output #0, gif, to '../data/0.gif':
  Metadata:
    encoder         : Lavf56.40.101
    Stream #0:0: Video: mjpeg, yuvj444p(pc), 64x64, q=2-31, 200 kb/s, 8 fps, 8 tbn, 8 tbc
    Metadata:
      encoder         : Lavc56.60.100 mjpeg
Stream mapping:
  Stream #0:0 -> #0:0 (rawvideo (native) -> mjpeg (native))
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
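A possible workaround (my suggestion, not part of the repo): this ffmpeg 2.8 build selects the mjpeg encoder for the .gif output, which the GIF muxer then rejects, hence "Could not write header". Besides upgrading ffmpeg or forcing the gif encoder, the GIF can also be written directly in Python, e.g. with imageio (assumed installed); here frames is a list of HxWx3 uint8 arrays, one per generated frame:

import imageio

def save_gif(frames, path, fps=8):
    # Write the generated frames as an animated GIF without piping
    # raw video through the external ffmpeg process.
    imageio.mimsave(path, frames, fps=fps)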

Output size is too small.

Hi,
Thank you so much for your work. I have tried to run MoCoGAN with my own data, prepared in the same format as the 'actions' folder, only with a length of 8. But when I try to run it, I encounter the problems below. I am a bit confused about whether I have missed something.

Traceback (most recent call last):
File "train.py", line 133, in
trainer.train(generator, image_discriminator, video_discriminator)
File "/data2/Runze/mocogan/src/trainers.py", line 271, in train
self.video_batch_size, use_categories=self.use_categories)
File "/data2/Runze/mocogan/src/trainers.py", line 170, in train_discriminator
real_labels, real_categorical = discriminator(batch)
File "/home/runzeli/anaconda3/envs/python27/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/data2/Runze/mocogan/src/models.py", line 180, in forward
h, _ = super(CategoricalVideoDiscriminator, self).forward(input)
File "/data2/Runze/mocogan/src/models.py", line 162, in forward
h = self.main(input).squeeze()
File "/home/runzeli/anaconda3/envs/python27/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/home/runzeli/anaconda3/envs/python27/lib/python2.7/site-packages/torch/nn/modules/container.py", line 67, in forward
input = module(input)
File "/home/runzeli/anaconda3/envs/python27/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/home/runzeli/anaconda3/envs/python27/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 388, in forward
self.padding, self.dilation, self.groups)
File "/home/runzeli/anaconda3/envs/python27/lib/python2.7/site-packages/torch/nn/functional.py", line 126, in conv3d
return f(input, weight, bias)
RuntimeError: Given input size: (128, 2, 16, 16). Calculated output size: (1, -1, 8, 8). Output size is too small.

Here is the command I run:

python train.py
--image_batch 8
--video_batch 1
--use_infogan
--use_noise
--noise_sigma 0.1
--image_discriminator PatchImageDiscriminator
--video_discriminator CategoricalVideoDiscriminator
--print_every 100
--every_nth 1
--dim_z_content 50
--dim_z_motion 8
--dim_z_category 1
../data/actions ../logs/actions

Thank you. :)
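A hedged reading of the error: the video discriminator's 3D convolutions also shrink the temporal dimension, and they appear to be sized for 16-frame clips, so with 8-frame videos the time axis collapses to a non-positive size partway through the stack. A quick sanity check using the standard convolution size formula (the kernel/stride/padding along time are inferred from the traceback, not copied from models.py):

def conv_out(size, kernel=4, stride=1, padding=0):
    # out = floor((in + 2*pad - kernel) / stride) + 1
    return (size + 2 * padding - kernel) // stride + 1

for frames in (16, 8):
    t, sizes = frames, [frames]
    for _ in range(4):
        t = conv_out(t)
        sizes.append(t)
    print(frames, sizes)  # 16 stays positive throughout; 8 collapses below 1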

Problem about batch size and video length

Hello!
I have some questions. When I set image_batch, video_batch, and video_length all equal to 16, it works well. But when I set them to 48, the following occurs:

Traceback (most recent call last):
File "train.py", line 133, in
trainer.train(generator, image_discriminator, video_discriminator)
File "mocogan/src/trainers.py", line 273, in train
opt_generator)
File "mocogan/src/trainers.py", line 212, in train_generator
l_generator += self.category_criterion(fake_categorical.squeeze(), generated_categories)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 325, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/loss.py", line 601, in forward
self.ignore_index, self.reduce)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 1140, in cross_entropy
return nll_loss(log_softmax(input, 1), target, weight, size_average, ignore_index, reduce)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 1053, in nll_loss
raise ValueError('Expected 2 or 4 dimensions (got {})'.format(dim))
ValueError: Expected 2 or 4 dimensions (got 3)

Are the batch size and video length related? Should I set them to the same value?
Thanks~

results on UCF101

Hi, thanks for this great work. Would it be possible for you to provide the generated videos on UCF101?

Pre-trained Model

Hi,
Thank you for sharing this. Could you please provide a pre-trained model?

Got a segmentation fault when trying to run train.py

I followed the steps from the wiki and built the environment manually; I am not using Docker. I got a segmentation fault (core dumped) error while running
this command:
python train.py
--image_batch 32
--video_batch 32
--use_infogan
--use_noise
--noise_sigma 0.1
--image_discriminator PatchImageDiscriminator
--video_discriminator CategoricalVideoDiscriminator
--print_every 100
--every_nth 2
--dim_z_content 50
--dim_z_motion 10
--dim_z_category 4
../data/actions ../logs/actions

Could you please help me with this?

Zero length video generation

I ran the PyTorch implementation to generate videos. I don't get any error, but the generated videos are zero seconds long, i.e. no video is actually produced, yet no error is reported either. What could be the cause of this issue?

Generate Video Error

I was trying to generate some videos using the generate_videos script, and I encountered the following error.

I don't understand why this error happens at test time but not during training. We generate fake video clips during training as well, no? Why am I encountering it only at the test stage?

Can anyone tell me why this happens? Thanks!

Traceback (most recent call last): File "/home/eric/mocogan/src/generate_videos.py", line 61, in <module> v, _ = generator.sample_videos(1, int(args['--number_of_frames']))
File "/home/eric/mocogan/src/models.py", line 285, in sample_videos z, z_category_labels = self.sample_z_video(num_samples, video_len)
File "/home/eric/mocogan/src/models.py", line 273, in sample_z_video z_motion = self.sample_z_m(num_samples, video_len)
File "/home/eric/mocogan/src/models.py", line 227, in sample_z_m h_t.append(self.recurrent(e_t, h_t[-1]))
File "/home/eric/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__ result = self.forward(*input, **kwargs)
File "/home/eric/anaconda3/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 729, in forward self.bias_ih, self.bias_hh,
File "/home/eric/anaconda3/lib/python3.6/site-packages/torch/nn/_functions/rnn.py", line 50, in GRUCell gi = F.linear(input, w_ih)
File "/home/eric/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py", line 837, in linear output = input.matmul(weight.t())
File "/home/eric/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 386, in matmul return torch.matmul(self, other)
File "/home/eric/anaconda3/lib/python3.6/site-packages/torch/functional.py", line 174, in matmul return torch.mm(tensor1, tensor2)
RuntimeError: Expected object of type Variable[torch.cuda.FloatTensor] but found type Variable[torch.FloatTensor] for argument #1 'mat2'
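A hedged guess, the mirror of the addmm issue above: here the GRU weights are plain CPU FloatTensors while the sampled noise is a CUDA tensor, so mat2 (the transposed weight) has the wrong type. If the generator checkpoint was loaded onto the CPU, moving it to the GPU before sampling usually resolves it; a sketch with an illustrative checkpoint path:

import torch

generator = torch.load('../logs/actions/generator_100000.pytorch')  # illustrative path
generator = generator.cuda()   # put the weights on the same device as the noise
generator.eval()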

A Question regarding generate_videos.py

Dear author of MoCoGAN:

I am deeply impressed by your fantastic work,
and I really appreciate that you have open-sourced this project.

I have a small problem when using the generate_videos.py file.
After I trained the model and ran

"python generate_videos.py --num_videos 10 --output_format gif --number_of_frames 16 ../logs/actions/generator_21700.pytorch output"

the following error occurs:


Traceback (most recent call last):
File "generate_videos.py", line 61, in
v, _ = generator.sample_videos(1, int(args['--number_of_frames']))
File "/mocogan/src/models.py", line 268, in sample_videos
z, z_category_labels = self.sample_z_video(num_samples, video_len)
File "/mocogan/src/models.py", line 259, in sample_z_video
z_motion = self.sample_z_m(num_samples, video_len)
File "/mocogan/src/models.py", line 224, in sample_z_m
h_t.append(self.recurrent(e_t, h_t[-1]))
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 224, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/rnn.py", line 682, in forward
self.bias_ih, self.bias_hh,
File "/usr/local/lib/python2.7/dist-packages/torch/nn/_functions/rnn.py", line 49, in GRUCell
gi = F.linear(input, w_ih)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 555, in linear
output = input.matmul(weight.t())
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 560, in matmul
return torch.matmul(self, other)
File "/usr/local/lib/python2.7/dist-packages/torch/functional.py", line 173, in matmul
return torch.mm(tensor1, tensor2)
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 579, in mm
return Addmm.apply(output, self, matrix, 0, 1, True)
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/_functions/blas.py", line 26, in forward
matrix1, matrix2, out=output)
TypeError: torch.addmm received an invalid combination of arguments - got (int, torch.cuda.FloatTensor, int, torch.cuda.FloatTensor, torch.FloatTensor, out=torch.cuda.FloatTensor), but expected one of:

  • (torch.cuda.FloatTensor source, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
  • (torch.cuda.FloatTensor source, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
  • (float beta, torch.cuda.FloatTensor source, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
  • (torch.cuda.FloatTensor source, float alpha, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
  • (float beta, torch.cuda.FloatTensor source, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
  • (torch.cuda.FloatTensor source, float alpha, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
  • (float beta, torch.cuda.FloatTensor source, float alpha, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
    didn't match because some of the arguments have invalid types: (int, torch.cuda.FloatTensor, int, torch.cuda.FloatTensor, torch.FloatTensor, out=torch.cuda.FloatTensor)
  • (float beta, torch.cuda.FloatTensor source, float alpha, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
    didn't match because some of the arguments have invalid types: (int, torch.cuda.FloatTensor, int, torch.cuda.FloatTensor, torch.FloatTensor, out=torch.cuda.FloatTensor)


I think I must have made some mistake somewhere, but could you look into it and give me any clue?
