
single_video_generation's Issues

AssertionError in run_generation.py

Hello, I'm having some problems trying to run the code.

First of all, I ran into a device issue in utils/resize_right.py and had to add an fw_set_device call in get_field_of_view:

mirror = fw_cat((fw.arange(in_sz), fw.arange(in_sz - 1, -1, step=-1)), fw)
mirror = fw_set_device(mirror, projected_grid.device, fw) # <-- avoids device error in the following line
field_of_view = mirror[fw.remainder(field_of_view, mirror.shape[0])]
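For anyone hitting the same error, the idea behind this workaround can be sketched in plain PyTorch. This is only an illustration, assuming fw is torch here; the function name mirror_lookup is hypothetical and not part of resize_right.py. The lookup table is moved to the device of the index tensor before indexing, so both tensors live on the same device.

```python
import torch


def mirror_lookup(field_of_view: torch.Tensor, in_sz: int) -> torch.Tensor:
    """Map (possibly out-of-range) indices into [0, in_sz) by mirror padding.

    Hypothetical helper for illustration only; not the repository's code.
    """
    # Lookup table 0..in_sz-1 followed by in_sz-1..0 (created on the CPU by default).
    mirror = torch.cat((torch.arange(in_sz), torch.arange(in_sz - 1, -1, step=-1)))
    # Move the table to the device of the indices to avoid a device mismatch.
    mirror = mirror.to(field_of_view.device)
    return mirror[torch.remainder(field_of_view, mirror.shape[0])]


device = "cuda" if torch.cuda.is_available() else "cpu"
indices = torch.tensor([-1, 0, 2, 3, 4], device=device)
result = mirror_lookup(indices, in_sz=3)
print(result.tolist())
```

On a CPU-only machine both tensors are already on the same device, so the `.to(...)` call is a no-op there.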

Now I'm running into an AssertionError related to the temporal dimension:

Namespace(gpu='0', results_dir='output/', frames_dir='data/', start_frame=1, end_frame=15, max_size=144, min_size=(3, 15), downfactor=(0.85, 0.85), J=5, J_start_from=1, kernel_size=(3, 7, 7), sthw=(0.5, 1, 1), reduce='median', vgpnn_type='pm', use_noise=True, noise_amp=5, verbose=True, save_intermediate=True, save_intermediate_path='output/')
Traceback (most recent call last):
  File "/home/hans/code/vgpnn/run_generation.py", line 82, in <module>
    VGPNN, orig_vid = vgpnn.get_vgpnn(
  File "/home/hans/code/vgpnn/vgpnn.py", line 293, in get_vgpnn
    assert (
AssertionError: smallest pyramid level has less frames 2 than temporal kernel-size 3. You may want to increase min_size of the temporal dimension

I'm using all the default settings. I've only set --frames_dir and --results_dir. My video is square and about 300 frames long (but the default settings only look at the first 15).

I've also tried setting --min_size 4,15 but ran into exactly the same error.

Any idea what could be going wrong?

run_generation.py stops with an error

When I run it on the ballet example like this:

python3 run_generation.py --frames_dir=/home/frank/tmp/ballet/ballet_Wz_f9B4pPtg --start_frame=1 --end_frame=60

Then I get this error:

Namespace(gpu='0', results_dir='./results/generation', frames_dir='/home/frank/tmp/ballet/ballet_Wz_f9B4pPtg', start_frame=1, end_frame=60, max_size=144, min_size=(3, 15), downfactor=(0.85, 0.85), J=5, J_start_from=1, kernel_size=(3, 7, 7), sthw=(0.5, 1, 1), reduce='median', vgpnn_type='pm', use_noise=True, noise_amp=5, verbose=True, save_intermediate=True, save_intermediate_path='./results/generation')
/home/frank/.local/lib/python3.11/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
Traceback (most recent call last):
  File "/home/frank/tmp/single_video_generation/run_generation.py", line 70, in <module>
    VGPNN, orig_vid = vgpnn.get_vgpnn(
                      ^^^^^^^^^^^^^^^^
  File "/home/frank/tmp/single_video_generation/vgpnn.py", line 215, in get_vgpnn
    orig_vid = read_original_video(o.frames_dir, o.start_frame, o.end_frame, o.max_size, o.device, verbose=verbose, ext=ext)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/frank/tmp/single_video_generation/utils/main_utils.py", line 48, in read_original_video
    frames = read_frames(frames_dir, start_frame, end_frame, resizer, device=device, verbose=verbose, ext=ext)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/frank/tmp/single_video_generation/utils/main_utils.py", line 35, in read_frames
    x = frame_resizer(x)
        ^^^^^^^^^^^^^^^^
  File "/home/frank/tmp/single_video_generation/utils/resize_right.py", line 71, in resize
    field_of_view, weights = prepare_weights_and_field_of_view_1d(
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/frank/tmp/single_video_generation/utils/resize_right.py", line 165, in prepare_weights_and_field_of_view_1d
    field_of_view = get_field_of_view(projected_grid, cur_support_sz, in_sz,
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/frank/tmp/single_video_generation/utils/resize_right.py", line 284, in get_field_of_view
    field_of_view = mirror[fw.remainder(field_of_view, mirror.shape[0])]
                    ~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

Thank you very much for your work. While running your code, I encountered some problems. May I know how to solve them?

(vgpnn) C:\ai\single_video_generation-main>python run_generation.py --gpu 0 --frames_dir data/airballoons_QGAMTlI6XxY --start_frame 66 --end_frame 80 --max_size 360 --use_noise False --min_size '(6,40)'
Namespace(J=5, J_start_from=1, downfactor=(0.85, 0.85), end_frame=80, frames_dir='data/airballoons_QGAMTlI6XxY', gpu='0', kernel_size=(3, 7, 7), max_size=360, min_size='(6,40)', noise_amp=5, reduce='median', results_dir='./results/generation', save_intermediate=True, save_intermediate_path='./results/generation', start_frame=66, sthw=(0.5, 1, 1), use_noise=False, verbose=True, vgpnn_type='pm')
Traceback (most recent call last):
  File "run_generation.py", line 70, in <module>
    VGPNN, orig_vid = vgpnn.get_vgpnn(
  File "C:\ai\single_video_generation-main\vgpnn.py", line 217, in get_vgpnn
    downscales, upscale_factors, out_shapes = scale_utils.get_scales_out_shapes(T, H, W, o.downfactor, o.min_size)
  File "C:\ai\single_video_generation-main\utils\scale_utils.py", line 48, in get_scales_out_shapes
    assert T >= min_size[0], f"min_size ({min_size[0]},{min_size[1]}) larger than original size ({T},{H},{W}) (it must be smaller)"
TypeError: '>=' not supported between instances of 'int' and 'str'

(vgpnn) C:\ai\single_video_generation-main>python run_generation.py --gpu 0 --frames_dir data/airballoons_QGAMTlI6XxY --start_frame 6 --end_frame 80 --use_noise False
Namespace(J=5, J_start_from=1, downfactor=(0.85, 0.85), end_frame=80, frames_dir='data/airballoons_QGAMTlI6XxY', gpu='0', kernel_size=(3, 7, 7), max_size=144, min_size=(3, 15), noise_amp=5, reduce='median', results_dir='./results/generation', save_intermediate=True, save_intermediate_path='./results/generation', start_frame=6, sthw=(0.5, 1, 1), use_noise=False, verbose=True, vgpnn_type='pm')
Traceback (most recent call last):
  File "run_generation.py", line 70, in <module>
    VGPNN, orig_vid = vgpnn.get_vgpnn(
  File "C:\ai\single_video_generation-main\vgpnn.py", line 226, in get_vgpnn
    assert ret_pyr[0].shape[2] >= o.kernel_size[0], f'smallest pyramid level has less frames {ret_pyr[0].shape[2]} than temporal kernel-size {o.kernel_size[0]}. You may want to increase min_size of the temporal dimension'
AssertionError: smallest pyramid level has less frames 2 than temporal kernel-size 3. You may want to increase min_size of the temporal dimension
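
A note on the first traceback above: the Namespace shows min_size='(6,40)', a string rather than a tuple, which is why the comparison T >= min_size[0] raises the TypeError. Below is a minimal sketch of how an argparse option could convert such input into a tuple of integers. The int_tuple helper and this parser are hypothetical and only illustrate the idea; they are not the repository's actual argument handling.

```python
import argparse


def int_tuple(text: str) -> tuple:
    """Parse '6,40' or '(6,40)' (optionally wrapped in quotes) into (6, 40).

    Hypothetical converter for illustration only.
    """
    # Drop surrounding whitespace, stray quote characters and parentheses.
    cleaned = text.strip().strip("'\"").strip().strip("()")
    try:
        return tuple(int(part) for part in cleaned.split(","))
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"expected comma-separated integers, got {text!r}"
        )


parser = argparse.ArgumentParser()
# With type=int_tuple, argparse converts the raw string before it is stored.
parser.add_argument("--min_size", type=int_tuple, default=(3, 15))

args = parser.parse_args(["--min_size", "(6,40)"])
print(args.min_size)
```

With a converter like this, both --min_size 6,40 and --min_size '(6,40)' would reach the size check as integers instead of a string.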
