
MCG-NKU / AMT


Official code for "AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation" (CVPR2023)

Home Page: https://nk-cs-zzl.github.io/projects/amt/index.html

License: Other

Python 99.75% Shell 0.25%
video backward-warp cvpr2023 frame-interpolation optical-flow slomo video-generation

amt's People

Contributors

nefujoeychen, nk-cs-zzl, paper99


amt's Issues

Error preparing optical flow for retraining on the Vimeo dataset when running "python flow_generation/gen_flow.py -r data/vimeo_triplet"

Hi,
I ran into an issue while preparing the optical flow data for the Vimeo dataset.
Specifically, I downloaded the Vimeo dataset (vimeo_septuplet.zip), unzipped it, and renamed it so that the layout is /mnt/data_nas/srwang/vimeo/vimeo_triplet/{readme.txt; sequences; tri_testlist.txt; tri_trainlist.txt}.
After preparing the environment as described in https://github.com/MCG-NKU/AMT/blob/main/docs/develop.md, I ran python flow_generation/gen_flow.py -r /mnt/data_nas/srwang/vimeo/vimeo_triplet and got the error "getopt.GetoptError: option -r not recognized".
A screenshot of the error is attached.
Thanks for your time; I look forward to your reply.

image
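For context, the message quoted above is the one Python's standard getopt module raises when a short option is missing from the option string passed to getopt.getopt. The sketch below only reproduces that behavior in isolation. The parse helper and the option strings "h" and "hr:" are hypothetical and are not taken from gen_flow.py, whose actual option list is not shown here.

```python
import getopt


def parse(argv, shortopts):
    """Parse argv with getopt and return the options, or the error text."""
    try:
        opts, _rest = getopt.getopt(argv, shortopts)
    except getopt.GetoptError as err:
        return "error: " + str(err)
    return dict(opts)


args = ["-r", "data/vimeo_triplet"]

# "-r" is not declared in the option string, so getopt rejects it.
undeclared = parse(args, "h")

# Declaring "r:" (the colon means "takes a value") makes the same call succeed.
declared = parse(args, "hr:")

print(undeclared)
print(declared)
```

If the script really does use getopt, checking which option letters it declares (or running it with its help flag, if it has one) would show which flag it expects for the dataset root.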

The huggingface demo is down

Hello:
The huggingface demo is down. The error message is "Runtime error Memory limit exceeded (16Gi)".

image

By the way, I would be grateful if you could make AMT into an installable pip package.

CUDA available but still not using CUDA

By the way, the device == 'cuda' check isn't working. If I change it to the line marked below, it works:

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('using device =', device)
cfg_path = args.config
ckpt_path = args.ckpt
input_path = args.input
out_path = args.out_path
iters = int(args.niters)
frame_rate = int(args.frame_rate)
save_images = args.save_images
print('using device =', device)
if device != 'cpu':  # <-- changed line
    print('using cuda mode')
    anchor_resolution = 1024 * 512
    anchor_memory = 1500 * 1024**2
    anchor_memory_bias = 2500 * 1024**2
    vram_avail = torch.cuda.get_device_properties(device).total_memory
    print("VRAM available: {:.1f} MB".format(vram_avail / 1024 ** 2))
else:
    print('using cpu mode')
    # Do not resize in cpu mode
    anchor_resolution = 8192*8192
    anchor_memory = 1
    anchor_memory_bias = 0
    vram_avail = 1
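
One possible explanation, offered as an assumption rather than a confirmed diagnosis: device is a torch.device object, and comparing it directly with a plain string may not behave like a string comparison. In that case device == 'cuda' can be false on a GPU machine, and device != 'cpu' can be true even on a CPU-only machine. Comparing the .type attribute that torch.device exposes avoids the ambiguity. The helpers device_kind and use_cuda_settings below are hypothetical names, not part of AMT, and the SimpleNamespace objects are stand-ins so the sketch runs without PyTorch.

```python
from types import SimpleNamespace


def device_kind(device):
    """Return the device family ('cuda', 'cpu', ...) for a device object or string."""
    kind = getattr(device, "type", None)  # torch.device exposes a .type attribute
    if kind is None:
        kind = str(device).split(":", 1)[0]  # e.g. 'cuda:0' -> 'cuda'
    return kind


def use_cuda_settings(device):
    """True only when the device family is CUDA."""
    return device_kind(device) == "cuda"


# Stand-ins for torch.device objects (assumption: only .type matters here).
fake_cuda = SimpleNamespace(type="cuda", index=0)
fake_cpu = SimpleNamespace(type="cpu", index=None)

print(use_cuda_settings(fake_cuda))
print(use_cuda_settings(fake_cpu))
print(use_cuda_settings("cuda:0"))
```

In the demo script this would amount to writing the branch condition as device.type == 'cuda', so the CPU branch is still reachable on machines without a GPU.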

Green screen only

image

With n = 1 and r = 8, all I get is a video of this green screen. Any idea why?

Question about arbitrary time model

Why not train the arbitrary-time model on the septuplet set of Vimeo90K, which is larger than GoPro?
I ran some tests, and the arbitrary-time model trained on GoPro does not perform well.

The meaning of Avg. flow in Fig. 6 of the paper

Thanks for your wonderful work. I would like to know the meaning of "Avg. flow" in the caption of Fig. 6. Also, in AMT-S the last decoder outputs flows with 3 channels, but visualizing optical flow requires 2 channels. How do you handle that?

Looking forward to your reply. Thanks again.

error

When I run python flow_generation/gen_flow.py -r data/vimeo_triplet,
I get an error:
image

The training time cost

Very nice work!

May I know how long it takes to train the proposed method on the Vimeo dataset,
for example with two 3090 GPUs?

Best regards and many thanks

NameError: name 'anchor_resolution' is not defined

Getting this error when running last cell in colab:

#@title # 3. Enjoy the Smooth Video
import PIL.Image
if not hasattr(PIL.Image, 'Resampling'):  # Pillow < 9.0
    PIL.Image.Resampling = PIL.Image
model_type_upper = model_type.upper()
!python3 '/content/AMT/demos/demo_2x.py' \
    --config cfgs/{model_type_upper}.yaml \
    --ckpt pretrained/{model_type}.pth \
    --out_path {outputs_dir} \
    --frame_rate {output_video_fps}
from mediapy import read_video, show_video
video = read_video(f'{outputs_dir}/demo.avi')
show_video(video, fps=output_video_fps)

I also had to change the name of the Python file being run to demo_2x.py; I believe the name originally written in the cell is a typo.

Specifying the output FPS

I would like to know whether AMT can interpolate a video to a specified output FPS, rather than by specifying the -n value.
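
As a rough way to relate the two, here is a minimal sketch that assumes -n counts recursive 2x interpolation passes. That assumption is based only on the demo script's name (demo_2x.py) and on the log line "Iter 1. input_frames=2 output_frames=3" quoted in another issue on this page. The function names iterations_for_fps and frames_after are hypothetical and not part of AMT.

```python
def iterations_for_fps(input_fps, target_fps):
    """Smallest n such that input_fps * 2**n reaches at least target_fps."""
    if input_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    n = 0
    while input_fps * 2 ** n < target_fps:
        n += 1
    return n


def frames_after(num_frames, n):
    """Frame count after n passes, each inserting one frame between neighbors."""
    return (num_frames - 1) * 2 ** n + 1


n = iterations_for_fps(24, 60)
print("passes needed:", n)
print("effective fps:", 24 * 2 ** n)
print("frames from a 2-frame input after 1 pass:", frames_after(2, 1))
```

Under this assumption only power-of-two multiples of the input frame rate are reachable directly. Any other target would need an arbitrary-time model or dropping frames after interpolation.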

About FLOPs calculation

First of all, thank you for your excellent work!
While reading your paper, I noticed that your FLOPs are computed on 720p frames (1280x720). I ran some tests on 720p frames using thop, and the results differ from those in your paper for two of the three models. I was wondering whether there is a mistake in my code:

import sys
sys.path.append('.')
import torch
import argparse
from omegaconf import OmegaConf
from thop import profile
from thop import clever_format
from utils.build_utils import build_from_cfg

parser = argparse.ArgumentParser(
    prog='AMT',
    description='Speed&parameter benchmark',
)
parser.add_argument('-c', '--config', default='cfgs/AMT-L.yaml')
parser.add_argument('--H', default=256, type=int)
parser.add_argument('--W', default=256, type=int)
args = parser.parse_args()

cfg_path = args.config
network_cfg = OmegaConf.load(cfg_path).network
network_name = network_cfg.name
model = build_from_cfg(network_cfg)
model = model.cuda()
model.eval()
img0_720p = torch.randn(1, 3, 720, 1280).cuda()
img1_720p = torch.randn(1, 3, 720, 1280).cuda()
embt = torch.tensor(1/2).float().view(1, 1, 1, 1).cuda()

flops, params = profile(model, inputs=(img0_720p, img1_720p, embt, True))
flops, params = clever_format([flops, params], "%.3f")
print('(THOP) 720p Flops: ', flops)
print('(THOP) 720p Params: ', params)

As a result, I got:
AMT-S: 0.12T vs. 0.12T in your paper
AMT-L: 0.66T vs. 0.58T in your paper
AMT-G: 2.24T vs. 2.07T in your paper
Could you please check my code for possible mistakes? Also, could you share your FLOPs calculation code? Thank you very much in advance!
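
To put the size of the mismatch in numbers, the following sketch computes the relative gap between the values measured in the post and the values it quotes from the paper. It uses only the figures above; relative_gap is a hypothetical helper name.

```python
# Values quoted in the post above, in the "T" units printed by thop's clever_format.
measured = {"AMT-S": 0.12, "AMT-L": 0.66, "AMT-G": 2.24}
paper = {"AMT-S": 0.12, "AMT-L": 0.58, "AMT-G": 2.07}


def relative_gap(measured_value, paper_value):
    """Signed difference as a fraction of the paper's value."""
    return (measured_value - paper_value) / paper_value


gaps = {name: relative_gap(measured[name], paper[name]) for name in measured}
for name, gap in gaps.items():
    print(f"{name}: {gap:+.1%}")
```

The gap is about 14% for AMT-L and about 8% for AMT-G, while AMT-S matches. One thing that may be worth checking, stated as an assumption because the model's forward signature is not shown here: the traceback in another issue on this page calls the model with scale_factor=scale and eval=True as keyword arguments, so the bare positional True passed to profile might bind to a parameter other than eval.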

torch.cuda.OutOfMemoryError: CUDA out of memory

Thank you very much for AMT!

When I run python demos/demo_2x.py ..., I get this error:
torch.cuda.OutOfMemoryError: CUDA out of memory

Loading [images] from [['image\\panda_0.png', 'image\\panda_1.png']], the number of images = [2]
anchor_resolution 67108864
Loading [networks.AMT-G.Model] from [pretrained/amt-g.pth]...
Start frame interpolation:
Iter 1. input_frames=2 output_frames=3
Traceback (most recent call last):
  File "D:\Software\AI\AMT\2304\amt_1\demos\demo_2x.py", line 190, in <module>
    imgt_pred = model(in_0, in_1, embt, scale_factor=scale, eval=True)['imgt_pred']
  File "D:\Program\conda\envs\py39\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Software\AI\AMT\2304\amt_1\.\networks\AMT-G.py", line 86, in forward
    corr_fn = BidirCorrBlock(fmap0, fmap1, radius=self.radius, num_levels=self.corr_levels)
  File "D:\Software\AI\AMT\2304\amt_1\.\networks\blocks\raft.py", line 161, in __init__
    corr_T = F.avg_pool2d(corr_T, 2, stride=2)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.46 GiB (GPU 0; 8.00 GiB total capacity; 6.48 GiB already allocated; 0 bytes free; 6.50 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
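
Two numbers in this log can be checked with simple arithmetic. First, the requested allocation plus what is already allocated exceeds the card's total capacity, so reserved and allocated memory being nearly equal points away from fragmentation. Second, the logged anchor_resolution equals 8192 * 8192, the value set in the "Do not resize in cpu mode" branch quoted in the CUDA issue earlier on this page. That suggests, as an assumption rather than a confirmed diagnosis, that the demo took the no-resize branch even though the model ran on the GPU.

```python
# Figures copied from the error message above (GiB).
total_gib = 8.00
allocated_gib = 6.48
requested_gib = 2.46

# How far the request overshoots the card's total capacity.
shortfall_gib = allocated_gib + requested_gib - total_gib
print(f"needed beyond capacity: {shortfall_gib:.2f} GiB")

# Value assigned in the quoted CPU branch vs. the value printed in this log.
no_resize_anchor = 8192 * 8192
logged_anchor = 67108864
print("log matches the no-resize anchor:", no_resize_anchor == logged_anchor)
```

If that reading is right, making the demo select its GPU settings correctly (so the input is downscaled to fit available VRAM) or using a smaller model such as AMT-S would be the first things to try on an 8 GiB card.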

error

rm: cannot remove 'results': No such file or directory
Loading [images] from [['assets/quick_demo/a.JPG', 'assets/quick_demo/b.JPG']], the number of images = [2]
Traceback (most recent call last):
  File "/content/AMT/demos/demo_2x.py", line 104, in <module>
    inputs = [img2tensor(read(img_path)).to(device) for img_path in input_path]
  File "/content/AMT/demos/demo_2x.py", line 104, in <listcomp>
    inputs = [img2tensor(read(img_path)).to(device) for img_path in input_path]
  File "/content/AMT/./utils/utils.py", line 106, in read
    else: raise Exception('don't know how to read %s' % file)
Exception: don't know how to read assets/quick_demo/a.JPG
1132 if not pathlib.Path(path).is_file():
-> 1133 raise RuntimeError(f"Video file '{path}' is not found.")
1134 command = [
1135 _get_ffmpeg_path(),

RuntimeError: Video file 'results/demo_0000.mp4' is not found.
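
The first traceback fails on a file named a.JPG, with the extension in upper case. If the read function dispatches on the file extension with a case-sensitive comparison (an assumption; the body of utils.py is not shown here), lower-casing the extension before the lookup would accept such files. The sketch below illustrates the idea with a hypothetical helper is_supported_image and a hypothetical extension set, neither of which is taken from AMT.

```python
import os

# Hypothetical set of extensions; the list AMT actually supports is not shown here.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".ppm"}


def is_supported_image(path):
    """Check the extension case-insensitively, so 'a.JPG' matches '.jpg'."""
    ext = os.path.splitext(path)[1].lower()
    return ext in IMAGE_EXTS


print(is_supported_image("assets/quick_demo/a.JPG"))
print(is_supported_image("assets/quick_demo/notes.txt"))
```

Renaming the input files to lower-case extensions is a workaround that needs no code change. The second error (the missing results/demo_0000.mp4) would then simply be a consequence of the first, since no video is written when loading the inputs fails.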
