mcg-nku / amt
Official code for "AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation" (CVPR2023)
Home Page: https://nk-cs-zzl.github.io/projects/amt/index.html
License: Other
Hi,
I encountered an issue while preparing the optical flow data for the Vimeo dataset.
Specifically, I downloaded the Vimeo dataset (vimeo_septuplet.zip), unzipped it, and renamed it to /mnt/data_nas/srwang/vimeo/vimeo_triplet/{readme.txt; sequences; tri_testlist.txt; tri_trainlist.txt}.
After following the instructions in https://github.com/MCG-NKU/AMT/blob/main/docs/develop.md to set up the environment, I ran python flow_generation/gen_flow.py -r /mnt/data_nas/srwang/vimeo/vimeo_triplet. It fails with the error "getopt.GetoptError: option -r not recognized".
The detailed snapshot for the bug is attached.
Thanks for your time and looking forward to your reply.
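The message comes from Python's standard getopt module, which raises GetoptError for any flag that is missing from the option string passed to getopt.getopt. The sketch below only reproduces that behaviour; the option strings "hi:" and "hr:", the helper parse_root, and the path are assumptions for illustration and are not taken from gen_flow.py.

```python
import getopt


def parse_root(argv, shortopts):
    """Return the value passed with -r, or None if -r was not supplied."""
    opts, _ = getopt.getopt(argv, shortopts)
    for flag, value in opts:
        if flag == "-r":
            return value
    return None


argv = ["-r", "/data/vimeo_triplet"]  # hypothetical dataset root

# An option string that does not declare "r:" rejects the flag.
try:
    parse_root(argv, "hi:")
    error_message = None
except getopt.GetoptError as exc:
    error_message = str(exc)

# Declaring "r:" (the colon means "takes a value") accepts it.
root = parse_root(argv, "hr:")

print(error_message)
print(root)
```

If the script in your checkout declares a different flag for the dataset root, calling it with that flag, or checking its option string, is probably the quickest way to confirm the cause.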
By the way, the device == 'cuda' check
isn't working, but if I change the line in bold, it works:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('using device =', device)
cfg_path = args.config
ckpt_path = args.ckpt
input_path = args.input
out_path = args.out_path
iters = int(args.niters)
frame_rate = int(args.frame_rate)
save_images = args.save_images
print('using device =', device)
**if device != 'cpu':**
    print('using cuda mode')
    anchor_resolution = 1024 * 512
    anchor_memory = 1500 * 1024**2
    anchor_memory_bias = 2500 * 1024**2
    vram_avail = torch.cuda.get_device_properties(device).total_memory
    print("VRAM available: {:.1f} MB".format(vram_avail / 1024 ** 2))
else:
    print('using cpu mode')
    # Do not resize in cpu mode
    anchor_resolution = 8192*8192
    anchor_memory = 1
    anchor_memory_bias = 0
    vram_avail = 1
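Comparing a torch.device object directly with a plain string may not behave as expected, so a check such as device != 'cpu' could select the CUDA branch even on a CPU-only machine. One way to sidestep this is to test the device's string form (for a torch.device, device.type == 'cuda' is equivalent). A minimal sketch; is_cuda_device is a hypothetical helper, not part of the demo script:

```python
def is_cuda_device(device):
    """Return True if `device` (a str or a torch.device-like object) names a CUDA device.

    str(torch.device('cuda')) is 'cuda' and str(torch.device('cuda', 0)) is 'cuda:0',
    so the string form works for both plain strings and device objects.
    """
    return str(device).startswith("cuda")


# Plain strings are used here so the sketch runs without PyTorch installed.
for name in ("cuda", "cuda:0", "cpu"):
    print(name, is_cuda_device(name))
```

In the snippet above, the branch condition would then read if is_cuda_device(device): instead of comparing against a string.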
Why not train the arbitrary-time model on the septuplet set of Vimeo90K, which is larger than GOPRO?
I ran some tests, and the arbitrary-time model trained on GOPRO does not perform well.
Thanks for your wonderful work. I would like to know the meaning of "Avg. flow" in the caption of Fig. 6. Also, the last decoder in AMT-S outputs flows with 3 channels, but visualizing optical flow requires 2 channels. How do you handle that?
Looking forward to your reply. Thanks again.
Very nice work!
May I know how long it takes to train the proposed method on the Vimeo dataset?
For example, with two 3090 GPUs.
Best, and many thanks
Getting this error when running last cell in colab:
#@title # 3. Enjoy the Smooth Video
import PIL.Image
if not hasattr(PIL.Image, 'Resampling'):  # Pillow < 9.0
    PIL.Image.Resampling = PIL.Image
model_type_upper = model_type.upper()
!python3 '/content/AMT/demos/demo_2x.py' \
    --config cfgs/{model_type_upper}.yaml \
    --ckpt pretrained/{model_type}.pth \
    --out_path {outputs_dir} \
    --frame_rate {output_video_fps}
from mediapy import read_video, show_video
video = read_video(f'{outputs_dir}/demo.avi')
show_video(video, fps=output_video_fps)
I also had to change the name of the Python file being run to demo_2x.py; I believe the name originally written there is a typo.
I would like to know if the AMT algorithm interpolates the video by specifying the output FPS rather than by specifying the output -n value?
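For the 2x demo, the relation between the iteration count and the resulting frame rate can be worked out by hand. The sketch below assumes that each iteration inserts one frame between every adjacent pair (consistent with the log line "Iter 1. input_frames=2 output_frames=3" reported further down this page) and that the clip duration is kept unchanged; the function names are hypothetical.

```python
def frames_after_2x(num_inputs, niters):
    """Frame count after `niters` passes, each inserting one frame per adjacent pair."""
    frames = num_inputs
    for _ in range(niters):
        frames = 2 * frames - 1
    return frames


def fps_after_2x(input_fps, niters):
    """Frame rate needed to keep the clip duration unchanged after `niters` 2x passes."""
    return input_fps * 2 ** niters


print(frames_after_2x(2, 1))
print(fps_after_2x(30, 2))
```

Under these assumptions, only frame rates of the form input_fps * 2**n are reachable by repeated 2x passes; other targets need arbitrary-timestep interpolation.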
First of all, thank you for your excellent work!
While reading your paper, I noticed that you mentioned your FLOPs calculation is based on 720p frames (1280x720). I ran some tests on 720p frames using thop, and the results I obtained were slightly different from those in your paper. I was wondering whether there is a mistake in my code:
import sys
sys.path.append('.')
import torch
import argparse
from omegaconf import OmegaConf
from thop import profile
from thop import clever_format
from utils.build_utils import build_from_cfg
parser = argparse.ArgumentParser(
    prog = 'AMT',
    description = 'Speed&parameter benchmark',
)
parser.add_argument('-c', '--config', default='cfgs/AMT-L.yaml')
parser.add_argument('--H', default=256, type=int)
parser.add_argument('--W', default=256, type=int)
args = parser.parse_args()
cfg_path = args.config
network_cfg = OmegaConf.load(cfg_path).network
network_name = network_cfg.name
model = build_from_cfg(network_cfg)
model = model.cuda()
model.eval()
img0_720p = torch.randn(1, 3, 720, 1280).cuda()
img1_720p = torch.randn(1, 3, 720, 1280).cuda()
embt = torch.tensor(1/2).float().view(1, 1, 1, 1).cuda()
flops, params = profile(model, inputs=(img0_720p, img1_720p, embt, True))
flops, params = clever_format([flops, params], "%.3f")
print('(THOP) 720p Flops: ', flops)
print('(THOP) 720p Params: ', params)
As a result, I got:
For AMT-S: 0.12T v.s. 0.12T in your paper
For AMT-L: 0.66T v.s. 0.58T in your paper
For AMT-G: 2.24T v.s. 2.07T in your paper
Could you please check my code for any possible mistakes? Also, do you have a FLOPs calculation code you could share? Thank you very much in advance!
Hello,
How long does it take to train the model?
Thank you
Thank you very much for AMT!
When I run (python demos/demo_2x.py ...), I get this error:
torch.cuda.OutOfMemoryError: CUDA out of memory
Loading [images] from [['image\\panda_0.png', 'image\\panda_1.png']], the number of images = [2]
anchor_resolution 67108864
Loading [networks.AMT-G.Model] from [pretrained/amt-g.pth]...
Start frame interpolation:
Iter 1. input_frames=2 output_frames=3
Traceback (most recent call last):
  File "D:\Software\AI\AMT\2304\amt_1\demos\demo_2x.py", line 190, in <module>
    imgt_pred = model(in_0, in_1, embt, scale_factor=scale, eval=True)['imgt_pred']
  File "D:\Program\conda\envs\py39\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Software\AI\AMT\2304\amt_1\.\networks\AMT-G.py", line 86, in forward
    corr_fn = BidirCorrBlock(fmap0, fmap1, radius=self.radius, num_levels=self.corr_levels)
  File "D:\Software\AI\AMT\2304\amt_1\.\networks\blocks\raft.py", line 161, in __init__
    corr_T = F.avg_pool2d(corr_T, 2, stride=2)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.46 GiB (GPU 0; 8.00 GiB total capacity; 6.48 GiB already allocated; 0 bytes free; 6.50 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
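The allocation fails while the correlation pyramid (BidirCorrBlock) is being built. An all-pairs correlation volume grows with the square of the number of feature-map pixels, so memory rises very quickly with input resolution. A rough back-of-the-envelope sketch, assuming features at 1/8 of the input resolution and float32 storage (both are assumptions, not confirmed by the traceback); corr_volume_bytes is a hypothetical helper:

```python
def corr_volume_bytes(height, width, downscale=8, bytes_per_elem=4):
    """Estimate the size of one all-pairs correlation volume.

    Every feature-map pixel is correlated with every other one, so the
    volume holds (h * w) ** 2 values, where h and w are the feature-map sizes.
    """
    h, w = height // downscale, width // downscale
    return (h * w) ** 2 * bytes_per_elem


def to_gib(num_bytes):
    """Convert bytes to GiB for readability."""
    return num_bytes / 1024 ** 3


for height, width in [(720, 1280), (1080, 1920), (2160, 3840)]:
    size = corr_volume_bytes(height, width)
    print(f"{width}x{height}: {to_gib(size):.2f} GiB per volume")
```

Under these assumptions, halving both input dimensions shrinks the volume by a factor of about 16, which is why lowering the input resolution (or the scale factor passed to the model) tends to be the most effective way to fit into 8 GiB.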
Thanks for your work!
I wonder how to interpolate from 16 fps to 24 fps.
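A non-integer ratio such as 16 to 24 fps cannot be reached by repeated 2x passes, so each output frame needs its own timestep between two input frames. The sketch below only computes that schedule; interpolation_plan is a hypothetical helper, and whether a given checkpoint accepts arbitrary timesteps depends on the model (this page mentions an arbitrary-time model trained on GOPRO).

```python
from fractions import Fraction


def interpolation_plan(num_inputs, src_fps, dst_fps):
    """Return one (left_index, t) pair per output frame.

    left_index is the input frame just before (or at) the output time, and
    t in [0, 1) is the position between that frame and the next one.
    t == 0 means the input frame can be copied as-is.
    """
    step = Fraction(src_fps, dst_fps)  # output spacing measured in input frames
    plan = []
    k = 0
    while k * step <= num_inputs - 1:
        pos = k * step
        left = int(pos)  # floor, since pos is non-negative
        plan.append((left, pos - left))
        k += 1
    return plan


# Three input frames at 16 fps resampled to 24 fps.
for left, t in interpolation_plan(3, 16, 24):
    print(left, t)
```

Using exact fractions avoids floating-point drift when the schedule is computed for long clips.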
rm: cannot remove 'results': No such file or directory
Loading [images] from [['assets/quick_demo/a.JPG', 'assets/quick_demo/b.JPG']], the number of images = [2]
Traceback (most recent call last):
  File "/content/AMT/demos/demo_2x.py", line 104, in
    inputs = [img2tensor(read(img_path)).to(device) for img_path in input_path]
  File "/content/AMT/demos/demo_2x.py", line 104, in
    inputs = [img2tensor(read(img_path)).to(device) for img_path in input_path]
  File "/content/AMT/./utils/utils.py", line 106, in read
    else: raise Exception('don't know how to read %s' % file)
Exception: don't know how to read assets/quick_demo/a.JPG
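The failing path ends in an upper-case .JPG. One plausible cause (an assumption, since utils.py is not shown here) is an extension check that only matches lower-case suffixes. A minimal sketch of a case-insensitive check; IMAGE_EXTS and is_image_path are hypothetical names, and the extension list is illustrative only:

```python
import os

# Illustrative list only; the formats the repository actually supports may differ.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".ppm", ".bmp"}


def is_image_path(path):
    """Return True if `path` has a known image extension, ignoring case."""
    ext = os.path.splitext(path)[1].lower()
    return ext in IMAGE_EXTS


print(is_image_path("assets/quick_demo/a.JPG"))
print(is_image_path("results/demo_0000.mp4"))
```

If that is indeed the cause, renaming the inputs to a.jpg and b.jpg would be a workaround that needs no code change.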
1132 if not pathlib.Path(path).is_file():
-> 1133 raise RuntimeError(f"Video file '{path}' is not found.")
1134 command = [
1135 _get_ffmpeg_path(),
RuntimeError: Video file 'results/demo_0000.mp4' is not found.