
Structure-Aware Motion Transfer with Deformable Anchor Model

Code for the CVPR 2022 paper Structure-Aware Motion Transfer with Deformable Anchor Model.

Environments

The models are trained on 4 Tesla V100 cards. PyTorch 1.6 and 1.8 with Python 3.6 are tested and work fine. Basic dependencies are listed in requirements.txt.

pip install -r requirements.txt

Datasets

TaiChiHD, VoxCeleb1, FashionVideo, and MGIF, all prepared following FOMM. After downloading and pre-processing, each dataset should be placed in the ./data folder, or you can change the root_dir parameter in the yaml config file. Note that we save the video datasets as PNG frames (for example, ./data/taichi-png/train/video-id/frames-id.png) for better training IO performance. All train and test video frames are specified in txt files in the ./data folder.
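
As a rough illustration of this pre-processing, a video can be split into the expected PNG frame layout with a short script along these lines (a minimal sketch, assuming mp4 inputs and the imageio package with its ffmpeg plugin; it is not part of this repository):

# Hypothetical helper: split one video into the PNG frame layout above,
# e.g. ./data/taichi-png/train/video-id/0000000.png. Not part of this repo.
import os
import imageio

def video_to_png(video_path, out_dir):
    os.makedirs(out_dir, exist_ok=True)
    reader = imageio.get_reader(video_path)
    for i, frame in enumerate(reader):
        # Zero-padded frame ids keep the frames lexicographically sorted.
        imageio.imwrite(os.path.join(out_dir, '%07d.png' % i), frame)
    reader.close()

video_to_png('video-id.mp4', './data/taichi-png/train/video-id')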

Checkpoints

Google Drive | Baiduyun (password: z4ej)

Training

We train the HDAM model in two stages. First, we train DAM and detect abnormal keypoints; the indexes of the detected abnormal keypoints are written to the HDAM config via the ignore_kp_list parameter. We then train the HDAM model initialized from the DAM weights.
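
For example, the relevant entry in the HDAM yaml config might look like the following (a hypothetical excerpt; only the ignore_kp_list parameter is named in this README, and the indexes shown are placeholders for whatever equivariance_detection.py reports on your dataset):

# Hypothetical excerpt of config/dataset-hdam.yaml; key layout is assumed.
model_params:
  ignore_kp_list: [3, 7]  # indexes of the detected abnormal keypoints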

Train DAM

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 run.py --config config/dataset-dam.yaml

Train HDAM

CUDA_VISIBLE_DEVICES=0 python equivariance_detection.py --config config/dataset-dam.yaml --config_hdam config/dataset-hdam.yaml --checkpoint path/to/dam/model.pth
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 run.py --config config/dataset-hdam.yaml --checkpoint path/to/dam/model.pth

Evaluation

Evaluate video reconstruction with the following command. For more metrics, we recommend FOMM-Pose-Evaluation.

CUDA_VISIBLE_DEVICES=0 python run.py --mode reconstruction --config path/to/config --checkpoint path/to/model.pth  
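
For reference, the L1 reconstruction error commonly reported for FOMM-style models can be computed from saved frames along these lines (a minimal sketch, assuming generated and ground-truth frames are stored as same-sized PNGs; the repo's reconstruction mode and FOMM-Pose-Evaluation remain the authoritative implementations):

# Minimal sketch of a per-pixel L1 reconstruction metric (FOMM-style).
# Assumes paired lists of generated and ground-truth PNG frame paths.
import numpy as np
import imageio

def l1_error(gen_paths, gt_paths):
    scores = []
    for gen_path, gt_path in zip(gen_paths, gt_paths):
        gen = imageio.imread(gen_path).astype(np.float32) / 255.0
        gt = imageio.imread(gt_path).astype(np.float32) / 255.0
        scores.append(np.abs(gen - gt).mean())
    return float(np.mean(scores))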

Demo

To make a demo animation, specify the driving video and source image; the resulting video will be saved to result.mp4.

python demo.py --config path/to/config --checkpoint path/to/model.pth --driving_video path/to/video.mp4 --source_image path/to/image.png --result_video path/to/result.mp4 --adapt_scale
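
The pretrained models work at a fixed resolution (the config names, e.g. mgif256-hdam.yaml, suggest 256x256). If your source image is not square, center-cropping and resizing it beforehand can help; here is a sketch assuming Pillow is installed (demo.py, following FOMM, may already resize inputs internally):

# Hypothetical pre-processing for demo inputs: center-crop to a square
# and resize to 256x256. Assumes Pillow; not part of this repository.
from PIL import Image

def square_resize(in_path, out_path, size=256):
    img = Image.open(in_path).convert('RGB')
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side))
    img.resize((size, size), Image.LANCZOS).save(out_path)

square_resize('path/to/image.png', 'source_256.png')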

E-commerce animation demo

We have built some applications in the e-commerce scenario, which are described in the demo paper Move as You Like.

Citation

@inproceedings{tao2022structure,
title={Structure-Aware Motion Transfer with Deformable Anchor Model},
author={Tao, Jiale and Wang, Biao and Xu, Borun and Ge, Tiezheng and Jiang, Yuning and Li, Wen and Duan, Lixin},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={3637--3646},
year={2022}
}

@inproceedings{xu2021move,
title={Move As You Like: Image Animation in E-Commerce Scenario},
author={Xu, Borun and Wang, Biao and Tao, Jiale and Ge, Tiezheng and Jiang, Yuning and Li, Wen and Duan, Lixin},
booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
pages={2759--2761},
year={2021}
}

Acknowledgements

The implementation heavily borrows from FOMM; we thank the authors for their great efforts in this area.

dam's Issues

Why does the method produce smooth video results without temporal consistency?

Hi,
First, thanks for your wonderful work and for making it open source! I am a little confused: your method ultimately processes video, converting it frame by frame. Why can such methods produce relatively smooth video results without considering the temporal consistency of the video? Or why not model temporal consistency?

Git LFS Error

Hello, I was trying to install the dependencies and it seems there is an error with Git LFS access; the file is not accessible. This is the error:

batch response: This repository is over its data quota. Account responsible for LFS bandwidth should purchase more data packs to restore access.
error: failed to fetch some objects from 'https://github.com/JialeTao/DAM.git/info/lfs'

How Can I Get Keypoints?

First of all, thanks for this model and project.
How can I get the keypoints and visualize them?

Could you provide the log files?

Hello, could you provide the training log files of DAM on the Fashion dataset? My own training keeps producing poor L1 results, and I cannot figure out what went wrong.

Can't run model on Google Colaboratory

Sorry for bothering... But for some reason I can't run your model on Google Colaboratory...
When I try running this command line:
!python demo.py --mode demo --config /content/DAM/config/mgif256-hdam.yaml --checkpoint /content/DAM/checkpoints/mgif-hdam.pth.tar --driving_video /content/DAM/demo/horse.gif --source_image /content/sample_data/212-2125063_horse-gif-google-search-horse-gif-running-animated.png --result_video /content/sample_data/result.mp4 --adapt_scale
I got an error:
demo.py: error: unrecognized arguments: --mode demo
Here is the link to my Google Colaboratory:
https://colab.research.google.com/drive/1j3wU5hKh2qMu8V4Ssd-zLF3KKfpuIpk0?usp=sharing
Thank you so much for helping!!

inference result is bad

Hello, thanks for your work.

I tried the demo using my own driving video (an mp4 of about 10 seconds) and a source image, both showing the upper body of a person, but the result was very poor, as the attached image shows.
Here is my command for inference:
python demo.py --config ./config/fashion-dam.yaml --checkpoint ./ckpt/fashion-dam.pth --driving_video ./drive_vid/demo.mp4 --source_image ./src_img/demo.png --result_video ./output/result.mp4 --adapt_scale

Two questions about the code

Can I use a single image instead of a video as the driving input in your code?
If I want to warp a non-human object, like clothing, which weights should I choose?
