This project forked from tiger-ai-lab/anyv2v

A Plug-and-Play Framework For Any Video-to-Video Editing Tasks

Home Page: https://tiger-ai-lab.github.io/AnyV2V/

License: MIT License

AnyV2V

๐ŸŒ Homepage | ๐Ÿค— HuggingFace Paper | ๐Ÿ“– arXiv | ๐ŸŽฌ Replicate Demo

This repo contains the codebase for the paper "AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks".

Introduction

AnyV2V is a tuning-free framework to achieve high appearance and temporal consistency in video editing.

  • It can seamlessly build on top of advanced image editing methods to perform diverse types of editing.
  • It uses the I2V model's inherent knowledge to achieve robust performance on four tasks:
    • prompt-based editing
    • reference-based style transfer
    • subject-driven editing
    • identity manipulation

📰 News

โ–ถ๏ธ Quick Start for AnyV2V(i2vgen-xl)

Environment

Prepare the codebase of the AnyV2V project and Conda environment using the following commands:

git clone https://github.com/TIGER-AI-Lab/AnyV2V
cd AnyV2V

cd i2vgen-xl
conda env create -f environment.yml

📜 Notebook Demo

We provide a notebook demo, i2vgen-xl/demo.ipynb, for AnyV2V(i2vgen-xl). You can run the notebook to perform prompt-based editing on a single video. Make sure the environment is set up correctly before running the notebook.

To edit multiple demo videos, please refer to the Video Editing section.

Video Editing

We provide demo source videos and edited images in the demo folder. Below are the instructions for performing video editing on the provided source videos. Navigate to i2vgen-xl/configs/group_ddim_inversion and i2vgen-xl/configs/group_pnp_edit:

  1. Modify the template.yaml files to specify the device.
  2. Modify the group_config.json files according to the provided examples. The configurations in group_config.json will override the configurations in template.yaml. To enable an example, set active: true; to disable it, set active: false.
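The override rule in step 2 can be pictured as a dictionary merge. The following is a minimal sketch, not the repository's actual loader: it assumes group_config.json holds a list of entries and that the merge is shallow. Every key except `active` (`device`, `n_frames`, `video_name`) is a hypothetical placeholder.

```python
import json


def resolve_configs(template: dict, group_entries: list) -> list:
    """Return one merged config per active entry; entry values override the template."""
    resolved = []
    for entry in group_entries:
        if not entry.get("active", False):
            continue  # entries with active: false are skipped
        # The right-hand dict wins on key clashes, so the entry overrides the template.
        resolved.append({**template, **entry})
    return resolved


# Hypothetical values standing in for template.yaml and group_config.json.
template = {"device": "cuda:0", "n_frames": 16}
group_entries = json.loads("""
[
  {"active": true,  "video_name": "example_a", "device": "cuda:1"},
  {"active": false, "video_name": "example_b"}
]
""")

configs = resolve_configs(template, group_entries)
print(configs)
```

Only `example_a` survives, with its own `device` and the template's `n_frames`.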

Then you can run the following command to perform inference:

cd i2vgen-xl/scripts
bash run_group_ddim_inversion.sh
bash run_group_pnp_edit.sh

or run the following command using python:

cd i2vgen-xl/scripts

# First invert the latent of source video
python run_group_ddim_inversion.py \
--template_config "configs/group_ddim_inversion/template.yaml" \
--configs_json "configs/group_ddim_inversion/group_config.json"

# Then run the AnyV2V pipeline with the source video latent
python run_group_pnp_edit.py \
--template_config "configs/group_pnp_edit/template.yaml" \
--configs_json "configs/group_pnp_edit/group_config.json"
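The two stages must run in order, because the editing stage consumes the latent produced by the inversion stage. As a hedged sketch, the same two commands can be driven from Python. The helper names `build_commands` and `run_pipeline` are hypothetical. The script names and flags are copied from the commands above, and the relative paths assume the same working directory as those commands.

```python
import subprocess
import sys

# (script, config directory) pairs, in the order they must run.
STAGES = [
    ("run_group_ddim_inversion.py", "configs/group_ddim_inversion"),
    ("run_group_pnp_edit.py", "configs/group_pnp_edit"),
]


def build_commands(python=sys.executable):
    """Build the argument list for each stage without running anything."""
    return [
        [
            python, script,
            "--template_config", f"{config_dir}/template.yaml",
            "--configs_json", f"{config_dir}/group_config.json",
        ]
        for script, config_dir in STAGES
    ]


def run_pipeline(runner=subprocess.run):
    """Run inversion, then editing; stop at the first failing stage."""
    for cmd in build_commands():
        # check=True makes subprocess.run raise CalledProcessError on a non-zero exit.
        runner(cmd, check=True)


# Print the commands only; call run_pipeline() from the appropriate directory to execute them.
for cmd in build_commands():
    print(" ".join(cmd))
```

Stopping at the first failure avoids running the editing stage against a missing or stale latent.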

To edit your own source videos, follow the steps outlined below:

  1. Prepare the source video Your-Video.mp4 in the demo folder.
  2. Create two new folders demo/Your-Video-Name and demo/Your-Video-Name/edited_first_frame.
  3. Run the following command to perform first frame image editing:
python edit_image.py --video_path "./demo/Your-Video.mp4" --input_dir "./demo" --output_dir "./demo/Your-Video-Name/edited_first_frame" --prompt "Your prompt"

You can also use any other image editing method, such as InstantID, AnyDoor, or WISE, to edit the first frame. Please put the edited first frame images in the demo/Your-Video-Name/edited_first_frame folder.

  4. Add an entry to the group_config.json files located in i2vgen-xl/configs/group_ddim_inversion and i2vgen-xl/configs/group_pnp_edit directories for your video, following the provided examples.
  5. Run the inference command:
cd i2vgen-xl/scripts
bash run_group_ddim_inversion.sh
bash run_group_pnp_edit.sh
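The folder layout from step 2 can be created with a few lines of Python. This is a minimal sketch with a hypothetical helper name, `prepare_video_folders`. It assumes the folder is named after the video file without its extension, as in the "Man Walking" example elsewhere in this README.

```python
import tempfile
from pathlib import Path


def prepare_video_folders(demo_dir, video_file):
    """Create <demo_dir>/<video name>/edited_first_frame and return its path."""
    video_name = Path(video_file).stem  # "Your-Video.mp4" -> "Your-Video"
    edited_dir = Path(demo_dir) / video_name / "edited_first_frame"
    # parents=True also creates <demo_dir>/<video name>; exist_ok=True allows reruns.
    edited_dir.mkdir(parents=True, exist_ok=True)
    return edited_dir


# Demonstrated in a temporary directory so nothing in a real demo folder is touched.
with tempfile.TemporaryDirectory() as tmp:
    out = prepare_video_folders(tmp, "Your-Video.mp4")
    print(out.relative_to(tmp))
```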

โ–ถ๏ธ Quick Start for AnyV2V(seine)

Please refer to ./seine/README.md

โ–ถ๏ธ Misc

First Frame Image Edit

We provide an InstructPix2Pix port for image editing with an instruction prompt.

usage: edit_image.py [-h] [--model {magicbrush,instructpix2pix}]
                     [--video_path VIDEO_PATH] [--input_dir INPUT_DIR]
                     [--output_dir OUTPUT_DIR] [--prompt PROMPT] [--force_512]
                     [--dict_file DICT_FILE] [--seed SEED]
                     [--negative_prompt NEGATIVE_PROMPT]

Process some images.

optional arguments:
  -h, --help            show this help message and exit
  --model {magicbrush,instructpix2pix}
                        Name of the image editing model
  --video_path VIDEO_PATH
                        Name of the video
  --input_dir INPUT_DIR
                        Directory containing the video
  --output_dir OUTPUT_DIR
                        Directory to save the processed images
  --prompt PROMPT       Instruction prompt for editing
  --force_512           Force resize to 512x512 when feeding into image model
  --dict_file DICT_FILE
                        JSON file containing files, instructions etc.
  --seed SEED           Seed for random number generator
  --negative_prompt NEGATIVE_PROMPT
                        Negative prompt for editing

Usage Example:

python edit_image.py --video_path "./demo/Man Walking.mp4" --input_dir "./demo" --output_dir "./demo/Man Walking/edited_first_frame" --prompt "turn the man into darth vader"
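When the video name contains spaces, as in "Man Walking.mp4", building the command as an argument list avoids quoting mistakes. The sketch below uses only flags listed in the help text above. The helper `edit_image_command` is hypothetical, and deriving the output directory from the video name is an assumption based on the usage example.

```python
import shlex
from pathlib import PurePosixPath


def edit_image_command(video_path, prompt, model=None, seed=None):
    """Assemble an edit_image.py invocation as an argument list."""
    video = PurePosixPath(video_path)
    cmd = [
        "python", "edit_image.py",
        "--video_path", str(video),
        "--input_dir", str(video.parent),
        # Assumed layout: <input_dir>/<video name>/edited_first_frame
        "--output_dir", str(video.parent / video.stem / "edited_first_frame"),
        "--prompt", prompt,
    ]
    if model is not None:
        cmd += ["--model", model]  # magicbrush or instructpix2pix
    if seed is not None:
        cmd += ["--seed", str(seed)]
    return cmd


cmd = edit_image_command("demo/Man Walking.mp4", "turn the man into darth vader", seed=0)
# shlex.join quotes arguments containing spaces, so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)` without any shell quoting.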

You can also use other image models for editing. Here are some online demo models that you can use:

Video Preprocess Script

Because the current I2V models only support 2-second videos (16 frames), we provide a script to trim and crop a video into a 2-second clip of any dimension.
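A 2-second clip of 16 frames corresponds to an effective rate of 8 frames per second. As an illustration only, the sketch below picks 16 evenly spaced frame indices from a source video with a higher frame rate. The function `sample_frame_indices` is hypothetical and does not describe how prepare_video.py actually samples frames.

```python
def sample_frame_indices(source_fps, start_time=0.0, clip_duration=2.0, n_frames=16):
    """Return n_frames source-frame indices spread evenly over the clip window."""
    step = clip_duration / n_frames  # seconds between sampled frames (0.125 s here)
    return [round((start_time + i * step) * source_fps) for i in range(n_frames)]


# A 24 fps source, starting 1 second in: every third source frame is kept.
indices = sample_frame_indices(source_fps=24.0, start_time=1.0)
print(len(indices), indices[0], indices[-1])  # 16 24 69
```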

usage: prepare_video.py [-h] [--input_folder INPUT_FOLDER] [--video_path VIDEO_PATH] [--output_folder OUTPUT_FOLDER]
                        [--clip_duration CLIP_DURATION] [--width WIDTH] [--height HEIGHT] [--start_time START_TIME] [--end_time END_TIME]
                        [--n_frames N_FRAMES] [--center_crop] [--x_offset X_OFFSET] [--y_offset Y_OFFSET] [--longest_to_width]

Crop and resize video segments.

optional arguments:
  -h, --help            show this help message and exit
  --input_folder INPUT_FOLDER
                        Path to the input folder containing video files
  --video_path VIDEO_PATH
                        Path to the input video file
  --output_folder OUTPUT_FOLDER
                        Path to the folder for the output videos
  --clip_duration CLIP_DURATION
                        Duration of the video clips in seconds default=2
  --width WIDTH         Width of the output video (optional) default=512
  --height HEIGHT       Height of the output video (optional) default=512
  --start_time START_TIME
                        Start time for cropping (optional)
  --end_time END_TIME   End time for cropping (optional)
  --n_frames N_FRAMES   Number of frames to extract from each video
  --center_crop         Center crop the video
  --x_offset X_OFFSET   Horizontal offset for center cropping, range -1 to 1 (optional)
  --y_offset Y_OFFSET   Vertical offset for center cropping, range -1 to 1 (optional)
  --longest_to_width    Resize the longest dimension to the specified width

Usage Example:

python prepare_video.py --input_folder src_center_crop/ --output_folder processed --start_time 1 --center_crop --x_offset 0 --y_offset 0
python prepare_video.py --input_folder src_left_crop/ --output_folder processed --start_time 1 --center_crop --x_offset -1 --y_offset 0
python prepare_video.py --input_folder src_right_crop/ --output_folder processed --start_time 1 --center_crop --x_offset 1 --y_offset 0
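The examples above pair an offset of -1 with a left crop, 0 with a center crop, and 1 with a right crop. One way to read the -1 to 1 range is as a linear position of the crop window inside the available slack. The sketch below works under that assumption. The function `crop_origin` is hypothetical and is not taken from prepare_video.py.

```python
def crop_origin(frame_size, crop_size, offset):
    """Map an offset in [-1, 1] to the crop window's starting pixel along one axis."""
    if not -1.0 <= offset <= 1.0:
        raise ValueError("offset must be between -1 and 1")
    slack = frame_size - crop_size  # pixels left over once the crop is placed
    if slack < 0:
        raise ValueError("crop is larger than the frame")
    # -1 -> 0 (flush left/top), 0 -> slack / 2 (centered), 1 -> slack (flush right/bottom)
    return round(slack * (offset + 1.0) / 2.0)


# Cropping a 1080-pixel-wide window out of a 1920-pixel-wide frame.
for x_offset in (-1, 0, 1):
    print(x_offset, crop_origin(1920, 1080, x_offset))
```

The same function applies to the vertical axis with the frame height and `--y_offset`.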

📋 TODO

AnyV2V(i2vgen-xl)

  • Release the code for AnyV2V(i2vgen-xl)
  • Release a notebook demo
  • Release scripts for multiple image editing
  • Release a Gradio demo

AnyV2V(SEINE)

  • Release the code for AnyV2V(SEINE)

AnyV2V(ConsistI2V)

  • Release the code for AnyV2V(ConsistI2V)

Misc

  • Helper script to preprocess the source video
  • Helper script to obtain edited first frame from the source video

๐Ÿ–Š๏ธ Citation

Please kindly cite our paper if you use our code, data, models or results:

@article{ku2024anyv2v,
  title={AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks},
  author={Ku, Max and Wei, Cong and Ren, Weiming and Yang, Harry and Chen, Wenhu},
  journal={arXiv preprint arXiv:2403.14468},
  year={2024}
}

🎫 License

This project is released under the MIT License. However, our code is based on some projects that might use another license:

โญ Star History

Star History Chart

📞 Contact Authors

Max Ku @vinesmsuic, [email protected]
Cong Wei @lim142857, [email protected]
Weiming Ren @wren93, [email protected]

💞 Acknowledgements

The code is built upon the repositories below. We thank all the contributors for open-sourcing their work.
