
SLR-SFS

[ICCV 2023 Oral] Simulating Fluids in Real-World Still Images

Code release for the paper Simulating Fluids in Real-World Still Images

Authors: Siming Fan, Jingtan Piao, Chen Qian, Kwan-Yee Lin, Hongsheng Li.

[Paper] [Project Page] [Demo Video]

Our SLR sample (Still Input Image | Animated Video(480x256)):

Our SFS sample (Still Input Image | Animated Video(480x256)):

Introduction

In this work, we tackle the problem of real-world fluid animation from a still image. We propose a new learnable representation, the surface-based layered representation (SLR), which decomposes the fluid and the static objects in the scene to better synthesize animated videos from a single fluid image, and we design a surface-only fluid simulation (SFS) to model the evolution of the image fluids with better visual effects.

For more details of SLR-SFS, please refer to our paper and project page.

News

  • [14/07/2023] Our paper has been accepted by ICCV 2023!
  • [10/06/2022] Code and pretrained model of the Motion Regressor (from a single image and sparse hints) released.
  • [04/05/2022] Colab updated. Huggingface will be updated soon.
  • [26/04/2022] Technical report, code, CLAW testset released.
  • [01/04/2022] Project page is created.

Web Demo

We provide a Colab demo that lets you synthesize videos under ground-truth motion and apply editing effects. Motion regression from a single image is not yet supported in this version and will be added soon.

Data preparation

For evaluation:

Download our CLAW test set (314.8MB) (Description) and put it in SLR-SFS/data/CLAW/test

For training:

Download eulerian_data(Everything, 42.9GB) (Description)

Generate the mean video of the ground truth for background training (requires 4.8GB):

cd data
python average_gt_video.py
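The script above is not reproduced here, but the underlying operation is a per-pixel temporal mean over each ground-truth clip. A minimal numpy sketch of that idea (the function name is ours, not from the repo, and the real script additionally decodes the frames, e.g. with OpenCV):

```python
import numpy as np

def average_frames(frames):
    """Per-pixel temporal mean of a clip.

    frames: sequence of HxWx3 uint8 frames decoded from one gt.mp4 clip.
    Returns an HxWx3 uint8 mean image, used for background training.
    """
    stack = np.stack(frames).astype(np.float64)  # T x H x W x 3
    mean = stack.mean(axis=0)                    # average over time
    return np.clip(np.round(mean), 0, 255).astype(np.uint8)
```

In the actual pipeline this mean image is saved per video under avr_image (see the directory tree below).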

Download our label (1MB) (Description) and put it in SLR-SFS/eulerian_data/fluid_region_rock_labels/all. Make sure opt.rock_label_data_path = "data/eulerian_data/fluid_region_rock_labels/all" in options/options.py points to the label files; a scene without a label file is treated as having no rock in the moving region. You can verify this in TensorBoard.
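To sanity-check which scenes will be treated as containing rocks, you can list the scenes that have a matching label file. A hypothetical helper (names and the `.png` naming convention are our assumptions; adjust to the actual file names in the downloaded label archive):

```python
import os

def scenes_with_rock_labels(scene_ids, label_dir):
    """Return the subset of scene ids that have a rock-label file.

    A scene with no file under label_dir is treated as having no rock
    in the moving region, matching the behaviour described above.
    """
    return [s for s in scene_ids
            if os.path.exists(os.path.join(label_dir, s + ".png"))]
```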

SLR-SFS
├── data
│   ├── eulerian_data
│   │   ├── train
│   │   ├── validation
│   │   ├── imageset_shallow.npy # list of videos containing transparent fluid in train/*.gt.mp4
│   │   │── align*.json # speed align information for gt motion in validation/*_motion.pth to avoid large invalid pixels after warping
│   │   ├── avr_image # containing mean image of each video in train/*.gt.mp4  
│   ├── eulerian_data.py #(for baseline training)
│   ├── eulerian_data_balanced1_mask.py #(for SLR training)
│   ├── eulerian_data_bg.py #(for BG training)
│   ├── CLAW # mentioned in paper
│   │   ├── test
│   │   ├── align*.json
│   ├── CLAWv2 # newly collected test set with higher resolution and more diverse scenes (fountains, oceans, beaches, mists, etc., not included in CLAW_data)
│   │   ├── test
│   │   ├── align*.json
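Before training or evaluation, it can help to verify that the layout above is in place. A small sketch (paths taken from the tree above; the helper name is ours):

```python
import os

# Key paths from the directory tree above, relative to the SLR-SFS root.
EXPECTED = [
    "data/eulerian_data/train",
    "data/eulerian_data/validation",
    "data/eulerian_data/imageset_shallow.npy",
    "data/eulerian_data/avr_image",
    "data/CLAW/test",
]

def missing_paths(root, expected=EXPECTED):
    """Return the expected data paths that are absent under root."""
    return [p for p in expected if not os.path.exists(os.path.join(root, p))]
```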

Setup

conda create -n SLR python=3.9 -y
conda activate SLR
conda install -c pytorch pytorch==1.10.0 torchvision #pytorch/linux-64::pytorch-1.10.0-py3.9_cuda11.3_cudnn8.2.0_0

pip install tqdm opencv-python py-lz4framed matplotlib 
conda install cupy -c conda-forge
pip install lpips  # for evaluation
pip install av tensorboardX tensorboard # for training

Inference with GT Motion

Download our pretrained models mentioned in Tables 1 and 2 of the paper:

| Model | LPIPS on CLAW (All; Fluid) | Description |
|---|---|---|
| baseline2 | 0.2078; 0.2041 | Modified Holynski (baseline): 100 epochs + 50 epochs at a lower learning rate |
| Ours_stage_1 | 0.2143; 0.2100 | Ours (stage 1): 100 epochs |
| Ours_stage_2 | 0.2411; 0.2294 | Background only, used in Ours (stage 2): 100 epochs; also used as background initialization for the stage-3 training of Ours_v1 |
| Ours_v1 | 0.2040; 0.1975 | Ours: 100 epochs baseline2 (stage 1) + 100 epochs BG (stage 2) + 50 epochs Ours at a lower learning rate (stage 3) |
| Ours_v1_ProjectPage | 0.2060; 0.1992 | Selected with the best total loss (mainly perceptual loss and mask loss) on the eulerian_data validation set, whereas the previous models are selected with the best perceptual loss. This pretrained model can be used to reproduce the results on our project page; its decomposition results are slightly better than Ours_v1. |

1. For evaluation under aligned gt motion:

# For our v1 model, 60 frames, gt motion(Table 1,2 in the paper)
bash test_animating/CLAW/test_v1.sh
bash evaluation/eval_animating_CLAW.sh
# For baseline2 model, 60 frames, gt motion(Table 1,2 in the paper)
bash test_animating/CLAW/test_baseline2.sh
bash evaluation/eval_animating_CLAW.sh
## You can also use sbatch script test_animating/test_sbatch_2.sh
## For eulerian_data validation set, use the script in test_animating/eulerian_data

2. You can also use aligned gt motion to avoid large holes for better animation:

bash test_animating/CLAW/test_v1_align.sh
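The align*.json files in the data tree store per-scene speed-alignment information; conceptually, aligning the gt motion rescales the flow so that warped pixels stay inside the frame. A minimal sketch of that idea (the actual file format and loading code in the repo may differ):

```python
import numpy as np

def align_motion(flow, scale):
    """Rescale a dense motion field by a per-scene alignment factor.

    flow: HxWx2 array of (dx, dy) displacements (the gt motion).
    scale: scalar alignment factor; values < 1 shrink displacements,
    reducing large holes/invalid pixels after forward warping.
    """
    return flow * float(scale)
```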

Results will match the sample animation shown above.

3. Run at a smaller resolution by replacing 768 with 256 in test_animating/CLAW/test_v1_align.sh, etc.

Inference with Sparse Hint and Mask

| Model | LPIPS on CLAW (All; Fluid) | Description |
|---|---|---|
| motion2 | - | Controllable motion: 200 epochs |
| baseline2+motion2 | ongoing | Modified Holynski (baseline): 100 epochs + controllable motion: 200 epochs |
| baseline2+motion2+fixedMotionFinetune | ongoing | Modified Holynski (baseline): 100 epochs + controllable motion: 200 epochs + fixed motion with fluid finetuning: 50 epochs |

For evaluation under 5 sparse hints and a mask from the gt motion:

bash test_animating/CLAW/test_baseline_motion.sh
bash evaluation/eval_animating_CLAW.sh

Training

1. To train the baseline model under gt motion, run the following scripts:

# For baseline training
bash train_animating_scripts/train_baseline1.sh
# For baseline2 training (w/ pconv)
bash train_animating_scripts/train_baseline2_pconv.sh

Note: please refer to "Animating Pictures with Eulerian Motion Fields" for more information.

2. To train our SLR model under gt motion, run the following scripts:

# Firstly, train Surface Fluid Layer for 100 epochs
bash train_animating_scripts/train_baseline2_pconv.sh
# Secondly, generate "mean video" and train Background Layer for 100 epochs
bash train_animating_scripts/train_bg.sh
# Lastly, unzip the label file to the proper directory, then train alpha and finetune the fluid and background layers
bash train_alpha_finetuneBG_finetuneFluid_v1.sh
(Note: check TensorBoard to verify that your ground-truth alpha is correct.)

3. To train motion, run the following scripts:

# For controllable motion training with motion GAN
bash train_animating_scripts/train_motion_scripts/train_motion_EPE_MotionGAN.sh

Note: please refer to "Controllable Animation of Fluid Elements in Still Images" for more information.

4. To finetune the baseline model, run the following scripts:

# First, run the stage 1 to train baseline fluid
bash train_animating_scripts/train_baseline2_pconv.sh
# Second, run the stage 2 to train motion
bash train_animating_scripts/train_motion_scripts/train_motion_EPE_MotionGAN.sh
# Finally, fix the motion estimation and finetune the fluid
bash train_animating_scripts/train_animating_fixedMotion_finetuneFluid_IGANonly.sh

You can use tensorboard to check the training in the logging directory.

ToDo list

  • pretrained model and code of our reproduction of Holynski's method (w/o motion estimation)
  • pretrained model and code of SLR
  • pretrained model and code of motion estimation from single image
  • CLAWv2 testset
  • Simple UI for Motion Editing
  • code of SFS

Citation

If you find this work useful in your research, please consider citing:

@article{fan2022SLR,
  author    = {Fan, Siming and Piao, Jingtan and Qian, Chen and Lin, Kwan-Yee and Li, Hongsheng},
  title     = {Simulating Fluids in Real-World Still Images},
  journal   = {arXiv preprint},
  volume    = {arXiv:2204.11335},
  year      = {2022},
}
