
SLR-SFS

[ICCV 2023 Oral] Simulating Fluids in Real-World Still Images

Code release for the paper Simulating Fluids in Real-World Still Images

Authors: Siming Fan, Jingtan Piao, Chen Qian, Kwan-Yee Lin, Hongsheng Li.

[Paper] [Project Page] [Demo Video]

Our SLR sample (Still Input Image | Animated Video(480x256)):

Our SFS sample (Still Input Image | Animated Video(480x256)):

Introduction

In this work, we tackle the problem of real-world fluid animation from a still image. We propose a new learnable representation, the surface-based layered representation (SLR), which decomposes the fluid and the static objects in the scene to better synthesize animated videos from a single fluid image, and we design a surface-only fluid simulation (SFS) to model the evolution of the image fluids with better visual effects.

For more details of SLR-SFS, please refer to our paper and project page.

News

  • [14/07/2023] Our paper has been accepted by ICCV 2023!
  • [10/06/2022] Code and pretrained model of the Motion Regressor (from a single image and sparse hints) released.
  • [04/05/2022] Colab updated. Huggingface will be updated soon.
  • [26/04/2022] Technical report, code, CLAW testset released.
  • [01/04/2022] Project page is created.

Web Demo

We provide a Colab demo that lets you synthesize videos under ground-truth motion and apply editing effects. Motion regression from a single image is not yet supported in this version and will be added soon.

Data preparation

For evaluation:

Download our CLAW test set (314.8MB) (Description) and put it in SLR-SFS/data/CLAW/test

For training:

Download eulerian_data(Everything, 42.9GB) (Description)

Generate the mean video of the ground truth for background training (requires 4.8GB):

cd data
python average_gt_video.py
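The script above is not reproduced here, but the underlying operation is a per-pixel temporal mean over each ground-truth clip. A minimal numpy sketch of that idea (the function name is ours, not from the repo, and the real script additionally decodes the frames, e.g. with OpenCV):

```python
import numpy as np

def average_frames(frames):
    """Per-pixel temporal mean of a clip.

    frames: sequence of HxWx3 uint8 frames decoded from one gt.mp4 clip.
    Returns an HxWx3 uint8 mean image, used for background training.
    """
    stack = np.stack(frames).astype(np.float64)  # T x H x W x 3
    mean = stack.mean(axis=0)                    # average over time
    return np.clip(np.round(mean), 0, 255).astype(np.uint8)
```

In the actual pipeline this mean image is saved per video under avr_image (see the directory tree below).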

Download our label (1MB) (Description) and put it in SLR-SFS/eulerian_data/fluid_region_rock_labels/all. Make sure opt.rock_label_data_path = "data/eulerian_data/fluid_region_rock_labels/all" in options/options.py points to the label files; a scene without a label file is treated as having no rock in the moving region. You can verify this in TensorBoard.
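To sanity-check which scenes will be treated as containing rocks, you can list the scenes that have a matching label file. A hypothetical helper (names and the `.png` naming convention are our assumptions; adjust to the actual file names in the downloaded label archive):

```python
import os

def scenes_with_rock_labels(scene_ids, label_dir):
    """Return the subset of scene ids that have a rock-label file.

    A scene with no file under label_dir is treated as having no rock
    in the moving region, matching the behaviour described above.
    """
    return [s for s in scene_ids
            if os.path.exists(os.path.join(label_dir, s + ".png"))]
```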

SLR-SFS
├── data
│   ├── eulerian_data
│   │   ├── train
│   │   ├── validation
│   │   ├── imageset_shallow.npy # list of videos containing transparent fluid in train/*.gt.mp4
│   │   │── align*.json # speed align information for gt motion in validation/*_motion.pth to avoid large invalid pixels after warping
│   │   ├── avr_image # containing mean image of each video in train/*.gt.mp4  
│   ├── eulerian_data.py #(for baseline training)
│   ├── eulerian_data_balanced1_mask.py #(for SLR training)
│   ├── eulerian_data_bg.py #(for BG training)
│   ├── CLAW # mentioned in paper
│   │   ├── test
│   │   ├── align*.json
│   ├── CLAWv2 # newly collected test set with higher resolution and more diverse scenes (fountains, oceans, beaches, mists, etc., not included in CLAW_data)
│   │   ├── test
│   │   ├── align*.json
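Before training or evaluation, it can help to verify that the layout above is in place. A small sketch (paths taken from the tree above; the helper name is ours):

```python
import os

# Key paths from the directory tree above, relative to the SLR-SFS root.
EXPECTED = [
    "data/eulerian_data/train",
    "data/eulerian_data/validation",
    "data/eulerian_data/imageset_shallow.npy",
    "data/eulerian_data/avr_image",
    "data/CLAW/test",
]

def missing_paths(root, expected=EXPECTED):
    """Return the expected data paths that are absent under root."""
    return [p for p in expected if not os.path.exists(os.path.join(root, p))]
```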

Setup

conda create -n SLR python=3.9 -y
conda activate SLR
conda install -c pytorch pytorch==1.10.0 torchvision #pytorch/linux-64::pytorch-1.10.0-py3.9_cuda11.3_cudnn8.2.0_0

pip install tqdm opencv-python py-lz4framed matplotlib 
conda install cupy -c conda-forge
pip install lpips  # for evaluation
pip install av tensorboardX tensorboard # for training

Inference with GT Motion

Download our pretrained models mentioned in Tables 1 and 2 of the paper:

| Model | LPIPS on CLAW (All; Fluid) | Description |
|---|---|---|
| baseline2 | 0.2078; 0.2041 | Modified Holynski (baseline): 100 epochs + 50 epochs at a lower learning rate |
| Ours_stage_1 | 0.2143; 0.2100 | Ours (stage 1): 100 epochs |
| Ours_stage_2 | 0.2411; 0.2294 | Background only, used in Ours (stage 2): 100 epochs; also used as background initialization for the stage-3 training of Ours_v1 |
| Ours_v1 | 0.2040; 0.1975 | Ours: 100 epochs baseline2 (stage 1) + 100 epochs BG (stage 2) + 50 epochs Ours at a lower learning rate (stage 3) |
| Ours_v1_ProjectPage | 0.2060; 0.1992 | Selected with the best total loss (mainly perceptual loss and mask loss) on the eulerian_data validation set, whereas the previous models are selected with the best perceptual loss. This pretrained model can be used to reproduce the results on our project page; its decomposition results are slightly better than Ours_v1. |

1. For evaluation under aligned gt motion:

# For our v1 model, 60 frames, gt motion(Table 1,2 in the paper)
bash test_animating/CLAW/test_v1.sh
bash evaluation/eval_animating_CLAW.sh
# For baseline2 model, 60 frames, gt motion(Table 1,2 in the paper)
bash test_animating/CLAW/test_baseline2.sh
bash evaluation/eval_animating_CLAW.sh
## You can also use sbatch script test_animating/test_sbatch_2.sh
## For eulerian_data validation set, use the script in test_animating/eulerian_data

2. You can also use aligned gt motion to avoid large holes for better animation:

bash test_animating/CLAW/test_v1_align.sh
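The align*.json files in the data tree store per-scene speed-alignment information; conceptually, aligning the gt motion rescales the flow so that warped pixels stay inside the frame. A minimal sketch of that idea (the actual file format and loading code in the repo may differ):

```python
import numpy as np

def align_motion(flow, scale):
    """Rescale a dense motion field by a per-scene alignment factor.

    flow: HxWx2 array of (dx, dy) displacements (the gt motion).
    scale: scalar alignment factor; values < 1 shrink displacements,
    reducing large holes/invalid pixels after forward warping.
    """
    return flow * float(scale)
```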

Results will match the sample animation shown above.

3. Run at a smaller resolution by replacing 768 with 256 in test_animating/CLAW/test_v1_align.sh, etc.

Inference with Sparse Hint and Mask

| Model | LPIPS on CLAW (All; Fluid) | Description |
|---|---|---|
| motion2 | - | Controllable motion: 200 epochs |
| baseline2+motion2 | ongoing | Modified Holynski (baseline): 100 epochs + controllable motion: 200 epochs |
| baseline2+motion2+fixedMotionFinetune | ongoing | Modified Holynski (baseline): 100 epochs + controllable motion: 200 epochs + fixed motion with fluid finetuning: 50 epochs |

For evaluation under 5 sparse hints and a mask from the gt motion:

bash test_animating/CLAW/test_baseline_motion.sh
bash evaluation/eval_animating_CLAW.sh

Training

1. To train the baseline model under gt motion, run the following scripts:

# For baseline training
bash train_animating_scripts/train_baseline1.sh
# For baseline2 training (w/ pconv)
bash train_animating_scripts/train_baseline2_pconv.sh

Note: please refer to "Animating Pictures with Eulerian Motion Fields" for more information.

2. To train our SLR model under gt motion, run the following scripts:

# Firstly, train Surface Fluid Layer for 100 epochs
bash train_animating_scripts/train_baseline2_pconv.sh
# Secondly, generate "mean video" and train Background Layer for 100 epochs
bash train_animating_scripts/train_bg.sh
# Lastly, unzip the label file to the proper directory, then train alpha and finetune the fluid and background layers
bash train_alpha_finetuneBG_finetuneFluid_v1.sh
(Note: check TensorBoard to verify that your ground-truth alpha is correct.)

3. To train motion, run the following scripts:

# For controllable motion training with motion GAN
bash train_animating_scripts/train_motion_scripts/train_motion_EPE_MotionGAN.sh

Note: please refer to "Controllable Animation of Fluid Elements in Still Images" for more information.

4. To finetune the baseline model, run the following scripts:

# First, run the stage 1 to train baseline fluid
bash train_animating_scripts/train_baseline2_pconv.sh
# Second, run the stage 2 to train motion
bash train_animating_scripts/train_motion_scripts/train_motion_EPE_MotionGAN.sh
# Finally, fix the motion estimation and finetune the fluid
bash train_animating_scripts/train_animating_fixedMotion_finetuneFluid_IGANonly.sh

You can use tensorboard to check the training in the logging directory.

ToDo list

  • pretrained model and code of our reproduction of Holynski's method (w/o motion estimation)
  • pretrained model and code of SLR
  • pretrained model and code of motion estimation from single image
  • CLAWv2 testset
  • Simple UI for Motion Editing
  • code of SFS

Citation

If you find this work useful in your research, please consider citing:

@article{fan2022SLR,
  author    = {Fan, Siming and Piao, Jingtan and Qian, Chen and Lin, Kwan-Yee and Li, Hongsheng},
  title     = {Simulating Fluids in Real-World Still Images},
  journal   = {arXiv preprint},
  volume    = {arXiv:2204.11335},
  year      = {2022},
}
