
background-matting's Introduction

Background Matting: The World is Your Green Screen


By Soumyadip Sengupta, Vivek Jayaram, Brian Curless, Steve Seitz, and Ira Kemelmacher-Shlizerman

This paper will be presented at IEEE CVPR 2020.

Go to Project page for additional details and results.

We recently released a brand new background matting project with much better quality and REAL-TIME performance (30 fps at 4K and 60 fps at FHD)! You can now use it with Zoom. We tested this on a Linux machine with a GPU.

Check out the code!

Project members

Acknowledgement: Andrey Ryabtsev, University of Washington

License

This work is licensed under the Creative Commons Attribution NonCommercial ShareAlike 4.0 License.

Summary

Updates

April 21, 2020:

April 20, 2020

April 9, 2020

  • Issues:
    • Updated the alignment function in the pre-processing code. The Python version uses AKAZE features (SIFT and SURF are not available with OpenCV 3); the MATLAB version, also provided, uses SURF features.
  • New features:

April 8, 2020

  • Issues:
    • Turned off adjustExposure() for bias-gain correction in test_pre_process.py. (A bug was found and needs to be fixed.)
    • Incorporated an 'uncropping' operation in test_background-matting_image.py. (The output now has the same resolution and aspect ratio as the input.)

Getting Started

Clone repository:

git clone https://github.com/senguptaumd/Background-Matting.git

Please use Python 3. Create an Anaconda environment and install the dependencies. Our code is tested with PyTorch 1.1.0 and TensorFlow 1.14 on CUDA 10.0.

conda create --name back-matting python=3.6
conda activate back-matting

Make sure CUDA 10.0 is your default CUDA. If CUDA 10.0 is installed in /usr/local/cuda-10.0, run:

export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64
export PATH=$PATH:/usr/local/cuda-10.0/bin

Install PyTorch, TensorFlow (needed for segmentation) and the remaining dependencies:

conda install pytorch=1.1.0 torchvision cudatoolkit=10.0 -c pytorch
pip install tensorflow-gpu==1.14.0
pip install -r requirements.txt

Note: The code is likely to work with other PyTorch and TensorFlow versions compatible with your system CUDA. If you already have a working environment with PyTorch and TensorFlow, install only the remaining dependencies with pip install -r requirements.txt. If our code fails due to version differences, install the specific CUDA, PyTorch and TensorFlow versions listed above.
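
Before running inference, you can optionally confirm that both frameworks see the GPU. This small check is not part of the repository; it only prints versions and device availability:

# Optional environment check (hypothetical helper, not part of the repository).
import torch
import tensorflow as tf

print("PyTorch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("TensorFlow", tf.__version__, "| GPU available:", tf.test.is_gpu_available())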

Run the inference code on sample images

Data

To perform Background Matting based green-screening, you need to capture:

  • (a) Image with the subject (use _img.png extension)
  • (b) Image of the background without the subject (use _back.png extension)
  • (c) Target background to insert the subject (place in data/background)

Use the sample_data/ folder for testing and prepare your own data based on it. This data was captured with a hand-held camera.
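
The naming convention matters because the matting scripts pair files by these suffixes. As a quick sanity check (a hypothetical helper, not part of the repository), you can verify that every *_img.png has a matching *_back.png:

# Check that each captured image has a matching background image.
import glob
import os

def check_input_folder(folder="sample_data/input"):
    for img_path in sorted(glob.glob(os.path.join(folder, "*_img.png"))):
        back_path = img_path.replace("_img.png", "_back.png")
        if not os.path.exists(back_path):
            print("Missing background image for", os.path.basename(img_path))

check_input_folder()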

Pre-trained model

Please download the pre-trained models from Google Drive and place Models/ folder inside Background-Matting/.

Note: the syn-comp-adobe-trainset model was trained on the training set of the Adobe dataset. This is the model used for numerical evaluation on the Adobe dataset.

Pre-processing

  1. Segmentation

Background Matting needs a segmentation mask for the subject. We use the TensorFlow version of DeepLabv3+.

cd Background-Matting/
git clone https://github.com/tensorflow/models.git
cd models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
cd ../..
python test_segmentation_deeplab.py -i sample_data/input

You can replace Deeplabv3+ with any segmentation network of your choice. Save the segmentation results with extension _masksDL.png.
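
If you plug in a different segmentation network, the only requirement is that the mask ends up next to the input image with the _masksDL.png suffix. Below is a minimal sketch, assuming your network outputs an HxW array of class IDs with the PASCAL VOC labeling used by DeepLab (person = class 15); adjust it for your own label map:

# Hypothetical helper: save a binary person mask where the matting code expects it.
import cv2
import numpy as np

def save_person_mask(seg, img_path, person_class=15):
    mask = (seg == person_class).astype(np.uint8) * 255      # binary 0/255 mask
    out_path = img_path.replace("_img.png", "_masksDL.png")  # suffix the matting code expects
    cv2.imwrite(out_path, mask)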

  2. Alignment

Skip this step if your data is captured with a fixed camera.

  • For a hand-held camera, we need to align the background with the input image as part of pre-processing. We apply simple homography-based alignment.
  • We ask users to disable the camera's auto-focus and auto-exposure while capturing the pair of images. This can easily be done on iPhone cameras (tap and hold for a while).

Run python test_pre_process.py -i sample_data/input for pre-processing. It aligns the background image _back.png and adjusts its bias-gain to match the input image _img.png.

The Python code uses AKAZE features for alignment (SURF and SIFT are unavailable in OpenCV 3). We also provide alternate MATLAB code (test_pre_process.m), which uses SURF features and can visualize the feature matching and alignment. Bad alignment will produce bad matting output. Bias-gain adjustment is turned off in the Python code due to a bug, but it is present in the MATLAB code. If there are significant exposure changes between the captured image and the captured background, use bias-gain adjustment to account for them.

Feel free to write your own alignment code with your favorite feature detector, matcher and alignment method.
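
For reference, here is a minimal sketch of homography-based alignment with AKAZE features and RANSAC. The repository's own alignImages() in test_pre_process.py differs in its details, so treat this only as an illustration of the idea:

# Illustrative alignment sketch (not the repository's alignImages()).
import cv2
import numpy as np

def align_background(back, img):
    """Warp `back` into the frame of `img` using AKAZE features + RANSAC homography."""
    g_back = cv2.cvtColor(back, cv2.COLOR_BGR2GRAY)
    g_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    akaze = cv2.AKAZE_create()
    kp1, des1 = akaze.detectAndCompute(g_back, None)
    kp2, des2 = akaze.detectAndCompute(g_img, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # Lowe ratio test
    if len(good) < 4:
        raise RuntimeError("Not enough feature matches to estimate a homography")
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = img.shape[:2]
    return cv2.warpPerspective(back, H, (w, h))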

Background Matting

python test_background-matting_image.py -m real-hand-held -i sample_data/input/ -o sample_data/output/ -tb sample_data/background/0001.png

For images taken with a fixed camera (on a tripod), choose -m real-fixed-cam for best results. -m syn-comp-adobe uses the model trained only on the synthetic-composite Adobe dataset, without real data (worse performance).

Run the inference code on sample videos

This is almost the same as image inference, with a few small changes.

Data

To perform Background Matting based green-screening, you need to capture:

  • (a) Video with the subject (teaser.mov)
  • (b) Image of the background without the subject (teaser_back.png)
  • (c) Target background to insert the subject (target_back.mov)

We provide sample_video/ captured with hand-held camera and sample_video_fixed/ captured with fixed camera for testing. Please download the data and place both folders under Background-Matting. Prepare your own data based on that.

Pre-processing

  1. Frame extraction:
cd Background-Matting/sample_video
mkdir input background
ffmpeg -i teaser.mov input/%04d_img.png -hide_banner
ffmpeg -i target_back.mov background/%04d.png -hide_banner

Repeat the same for sample_video_fixed

  2. Segmentation
cd Background-Matting/models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
cd ../..
python test_segmentation_deeplab.py -i sample_video/input

Repeat the same for sample_video_fixed

  3. Alignment

No need to run alignment for sample_video_fixed or videos captured with a fixed camera.

Run python test_pre_process_video.py -i sample_video/input -v_name sample_video/teaser_back.png for pre-processing. Alternatively, you can use test_pre_process_video.m in MATLAB.

Background Matting

For hand-held videos, like sample_video:

python test_background-matting_image.py -m real-hand-held -i sample_video/input/ -o sample_video/output/ -tb sample_video/background/

For fixed-camera videos, like sample_video_fixed:

python test_background-matting_image.py -m real-fixed-cam -i sample_video_fixed/input/ -o sample_video_fixed/output/ -tb sample_video_fixed/background/ -b sample_video_fixed/teaser_back.png

To obtain the video from the output frames, run:

cd Background-Matting/sample_video
ffmpeg -r 60 -f image2 -i output/%04d_matte.png -vcodec libx264 -crf 15 -s 1280x720 -pix_fmt yuv420p teaser_matte.mp4
ffmpeg -r 60 -f image2 -i output/%04d_compose.png -vcodec libx264 -crf 15 -s 1280x720 -pix_fmt yuv420p teaser_compose.mp4

Repeat the same for sample_video_fixed.

Notes on capturing images

For best results capture images following these guidelines:

  • Choose a background that is mostly static; it can be indoor or outdoor.
  • Avoid casting any shadows of the subject on the background.
    • Place the subject at least a few feet away from the background.
    • If possible, adjust the lighting to avoid strong shadows on the background.
  • Avoid large color overlap between the subject and the background (e.g. do not wear a white shirt in front of a white wall).
  • Lock AE/AF (auto-exposure and auto-focus) of the camera.
  • For hand-held capture, you need to:
    • allow only small camera motion by continuing to hold the camera as the subject exits the scene;
    • avoid backgrounds that have two perpendicular planes (homography-based alignment will fail), or use a background that is very far away.
    • The above restrictions do not apply to images captured with a fixed camera (on a tripod).

Training on synthetic-composite Adobe dataset

Data

  • Download original Adobe matting dataset: Follow instructions.
  • Separate human images: Use test_data_list.txt and train_data_list.txt in Data_adobe to copy only the human subjects from the Adobe dataset. Create the folders fg_train, fg_test, mask_train, mask_test to hold the foregrounds and alpha mattes for the train and test data separately. (The train/test split is the same as in the original dataset.) You can run the following to accomplish this:
cd Data_adobe
./prepare.sh /path/to/adobe/Combined_Dataset
  • Download background images: Download MS-COCO images and place them in bg_train and bg_test.
  • Compose Adobe foregrounds onto COCO backgrounds for the train and test sets. This saves the composed result as _comp and the background as _back under merged_train and merged_test. It also creates a CSV used by the training dataloader. You can pass --workers 8 to use e.g. 8 threads; it uses only one by default. (A rough sketch of the compositing step is shown after the commands below.)
python compose.py --fg_path fg_train --mask_path mask_train --bg_path bg_train --out_path merged_train --out_csv Adobe_train_data.csv
python compose.py --fg_path fg_test --mask_path mask_test --bg_path bg_test --out_path merged_test
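
For reference, the compositing itself is just alpha blending, I = alpha*F + (1-alpha)*B. Here is a minimal sketch (a hypothetical helper, not the repository's compose.py) that blends one foreground onto one background:

# Illustrative compositing step: I = alpha*F + (1-alpha)*B.
import cv2
import numpy as np

def composite(fg_path, alpha_path, bg_path, out_path):
    fg = cv2.imread(fg_path).astype(np.float32)
    alpha = cv2.imread(alpha_path, cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0
    bg = cv2.imread(bg_path).astype(np.float32)
    bg = cv2.resize(bg, (fg.shape[1], fg.shape[0]))   # match the foreground size
    alpha = alpha[..., None]                          # HxWx1 so it broadcasts over BGR
    comp = alpha * fg + (1.0 - alpha) * bg
    cv2.imwrite(out_path, comp.astype(np.uint8))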

Training

Adjust the number of GPUs and the batch size depending on your platform. We trained the model with 512x512 input (-res flag).

CUDA_VISIBLE_DEVICES=0,1 python train_adobe.py -n Adobe_train -bs 4 -res 512

Notes:

  • 512x512 is the maximum input resolution we recommend for training.
  • If you decrease the training resolution to 256x256, change -res 256; we also recommend using fewer residual blocks: -n_blocks1 5 -n_blocks2 2.

Cheers to the unofficial Deep Image Matting repo.

Training on unlabeled real videos

Data

Please download our captured videos. We show next how to fine-tune the model on videos captured with a fixed camera. The process is similar for hand-held cameras, except that you will need to align the captured background image to each frame of the video separately. (Take a hint from test_pre_process.py and use alignImages().)

Data Pre-processing:

  • Extract frames for each video: ffmpeg -i $NAME.mp4 $NAME/%04d_img.png -hide_banner
  • Run segmentation (follow the instructions for DeepLabv3+): python test_segmentation_deeplab.py -i $NAME
  • Target background for composition: for self-supervised learning we need some target backgrounds that have roughly similar lighting to the original videos. Either capture a few videos of indoor/outdoor scenes without humans or use our captured backgrounds in the background folder.
  • Create a .csv file Video_data_train.csv with each row as: $image;$captured_back;$segmentation;$image+20frames;$image+2*20frames;$image+3*20frames;$image+4*20frames;$target_back. The process is automated by prepare_real.py: take a look inside and change background_path and path before running. (A rough sketch of the row generation is shown below.)
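
A rough sketch of the row generation follows; prepare_real.py in the repository is the authoritative version. This sketch assumes frames named %04d_img.png, segmentations named %04d_masksDL.png, and a single captured background and target background per video:

# Hypothetical sketch: write Video_data_train.csv rows in the format described above.
import csv

def write_rows(frame_ids, captured_back, target_back, out_csv="Video_data_train.csv"):
    ids = set(frame_ids)
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f, delimiter=";")
        for i in sorted(ids):
            # each row needs the current frame plus 4 future frames spaced 20 apart
            if any(i + k * 20 not in ids for k in range(1, 5)):
                continue
            writer.writerow(["%04d_img.png" % i,
                             captured_back,
                             "%04d_masksDL.png" % i,
                             "%04d_img.png" % (i + 20),
                             "%04d_img.png" % (i + 40),
                             "%04d_img.png" % (i + 60),
                             "%04d_img.png" % (i + 80),
                             target_back])

# e.g. write_rows(range(1, 301), "teaser_back.png", "background/0001.png")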

Training

Adjust the number of GPUs and the batch size depending on your platform. We trained the model with 512x512 input (-res flag).

CUDA_VISIBLE_DEVICES=0,1 python train_real_fixed.py -n Real_fixed -bs 4 -res 512 -init_model Models/syn-comp-adobe-trainset/net_epoch_64.pth

Dataset

We captured videos with both fixed and hand-held camera in indoor and outdoor settings. We release this data to encourage future research on improving background matting. The data is released for research purposes only.

Download data

Google Colab

Thanks to Andrey Ryabtsev for creating the Google Colab version for easy inference on images and videos of your choice.

Google Colab

Notes

We are eager to hear how our algorithm works on your images/videos. If the algorithm fails on your data, please feel free to share it with us at [email protected]. This will help us in improving our algorithm for future research. Also, feel free to share any cool results.

Citation

If you use this code for your research, please consider citing:

@InProceedings{BMSengupta20,
  title={Background Matting: The World is Your Green Screen},
  author = {Soumyadip Sengupta and Vivek Jayaram and Brian Curless and Steve Seitz and Ira Kemelmacher-Shlizerman},
  booktitle={Computer Vision and Pattern Recognition (CVPR)},
  year={2020}
}

Related Implementations

Microsoft Virtual Stage: Using our background matting technology along with depth sensing from Kinect, Microsoft open-sourced this amazing code for virtual staging. Follow this link for details of their technique.

Weights & Biases: Great presentation and detailed discussion with insights on pre-processing and training our model. Check out Two Minute Papers' take on our work.

Contributors

badbye, senguptaumd, vivjay30


background-matting's Issues

What does it mean to set α = 1 everywhere for GReal?

So GReal also outputs F and α, but we set α to all 1's... wouldn't that mean our composite is just the foreground (with a black background)? Is that what you mean by "would result in simply copying the entire input image into the composite passed to D"? I don't see why it is "indeed real".

Thank you once again! Really appreciate your replies :)

question about datasets

Hi guys!
I'm very curious about your datasets.
Can you release your datasets as soon as possible?
Thanks!

Why use your own backgrounds?

I read the Adobe paper about their dataset and also dropped them an email. To my understanding, they have 455 ground-truth foregrounds and alpha mattes that they composite onto 100 backgrounds each. I'm still awaiting their reply as to whether their dataset includes those 100 stand-alone backgrounds. Meanwhile, I would like to hear from you as to why you decided to make your own set of synthetic composites with backgrounds drawn from MS COCO. Thank you.

How to make this work in real time

Hi,
Can this support video communication at 1280x720 resolution and a frame rate above 20 fps? What work needs to be done to support that?

can it run from video to video?

Hi, I saw this interesting work on GitHub trending. Wonderful work.
From the sample data, it seems the background matting is picture to picture.
Can it run from video to video (input a video and output the target video)?

What's the point of M = {I,I,I,I}?

Were the effects of this choice studied for single images? From what I can see, it looks like M was kept for the sake of being able to use the model for both photos and video. Yet it seems this would introduce some bias, since the model would see 4 duplicate channels of the same image on top of the 3 color channels of the original image.

Error occurred while running the code on my video; are there limits on the video, or is my environment misconfigured?

!CUDA_VISIBLE_DEVICES=0 python test_background-matting_image.py -m real-fixed-cam -i sample_video_fixed/input/ -o sample_video_fixed/output/ -tb sample_video_fixed/background/ -b sample_video_fixed/teaser_back.png

CUDA Device: 0
Using video mode
Traceback (most recent call last):
File "test_background-matting_image.py", line 121, in
bbox=get_bbox(rcnn,R=bgr_img0.shape[0],C=bgr_img0.shape[1])
File "/content/drive/My Drive/background_matting/Background-Matting/functions.py", line 38, in get_bbox
x1, y1 = np.amin(where, axis=1)
File "<array_function internals>", line 6, in amin
File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 2746, in amin
keepdims=keepdims, initial=initial, where=where)
File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 90, in _wrapreduction
return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: zero-size array to reduction operation minimum which has no identity

Repeated downloading of the DeepLab model file

Every time I run test_segmentation_deeplab.py, it creates a new temporary folder and takes a long time to download the same model file.
Why not create a folder such as "deeplab_model" and reuse it?

An error when segmentation (DeepLab) and matting run together

Hi
I load the segmentation and matting models at the same time, hoping to get the segmentation image from the segmentation net and the matte from the matting net. But:

2020-05-25 11:06:13.415797: E tensorflow/stream_executor/cuda/cuda_dnn.cc:324] Loaded runtime CuDNN library: 7.1.2 but source was compiled with: 7.4.2. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2020-05-25 11:06:13.418751: E tensorflow/stream_executor/cuda/cuda_dnn.cc:324] Loaded runtime CuDNN library: 7.1.2 but source was compiled with: 7.4.2. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
Traceback (most recent call last):
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node MobilenetV2/Conv/Conv2D}}]]
[[{{node SemanticPredictions}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "demo.py", line 10, in
img_seg = P.pred(img)
File "/home/sundy/Background-Matting/pred_seg.py", line 162, in pred
res_im,seg = self.MODEL.run(image)
File "/home/sundy/Background-Matting/pred_seg.py", line 70, in run
feed_dict={self.INPUT_TENSOR_NAME: [np.asarray(resized_image)]})
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node MobilenetV2/Conv/Conv2D (defined at /home/sundy/Background-Matting/pred_seg.py:49) ]]
[[node SemanticPredictions (defined at /home/sundy/Background-Matting/pred_seg.py:49) ]]

Caused by op 'MobilenetV2/Conv/Conv2D', defined at:
File "demo.py", line 6, in
P = pred_seg()
File "/home/sundy/Background-Matting/pred_seg.py", line 154, in init
self.MODEL = DeepLabModel(download_path)
File "/home/sundy/Background-Matting/pred_seg.py", line 49, in init
tf.import_graph_def(graph_def, name='')
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/framework/importer.py", line 442, in import_graph_def
_ProcessNewOps(graph)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/framework/importer.py", line 235, in _ProcessNewOps
for new_op in graph._add_new_tf_operations(compute_devices=False): # pylint: disable=protected-access
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3433, in _add_new_tf_operations
for c_op in c_api_util.new_tf_operations(self)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3433, in
for c_op in c_api_util.new_tf_operations(self)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3325, in _create_op_from_tf_operation
ret = Operation(c_op, self)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1801, in init
self._traceback = tf_stack.extract_stack()

UnknownError (see above for traceback): Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node MobilenetV2/Conv/Conv2D (defined at /home/sundy/Background-Matting/pred_seg.py:49) ]]
[[node SemanticPredictions (defined at /home/sundy/Background-Matting/pred_seg.py:49) ]]

Shape mismatch after the April 8 update

CUDA Device: 0
Using image mode
Traceback (most recent call last):
File "test_background-matting_image.py", line 179, in
comp_im_tr1=composite4(fg_out0,back_img10,alpha_out0)
File "/content/drive/My Drive/Background-Matting/functions.py", line 8, in composite4
im = alpha * fg + (1 - alpha) * bg
ValueError: operands could not be broadcast together with shapes (1080,1920,1) (1080,1980,3)

background/0002.png is missing; 0001.png is OK.

cv Assertion failed

Hi, I tried to run my own footage but this came up when I ran:

python test_background-matting_image.py -m real-hand-held -i sample_video/input/ -o sample_video/output/ -tb sample_video/background/
Traceback (most recent call last):
  File "test_background-matting_image.py", line 85, in <module>
    bg_im0=cv2.imread(os.path.join(data_path, filename.replace('_img','_back'))); bg_im0=cv2.cvtColor(bg_im0,cv2.COLOR_BGR2RGB);
cv2.error: OpenCV(3.4.5) C:\projects\opencv-python\opencv\modules\imgproc\src\color.cpp:181: error: (-215:Assertion failed) !_src.empty() in function 'cv::cvtColor'

Did I fail to set something up?

Everyday objects | Looking for suggestions

Following are the images I provided.
Image:
mug_img
Background:
mug_back
Target Background:
target
I used Facebook's Detectron2 to get the mask. The mask is clearly not good.
mug_masksDL
following are the results that I got.
Compose:
mug_compose
Foreground:
mug_fg
Matte:
mug_matte
out:
mug_out

I'm thinking of retraining the Adobe network on all 450 images instead of just the non-transparent ones mentioned in the paper. I'm also looking for a better segmentation model, one that is trained on everyday objects instead of just humans. Please let me know if you're aware of any.

Do you have any other suggestions that I should look into? Please let me know, thanks!

Weird output from image pre-processing

download
I took pictures with my iPhone 7. They were JPGs, so I just renamed them to the .png extension. The image was upside down when I uploaded it (something to do with the orientation metadata at the point of capture). This was my output from the pre-processing script.

memory increased considerably

When I run test_background-matting_image.py to test images, the video memory increases considerably at the beginning and then drops. What is the problem?

Error: KeyError: 'CUDA_VISIBLE_DEVICES'

The error occurred when executing this command:
python test_background-matting_image.py -m real-hand-held -i sample_video/input/ -o sample_video/output/ -tb sample_video/background/

Traceback (most recent call last):
File "test_background-matting_image.py", line 20, in
print('CUDA Device: ' + os.environ["CUDA_VISIBLE_DEVICES"])
File "/home/dwijayanto/anaconda3/envs/back-matting/lib/python3.6/os.py", line 669, in getitem
raise KeyError(key) from None
KeyError: 'CUDA_VISIBLE_DEVICES'

TensorFlow version

Hi,
are you sure about the TensorFlow version for CUDA 10.0? TensorFlow 1.4 requires CUDA 8.
I think you meant version 1.14.
I'm trying 1.15 and now it works well.

IndexError: list index out of range

CUDA Device: 0
Using video mode
Traceback (most recent call last):
File "test_background-matting_image.py", line 54, in
model_name1=fo[0]
IndexError: list index out of range

Is the code wrong in the networks.py?

Is there an error or a specific setting in line 99 of networks.py?
oth_feat=torch.cat([self.comb_back(torch.cat([img_feat,back_feat],dim=1)),self.comb_seg(torch.cat([img_feat,seg_feat],dim=1)),self.comb_multi(torch.cat([img_feat,back_feat],dim=1))],dim=1)

Maybe it should be self.comb_multi(torch.cat([img_feat,multi_feat],dim=1))?

windows env conda support?

How do I run this on Windows? It seems to be a path problem:

Background-Matting-master\Models\real-fixed-cam>export PYTHONPATH=$PYTHONPATH:pwd:pwd/slim
'export' is not recognized as an internal or external command,
operable program or batch file.

Can it work in Google Colab?

Assertion failed

There is no problem running the alignment part of the demo. What is the reason for the following error when running my own picture?

python test_pre_process.py -i sample_data/my_photo/

Traceback (most recent call last):
File "test_pre_process.py", line 108, in
back_align = alignImages(back, image,mask)
File "test_pre_process.py", line 42, in alignImages
im1Reg = cv2.warpPerspective(im1, h, (width, height))
cv2.error: OpenCV(3.4.5) /io/opencv/modules/imgproc/src/imgwarp.cpp:2927: error: (-215:Assertion failed) (M0.type() == CV_32F || M0.type() == CV_64F) && M0.rows == 3 && M0.cols == 3 in function 'warpPerspective'

scipy missing from requirements

Hi,

I was trying test_background-matting_image.py on Windows and installed the dependencies from requirements.txt.

I got an error:
No module named 'scipy'

Please add scipy to requirements.txt.
I got it working after that :)

colab notebook "Process Image" throws error at test_background-matting_image.py

Running the Colab notebook works fine up to segmentation, using the sample images themselves. I did 'fix' paths etc. up to this point. I am getting this error:

CUDA Device: 0
Using image mode
Traceback (most recent call last):
  File "Background-Matting/test_background-matting_image.py", line 121, in <module>
    bbox=get_bbox(rcnn,R=bgr_img0.shape[0],C=bgr_img0.shape[1])
  File "/content/Background-Matting/functions.py", line 38, in get_bbox
    x1, y1 = np.amin(where, axis=1)
  File "<__array_function__ internals>", line 6, in amin
  File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 2746, in amin
    keepdims=keepdims, initial=initial, where=where)
  File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 90, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: zero-size array to reduction operation minimum which has no identity

Will diagnose more and share updates

Question about video result

Hello, I ran the pipeline with your latest code and video examples, but the result doesn't seem to be very good.

As shown in the following picture, there are some blank areas next to the character's hair.

1427_img

This is even more obvious when using my own dataset.

Do you have any suggestions?

Crop area differences between frames in a sequence, possibly related to aberrations in the mask

First off,
This tool works very well in the situation you describe and on the sample data. The directions are also good enough that I was able to run the sample as well as try my own data. Congratulations on that; one thing that is missing from many projects is good directions.

That said, it looks like minor differences in the categorization maskDL for each frame of the video may be causing the composited result frame to be cropped/resized differently.
146
147

I exported a video to PNG images as well as the same background for each frame. The camera is on a tripod so I used the real-fixed-cam model. I then ran the procedure with a solid green background so I can key the background out later.

This worked mostly exactly like the sample data, but there was a reflection against an appliance that caused some aberrations in the DeepLab mask.

This may have caused the subject focus to change between frames and 're-centered' the frame around the subject. In some cases it resized the subject in one dimension only, making some frames with the subject 'thin/compressed' and others with the subject normal.

Output foreground vs original foreground

Hi, question:

What's the difference between the output foreground and, say, the original pixels * (alpha > 0.95)? Is the network somehow changing the color information in the output?

Thanks!

Why are my reimplemented results so bad?

Following README.md, I ran the test phase using the sample data (provided by the authors) and the real-fixed-cam model file. The results from DeepLabv3 are good, but the matting results are pretty bad. Did the authors upload the wrong model file?

img_seg
fg

Question: Why is the multi-frame feature not used? Typo?

Hi,
In networks.py, multi_feat is not used at all? Thank you so much! Should I change it to oth_feat=torch.cat([self.comb_back(torch.cat([img_feat,back_feat],dim=1)),self.comb_seg(torch.cat([img_feat,seg_feat],dim=1)),self.comb_multi(torch.cat([img_feat,multi_feat],dim=1))],dim=1)?

def forward(self, image, back, seg, multi):
    img_feat1 = self.model_enc1(image)
    img_feat = self.model_enc2(img_feat1)

    back_feat = self.model_enc_back(back)
    seg_feat = self.model_enc_seg(seg)
    multi_feat = self.model_enc_multi(multi)

    oth_feat = torch.cat([self.comb_back(torch.cat([img_feat, back_feat], dim=1)),
                          self.comb_seg(torch.cat([img_feat, seg_feat], dim=1)),
                          self.comb_multi(torch.cat([img_feat, back_feat], dim=1))], dim=1)

    out_dec = self.model_res_dec(torch.cat([img_feat, oth_feat], dim=1))

    out_dec_al = self.model_res_dec_al(out_dec)
    al_out = self.model_al_out(out_dec_al)

    out_dec_fg = self.model_res_dec_fg(out_dec)
    out_dec_fg1 = self.model_dec_fg1(out_dec_fg)
    fg_out = self.model_fg_out(torch.cat([out_dec_fg1, img_feat1], dim=1))

Self supervised target background for composition

Hi, first of all great work!

I'm testing your fixed-camera model on full-body standing videos (with a fixed camera, obviously) and, although it is pretty good, there are still some errors at the feet, near the hands and between the legs.

After reading your post on towardsdatascience, I retrained your final model with a couple of these videos but, contrary to what I expected, the resulting inference was slightly worse. I'm using the captured backgrounds provided.

According to the documentation, target backgrounds should have roughly similar lighting to the original videos. Could that be the cause? If so, how could I create backgrounds with similar lighting to the video I'm trying to process?

Alpha mask with original fixed-cam model:
0001_out

Alpha mask with retrained model:
0001_out

Original Image:
0001_img

Offline processing

What is your human-in-the-loop hypothesis when you apply this to non-real-time production?
How will an artist interact with it to fine-tune your results manually?

Question: Why is the GAN loss not used? Typo?

Hi, thanks for your work.

I found that in compose_image_withshift() in functions.py, image_sh is wrapped by torch.autograd.Variable(), which detaches it from the previous computation graph, preventing gradient back-propagation from loss_ganG to the generator. So I was wondering whether loss_ganG is actually used?

Could I change it to:

def compose_image_withshift(alpha_pred,fg_pred,bg,seg):

    image_sh=torch.zeros(fg_pred.shape).cuda()

    for t in range(0,fg_pred.shape[0]):
        al_tmp=to_image(seg[t,...]).squeeze(2)
        where = np.array(np.where((al_tmp>0.1).astype(np.float32)))
        x1, y1 = np.amin(where, axis=1)
        x2, y2 = np.amax(where, axis=1)

        #select shift
        n=np.random.randint(-(y1-10),al_tmp.shape[1]-y2-10)
        #n positive indicates shift to right
        alpha_pred_sh=torch.cat((alpha_pred[t,:,:,-n:],alpha_pred[t,:,:,:-n]),dim=2)
        fg_pred_sh=torch.cat((fg_pred[t,:,:,-n:],fg_pred[t,:,:,:-n]),dim=2)

        alpha_pred_sh=(alpha_pred_sh+1)/2

        image_sh[t,...]=fg_pred_sh*alpha_pred_sh + (1-alpha_pred_sh)*bg[t,...]

    # return torch.autograd.Variable(image_sh.cuda())
    return image_sh

Higher Crop Resolution + Congrats!

Hello guys,
I am a student at Carnegie Mellon University working on a video performance piece for one of my classes, and I am completely blown away by this algorithm. It took me some time to make CUDA work on my PC, but after that it worked perfectly using the image mode. Here is a short clip with the results (edited in Premiere):

https://vimeo.com/407383245

gif

I would only like to know if it is possible to change the 512x512 crop resolution to a higher value, so it can track a bigger motion area in the original image/video.

Thanks!

A question

Hi,
If I have a picture with one person (_img.png) but no corresponding picture of the background (_back.png), can I still do background matting? In real life we often just take the picture (_img.png) without the background (_back.png). Thanks.

Why does the hand-held sample video not look good?

First of all, great work !!
I want to reproduce the results on the sample videos following the procedure in the repo, but I cannot get the same effect as yours; I'm confused. I find the problem is that the _back.png obtained by running test_pre_process_video.py looks incorrect, like the following pictures in sample_video/input.

This is one frame extracted from teaser.mov
0001_img

This is the corresponding background generated by running test_pre_process_video.py
0001_back

Obviously something is wrong. As said in README.md, if there are significant exposure changes between the captured image and the captured background, use bias-gain adjustment to account for that. Should I turn on the bias-gain adjustment part in test_pre_process_video.py?
CAPTURE_202065_160347

Is that correct? Thanks very much!
