
backgroundmattingv2's Introduction

Real-Time High-Resolution Background Matting

Teaser

Official repository for the paper Real-Time High-Resolution Background Matting. Our model requires capturing an additional background image and produces state-of-the-art matting results at 4K 30fps and HD 60fps on an Nvidia RTX 2080 Ti GPU.

Disclaimer: The video conversion script in this repo is not meant to be real-time. Our research's main contributions are the neural architecture for high-resolution refinement and the new matting datasets. The inference_speed_test.py script lets you measure the tensor throughput of our model, which should achieve real-time rates. The inference_video.py script lets you test your video on our model, but the video encoding and decoding are done without hardware acceleration or parallelization. For production use, you are expected to do additional engineering for hardware encoding/decoding and for loading frames to the GPU in parallel. For more architectural detail, please refer to our paper.
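As a rough illustration of that last point, here is a minimal sketch, not from this repo, of overlapping frame loading with GPU compute: batches are prepared in DataLoader worker processes, pinned, and copied asynchronously. The dataset here is a stand-in, and the TorchScript checkpoint name is borrowed from an issue further down this page.

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # Stand-in for a real frame source; a production pipeline would decode
    # video frames (ideally with hardware acceleration) inside the workers.
    frames = TensorDataset(torch.rand(8, 3, 1080, 1920), torch.rand(8, 3, 1080, 1920))
    loader = DataLoader(frames, batch_size=1, num_workers=2, pin_memory=True)

    model = torch.jit.load('torchscript_resnet50_fp32.pth').cuda().eval()  # hypothetical path

    with torch.no_grad():
        for src, bgr in loader:
            # non_blocking overlaps the pinned-memory copy with GPU compute
            src = src.cuda(non_blocking=True)
            bgr = bgr.cuda(non_blocking=True)
            pha, fgr = model(src, bgr)[:2]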

 

New Paper is Out!

Check out Robust Video Matting! Our new method does not require a pre-captured background and can run inference at even faster speeds!

 

Overview

 

Updates

  • [Jun 21 2021] Paper received CVPR 2021 Best Student Paper Honorable Mention.
  • [Apr 21 2021] VideoMatte240K dataset is now published.
  • [Mar 06 2021] Training script is published.
  • [Feb 28 2021] Paper is accepted to CVPR 2021.
  • [Jan 09 2021] PhotoMatte85 dataset is now published.
  • [Dec 21 2020] We updated our project to MIT License, which permits commercial use.

 

Download

Model / Weights

Video / Image Examples

Datasets

 

Demo

Scripts

We provide several scripts in this repo for you to experiment with our model. More detailed instructions are included in the files, and an example invocation is shown after the list below.

  • inference_images.py: Perform matting on a directory of images.
  • inference_video.py: Perform matting on a video.
  • inference_webcam.py: An interactive matting demo using your webcam.
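For example, a typical refinement-model invocation looks like this (the flags are the same ones used in the issue reports further down this page; file names are placeholders):

    python inference_video.py \
        --model-type mattingrefine \
        --model-backbone resnet50 \
        --model-backbone-scale 0.25 \
        --model-refine-mode sampling \
        --model-refine-sample-pixels 80000 \
        --model-checkpoint pytorch_resnet50.pth \
        --video-src input.mp4 \
        --video-bgr background.png \
        --output-dir output/ \
        --output-type com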

Notebooks

Additionally, you can try our notebooks in Google Colab for performing matting on images and videos.

Virtual Camera

We provide a demo application that pipes webcam video through our model and outputs to a virtual camera. The script only works on Linux and can be used in Zoom meetings. For more information, check out:

 

Usage / Documentation

You can run our model using PyTorch, TorchScript, TensorFlow, and ONNX. For details about using our model, please check out the Usage / Documentation page.
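For instance, a minimal TorchScript sketch (the checkpoint file name is taken from an issue below; the torch.rand inputs are dummies standing in for real frames):

    import torch

    # TorchScript weights bundle the architecture, so no repo code is required.
    model = torch.jit.load('torchscript_resnet50_fp32.pth').cuda().eval()

    src = torch.rand(1, 3, 1080, 1920, device='cuda')  # dummy source frame
    bgr = torch.rand(1, 3, 1080, 1920, device='cuda')  # dummy background
    with torch.no_grad():
        pha, fgr = model(src, bgr)[:2]  # alpha matte and foreground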

 

Training

Configure data_path.py to point to your datasets. The original paper first trains only the base model to convergence with train_base.py, then trains the entire network end-to-end with train_refine.py. More details are specified in the paper.

 

Project members

* Equal contribution.

 

License

This work is licensed under the MIT License. If you use our work in your project, we would love for you to include an acknowledgement and fill out our survey.

Community Projects

Projects developed by third-party developers.

backgroundmattingv2's People

Contributors

andreyryabtsev, jinzishuai, peterl1n, ptrbrn, senguptaumd


backgroundmattingv2's Issues

Load model error

I loaded your uploaded model using TorchScript from Python and also C++, but got a memory access error... My torch version is 1.5.1. Looking forward to your reply.

No correct result

I ran inference_video.py with this command:

    python inference_video.py \
        --model-type mattingrefine \
        --model-backbone resnet50 \
        --model-backbone-scale 0.25 \
        --model-refine-mode sampling \
        --model-refine-sample-pixels 80000 \
        --model-checkpoint ./pytorch_resnet50.pth \
        --video-src ./test/test.mp4 \
        --video-bgr ./test/bg.png \
        --video-resize 1920 1080 \
        --output-dir output \
        --output-type com

I used the er1.mp4 you provided and a new background image. I tried both the default setting, com = fgr * pha + bgr_green * (1 - pha), and my own, com = pha * fgr + (1 - pha) * bgr, and I get results.
But both com.mp4 files look the same as the input source video (--video-src ./test/test.mp4): the background is unchanged.
What happened?
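For reference, a minimal sketch of the green-screen composite quoted above, with the green value taken from the Colab snippet near the end of this page; pha and fgr are dummy stand-ins for the model outputs:

    import torch

    # Hypothetical matte and foreground; in the script these come from the model.
    pha = torch.rand(1, 1, 1080, 1920)  # (B, 1, H, W) alpha in [0, 1]
    fgr = torch.rand(1, 3, 1080, 1920)  # (B, 3, H, W) foreground in [0, 1]

    # Green value as used in the Colab snippet near the end of this page.
    bgr_green = torch.tensor([120/255, 255/255, 155/255]).view(1, 3, 1, 1)
    com = fgr * pha + bgr_green * (1 - pha)  # broadcasts over the color channel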

Multithreading Speed improvements

I am wondering whether further CPU speed improvements are possible. My MacBook Pro uses only 2 cores and gets about 0.63x real-time performance with resnet50, so I wonder if the algorithm can be sped up by using all 8 cores, as sketched below.
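One cheap thing to check first (an assumption, not something this repo documents): PyTorch sizes its intra-op CPU thread pool automatically, and it can be inspected and set explicitly:

    import torch

    # Report the current intra-op thread count, then pin it to the core count.
    print(torch.get_num_threads())
    torch.set_num_threads(8)  # 8 is machine-specific, not a repo recommendation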

Predicted crop patches have residual zigzag artifacts in the refine stage

Hi, your paper is excellent, and many thanks for the inference code you contributed. I am a student majoring in CS, and I am trying to reproduce your paper.
I found that the predicted crop patches have residual zigzag artifacts in the refine stage. Have you encountered a similar situation, and how did you solve it?
Looking forward to your reply.

Not working

So I followed this tutorial on YouTube,
https://www.youtube.com/watch?v=HlOUKj6WP-s&list=PLmo1GBItOimXfKR5t4D3f0doSflEgUo9j&index=3&t=474s
and installed everything I needed, activated everything, and made sure the picture and video were the same size and named properly, but I cannot get the program to green-screen me out. I have an NVIDIA graphics card. A sample image and video from this website worked, but mine won't. It green-screens random sections of the background but not everything. It's not a complicated scene, and the camera is on a tripod: just me walking away for a few seconds and turning around. It is a 4K video; I cannot upload the original as it is too big, so I am converting it to a smaller size and uploading it for you to look at. Help me please.

Src-1.mp4

How to use the "/model" script?

I don't know how to follow the usage documentation, like below:

    import torch
    from model import MattingRefine

    device = torch.device('cuda')
    precision = torch.float32

    model = MattingRefine(backbone='mobilenetv2',
                          backbone_scale=0.25,
                          refine_mode='sampling',
                          refine_sample_pixels=80_000)

    model.load_state_dict(torch.load('PATH_TO_CHECKPOINT.pth'))
    model = model.eval().to(precision).to(device)

    src = torch.rand(1, 3, 1080, 1920).to(precision).to(device)
    bgr = torch.rand(1, 3, 1080, 1920).to(precision).to(device)

    with torch.no_grad():
        pha, fgr = model(src, bgr)[:2]

Where does this code read my own images? Are src = torch.rand(1, 3, 1080, 1920).to(precision).to(device) and bgr = torch.rand(1, 3, 1080, 1920).to(precision).to(device) only examples? I don't understand.
So I used inference_images.py to infer my image. I arranged my data and pretrained model like this:
(screenshot: 2020-12-19 01-54-25)
And added args like this:

    --model-type mattingrefine --model-backbone resnet50 --model-checkpoint /home/user/BackgroundMattingV2/A_matting/model/PyTorch/pytorch_resnet50.pth --images-src /home/user/BackgroundMattingV2/A_matting/input/1384624442.jpg --images-bgr /home/user/BackgroundMattingV2/A_matting/input/timg.jpeg --output-dir /home/user/BackgroundMattingV2/A_matting/output/out.jpg --output-types com -y

It didn't work; the only output was 0it [00:00, ?it/s] and nothing was exported.
Could you help me please?
Thanks!
Best,
Joevaen.
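For reference, the torch.rand calls in the snippet above are indeed placeholder inputs. A minimal sketch for substituting real images, using the same to_tensor conversion as the Colab snippet near the end of this page (file paths are hypothetical):

    import torch
    from PIL import Image
    from torchvision.transforms.functional import to_tensor

    # to_tensor gives a (3, H, W) float tensor in [0, 1]; unsqueeze adds the
    # batch dimension the model expects. Paths are placeholders.
    src = to_tensor(Image.open('src.png').convert('RGB')).unsqueeze(0).cuda()
    bgr = to_tensor(Image.open('bgr.png').convert('RGB')).unsqueeze(0).cuda()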

Running on MacOS (no Nvidia GPU)

Hi,
When I run inference_video.py, I get the following error:

AssertionError: Torch not compiled with CUDA enabled

I looked online, but there does not seem to be a way to get CUDA to run on my MacBook (which is a 2020 model and pretty high end, unfortunately). Is there any way this can be run on macOS?

Direct prediction

Hi!
I really like your paper; it's written so clearly. I think the whole community is thankful to you for the upcoming datasets. But there is one thing I don't get. Why do we need such a complicated approach if we can just take the difference between the background and the source? In that case, pixels that equal 0 would be our background.
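For illustration, a minimal numpy sketch of the naive difference approach described above, assuming two aligned (H, W, 3) uint8 images. In practice, sensor noise, auto-exposure shifts, shadows cast by the subject, and foreground colors close to the background all make the difference nonzero (or zero) in the wrong places, and the result is a hard binary mask rather than the soft alpha matte needed for hair and motion blur:

    import numpy as np

    # Hypothetical aligned photos of the same scene, with and without subject.
    src = np.random.randint(0, 256, (720, 1280, 3), dtype=np.uint8)
    bgr = np.random.randint(0, 256, (720, 1280, 3), dtype=np.uint8)

    # Per-pixel absolute color difference, summed over channels.
    diff = np.abs(src.astype(np.int16) - bgr.astype(np.int16)).sum(axis=2)
    mask = diff > 30  # ad-hoc threshold; True where pixels "changed"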

problem running inference_images.py

It seems that there are multiple threads running at the same time, since I got this override question many times:

This is what I see when answering no (the output folder does not exist before)

(bgm2) C:\ZeroBox\src\BackgroundMattingV2> python inference_images.py --model-type mattingrefine --model-backbone mobilenetv2 --model-checkpoint PyTorch\pytorch_mobilenetv2.pth --images-src Group15BOriginals --images-bgr Group15BBackground --output-dir output-images --output-type com fgr pha
  0%| | 0/1 [00:00<?, ?it/s]Directory output-images already exists. Override? [Y/N]: Directory output-images already exists. Override? [Y/N]: Directory output-images already exists. Override? [Y/N]: Directory output-images already exists. Override? [Y/N]: Directory output-images already exists. Override? [Y/N]: Directory output-images already exists. Override? [Y/N]: Directory output-images already exists. Override? [Y/N]: Directory output-images already exists. Override? [Y/N]: n
  0%| | 0/1 [00:40<?, ?it/s]
Traceback (most recent call last):
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\utils\data\dataloader.py", line 872, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\queue.py", line 178, in get
    raise Empty
_queue.Empty

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "inference_images.py", line 123, in <module>
    for i, (src, bgr) in enumerate(tqdm(dataloader)):
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\tqdm\std.py", line 1171, in __iter__
    for obj in iterable:
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\utils\data\dataloader.py", line 435, in __next__
    data = self._next_data()
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\utils\data\dataloader.py", line 1068, in _next_data
    idx, data = self._get_data()
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\utils\data\dataloader.py", line 1024, in _get_data
    success, data = self._try_get_data()
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\utils\data\dataloader.py", line 885, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 54132) exited unexpectedly

If I answer yes (the first yes is normal since the output folder now exists):

(bgm2) C:\ZeroBox\src\BackgroundMattingV2> python inference_images.py --model-type mattingrefine --model-backbone mobilenetv2 --model-checkpoint PyTorch\pytorch_mobilenetv2.pth --images-src Group15BOriginals --images-bgr Group15BBackground --output-dir output-images --output-type com fgr pha
Directory output-images already exists. Override? [Y/N]: y
  0%| | 0/1 [00:00<?, ?it/s]Directory output-images already exists. Override? [Y/N]: Directory output-images already exists. Override? [Y/N]: Directory output-images already exists. Override? [Y/N]: Directory output-images already exists. Override? [Y/N]: Directory output-images already exists. Override? [Y/N]: Directory output-images already exists. Override? [Y/N]: Directory output-images already exists. Override? [Y/N]: Directory output-images already exists. Override? [Y/N]: y
  0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\ZeroBox\src\BackgroundMattingV2\inference_images.py", line 123, in <module>
    for i, (src, bgr) in enumerate(tqdm(dataloader)):
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\tqdm\std.py", line 1171, in __iter__
    for obj in iterable:
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\utils\data\dataloader.py", line 352, in __iter__
    return self._get_iterator()
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\utils\data\dataloader.py", line 294, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\utils\data\dataloader.py", line 801, in __init__
    w.start()
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\multiprocessing\context.py", line 327, in _Popen
    return Popen(process_obj)
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
  0%| | 0/1 [00:25<?, ?it/s]
Traceback (most recent call last):
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\utils\data\dataloader.py", line 872, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\queue.py", line 178, in get
    raise Empty
_queue.Empty

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "inference_images.py", line 123, in <module>
    for i, (src, bgr) in enumerate(tqdm(dataloader)):
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\tqdm\std.py", line 1171, in __iter__
    for obj in iterable:
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\utils\data\dataloader.py", line 435, in __next__
    data = self._next_data()
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\utils\data\dataloader.py", line 1068, in _next_data
    idx, data = self._get_data()
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\utils\data\dataloader.py", line 1024, in _get_data
    success, data = self._try_get_data()
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\utils\data\dataloader.py", line 885, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 45488) exited unexpectedly

I do have a GPU, and I was able to run the other two inference scripts without any issue.

Question about the function "compute_pixel_indices"

Hi Peter,

Thank you for sharing the source code, which is well written and self-explanatory. However, could you explain the following lines, 278-279? I have pasted them below for your convenience. I know they output the indices for the patch locations corresponding to the original input x, but could you explain the logic behind them?

    # Offsets of every element within a single CxOxO patch, relative to the patch origin.
    idx_pat = (c * H * W).view(C, 1, 1).expand([C, O, O]) + (o * W).view(1, O, 1).expand([C, O, O]) + o.view(1, 1, O).expand([C, O, O])
    # Flat offset of each patch's origin, from its batch index b and top-left coordinates (y, x).
    idx_loc = b * W * H + y * W * S + x * S
    # Broadcast sum: absolute flat indices for every element of all n patches.
    idx = idx_loc.view(-1, 1, 1, 1).expand([n, C, O, O]) + idx_pat.view(1, C, O, O).expand([n, C, O, O])

Kind regards,
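
The arithmetic above rests on standard flat indexing of a contiguous tensor; a minimal sketch of the identity involved (the reading of idx_pat and idx_loc in the comments above is an editorial interpretation, not the authors' documentation):

    import torch

    # Element (c, i, j) of a contiguous (C, H, W) tensor sits at flat offset
    # c*H*W + i*W + j. idx_pat enumerates such offsets within one patch, and
    # idx_loc is each patch's origin offset; their broadcast sum gives
    # absolute indices for gathering every element of every patch.
    C, H, W = 3, 4, 5
    x = torch.arange(C * H * W).view(C, H, W)
    c, i, j = 2, 1, 3
    assert x.flatten()[c * H * W + i * W + j] == x[c, i, j]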

How to get result image on C++

Thanks for your great contributions. I referred to the model usage docs for C++, but I don't know how to transform and display the results.

I also referred to inference_webcam.py and took inspiration from this code:

    pha, fgr = model(src, bgr)[:2]
    res = pha * fgr + (1 - pha) * torch.ones_like(fgr)
    res = res.mul(255).byte().cpu().permute(0, 2, 3, 1).numpy()[0]
    res = cv2.cvtColor(res, cv2.COLOR_RGB2BGR)
    key = dsp.step(res)

I need to translate it to C++, but there are still some questions.

    auto outputs = model.forward({src, bgr}).toTuple()->elements();
    auto pha = outputs[0].toTensor();
    auto fgr = outputs[1].toTensor();
    
   // the following code errors, but I have no idea why.
    auto res_tensor = (pha * fgr + (1-pha) * torch::ones_like(fgr)).mul(255).cpu();
    Mat res(res_tensor.size(2), res_tensor.size(3), CV_8UC3, (void*) res_tensor.data_ptr<uint8_t>());
    cvtColor(res, res, COLOR_RGB2BGR);
    imshow("matting", res);

Would you please show me code to study? Thanks.

Model parameter values

Hi - I wanted to check whether better settings of the following values are possible:

        --model-backbone-scale 0.25
        --model-refine-sample-pixels 80000

For my experiment, I am using HD resolution videos.

PhotoMatte13K dataset source

Could you please share some more details about the source of the PhotoMatte13K dataset?
Did you commission the photos from a studio or buy them from a stock photography site?

--model-type mattingrefine

When I run inference_video.py with the --model-type mattingrefine parameter, does it run mattingbase first and then refinement? I am wondering whether I need to run two passes myself or the script takes care of this.

some issues about `inference_images.py`

I met some small errors; could you please give me advice?

    inference_images.py: error: the following arguments are required: --model-backbone, --model-checkpoint, --images-src, --images-bgr, --output-dir, --output-types

How should inference_images.py be used?

Thanks.

What's the difference between model type mattingbase and mattingrefine

I tested inference_video.py with model types mattingbase and mattingrefine.
The video format is 720p.
With mattingrefine, the iteration speed is about 4.57 it/s.
With mattingbase, it is about 1.2 s/it.
Why is model type mattingrefine so much faster than mattingbase? Thanks.

    python inference_video.py --model-type mattingbase \
        --model-backbone mobilenetv2 \
        --model-backbone-scale 0.25 \
        --model-refine-mode sampling \
        --model-refine-sample-pixels 80000 \
        --model-checkpoint "./share/pytorch_mobilenetv2.pth" \
        --video-src "./share/src.mp4" \
        --video-bgr "./share/src.png" \
        --output-dir "./output/" \
        --device cpu \
        --output-type com

Get transparency instead of white opaque background / inference_webcam

I am trying to replace the white background in the inference_webcam.py example with transparency.

So far, I have replaced pha * fgr + (1 - pha) * torch.ones_like(fgr) with torch.cat([pha.ne(0) * fgr, pha], dim=1) and cv2.imencode('.jpg', res) with cv2.imencode('.png', res), but it still does not work.

Any suggestions?

Thanks a lot.
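One relevant detail, for reference: JPEG cannot carry an alpha channel, so any '.jpg' encode will silently drop transparency. A minimal sketch for writing a 4-channel PNG instead, reusing the tensor-to-numpy conversion from the webcam snippet quoted earlier (the tensor shapes and output path are assumptions):

    import cv2
    import torch

    # Hypothetical model outputs shaped as in inference_webcam.py.
    pha = torch.rand(1, 1, 720, 1280)  # alpha matte
    fgr = torch.rand(1, 3, 720, 1280)  # foreground (RGB)

    rgba = torch.cat([fgr, pha], dim=1)                       # (1, 4, H, W)
    rgba = rgba.mul(255).byte().cpu().permute(0, 2, 3, 1).numpy()[0]
    bgra = cv2.cvtColor(rgba, cv2.COLOR_RGBA2BGRA)            # OpenCV expects BGRA
    cv2.imwrite('out.png', bgra)                              # PNG preserves alpha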

foreground residual

First, thank you for the great work you shared. I have a question: what is the foreground residual, and what is it for?

Shadow Augmentation

The paper mentions that shadows were artificially added to make the model more robust to such situations. What method was used to create these shadows, and will the code for it be shared?

ZeroDivisionError: integer division or modulo by zero

I have successfully converted a 440x440 video using Colab. Now I'm trying with an HD video and received the following error:

    !python inference_video.py \
        --model-type mattingrefine \
        --model-backbone resnet50 \
        --model-backbone-scale 0.25 \
        --model-refine-mode sampling \
        --model-refine-sample-pixels 80000 \
        --model-checkpoint "/content/model.pth" \
        --video-src "/content/balconay_test.mp4" \
        --video-bgr "/content/balcony_bg.jpg" \
        --output-dir "/content/output/" \
        --output-type com fgr pha err ref

    0% 0/1 [00:00<?, ?it/s]Traceback (most recent call last):
      File "inference_video.py", line 178, in <module>
        for src, bgr in tqdm(DataLoader(dataset, batch_size=1, pin_memory=True)):
      File "/usr/local/lib/python3.6/dist-packages/tqdm/std.py", line 1104, in __iter__
        for obj in iterable:
      File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 435, in __next__
        data = self._next_data()
      File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 475, in _next_data
        data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
      File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/content/BackgroundMattingV2/dataset/zip.py", line 17, in __getitem__
        x = tuple(d[idx % len(d)] for d in self.datasets)
      File "/content/BackgroundMattingV2/dataset/zip.py", line 17, in <genexpr>
        x = tuple(d[idx % len(d)] for d in self.datasets)
    ZeroDivisionError: integer division or modulo by zero
    0% 0/1 [00:00<?, ?it/s]

--output-format image_sequences not working on the colab

Hi there,

I just made my first test with a 4K video and I'm very pleased, but when I add --output-format image_sequences I get this error:

    File "", line 2
        --output-format image_secuences
        ^
    IndentationError: unexpected indent

I would love to have an image sequence as export...
Also, is there a possibility to use another type of file as input? I would love to use my image sequence as input so I can use this for real with cinema camera files...

Thanks a lot!!

AssertionError: Datasets are not equal in length.

Thanks for the nice work. I am running inference_images.py:

    python inference_images.py --model-type=mattingrefine --model-backbone=resnet50 --model-backbone-scale 0.25 --model-refine-mode sampling --model-refine-sample-pixels 80000 --model-checkpoint=pytorch_resnet50.pth --images-src=myimages/ --images-bgr=back/ --device=cpu --output-dir=out/ --output-types=com

I am getting the following error:

Traceback (most recent call last):
  File "inference_images.py", line 108, in <module>
    A.PairApply(T.ToTensor())
  File "/home/mylaptop/Downloads/BackgroundMattingV2/dataset/zip.py", line 11, in __init__
    assert len(datasets[i]) == len(datasets[i - 1]), 'Datasets are not equal in length.'
AssertionError: Datasets are not equal in length.

Issue installing torch-1.7

Hi,
Thanks for sharing this code.

I am trying to install this library, and when I run the following command:

sudo pip3 install -r requirements.txt

I get the following error:

Collecting torch==1.7.0 (from -r requirements.txt (line 3))
  ERROR: Could not find a version that satisfies the requirement torch==1.7.0 (from -r requirements.txt (line 3)) (from versions: 0.1.2, 0.1.2.post1, 0.1.2.post2)
ERROR: No matching distribution found for torch==1.7.0 (from -r requirements.txt (line 3))
WARNING: You are using pip version 19.2.3, however version 20.3.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

Can someone help?

Video Start and End timecode

Hi - video start and end times would be a nice feature to add to the script, since users often may not want to process the entire video clip.

Dataset release schedule

Hi, thanks for your awesome work!
I am currently doing research on human body segmentation. Do you have a schedule for releasing the dataset? I am really interested in it.

problem running the webcam inference script

As a first step to test this model, I set up the conda environment and made sure pytorch=1.7 is installed properly with the GPU version, etc.

But when I ran the script, I got the following error

(bgm2) C:\ZeroBox\src\BackgroundMattingV2> python inference_webcam.py --model-type mattingrefine --model-backbone resnet50 --model-checkpoint TorchScript/torchscript_resnet50_fp32.pth --resolution 1280 720
C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\serialization.py:589: UserWarning: 'torch.load' received a zip file that looks like a TorchScript archive dispatching to 'torch.jit.load' (call 'torch.jit.load' directly to silence this warning)
  warnings.warn("'torch.load' received a zip file that looks like a TorchScript archive"
Traceback (most recent call last):
  File "inference_webcam.py", line 138, in <module>
    model.load_state_dict(torch.load(args.model_checkpoint), strict=False)
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\nn\modules\module.py", line 1025, in load_state_dict
    state_dict = state_dict.copy()
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\jit\_script.py", line 558, in __getattr__
    return super(RecursiveScriptModule, self).__getattr__(attr)
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\jit\_script.py", line 288, in __getattr__
    return super(ScriptModule, self).__getattr__(attr)
  File "C:\Users\jinzi\miniconda3\envs\bgm2\lib\site-packages\torch\nn\modules\module.py", line 778, in __getattr__
    raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'RecursiveScriptModule' object has no attribute 'copy'

As a reference, here is some of my environment information:

(bgm2) C:\ZeroBox\src\BackgroundMattingV2>conda list torch*
# packages in environment at C:\Users\jinzi\miniconda3\envs\bgm2:
#
# Name                    Version                   Build  Channel
pytorch                   1.7.0           py3.8_cuda102_cudnn7_0    pytorch
torchaudio                0.7.0                      py38    pytorch
torchvision               0.8.1                py38_cu102    pytorch

(bgm2) C:\ZeroBox\src\BackgroundMattingV2>python --version
Python 3.8.5

Do you know what is wrong here? Thanks a lot.

Very poor result on macOS with CPU

macOS: 10.15.5
Python: 3.6.4
torch: 1.7.0 (installed with pip, not conda)
tensorflow: 1.14.0

hi @PeterL1n
I have tested inference_images.py using this command:

    python inference_images.py --model-type mattingrefine --model-backbone resnet101 --model-backbone-scale 0.25 --model-refine-mode sampling --model-refine-sample-pixels 80000 --model-checkpoint ./model/data/pytorch_resnet101.pth --images-src ./images/data --images-bgr ./images/bgr --output-dir ./images/output --output-type com fgr pha err ref --device cpu

I did not change any code in this file.
I downloaded the 'pytorch_resnet101.pth' model and used it for testing.

It works, but the result is very poor.
(images attached: the src image, the bgr image, and the com, fgr, ref, and err outputs)

Do you have any idea about this problem?
Thank you!

Real Time Performance

Thanks for such nice work. I have a question about real-time inference on video input. I have tested the resnet50 backbone model on 1080p videos; the results are good, but it is really slow. How can we speed up inference other than with a GPU (I have a GTX 1080)? Secondly, the webcam inference results are very poor; my laptop webcam has a resolution of 640x480. Can you guide me on how to improve the results and speed up inference? Thanks.

anyone tried on iOS?

I wonder how fast inference would be on an iPhone. Has anyone tried converting the model to Core ML?

ONNX model exception

Trying to load the included model "onnx_mobilenetv2_hd.onnx" with OpenCV, we get the following exception:

    cv2.error: OpenCV(4.4.0) C:\Users\appveyor\AppData\Local\Temp\1\pip-req-build-52oirelq\opencv\modules\dnn\src\graph_simplifier.cpp:76: error: (-212:Parsing error) Input node with name 901 not found in function 'cv::dnn::Subgraph::getInputNodeId'

We get the same error with an ONNX model file generated with the provided script:

    python export_onnx.py --model-type mattingrefine --model-checkpoint "pytorch_mobilenetv2.pth" --model-backbone mobilenetv2 --model-backbone-scale 0.25 --model-refine-mode sampling --model-refine-sample-pixels 80000 --onnx-opset-version 12 --onnx-constant-folding --precision float32 --output "model_mobilenetv2.onnx" --validate

With the model "onnx_mobilenetv2_4k.onnx" we get a different error:

    cv2.error: OpenCV(4.4.0) C:\Users\appveyor\AppData\Local\Temp\1\pip-req-build-52oirelq\opencv\modules\dnn\include\opencv2/dnn/shape_utils.hpp:222: error: (-215:Assertion failed) clamped.start < clamped.end in function 'cv::dnn::dnn4_v20200609::clamp'

We would like to convert the model to check its performance with ONNX Runtime or OpenVINO, and the first step is getting an ONNX model that can be opened with OpenCV.
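For reference, a minimal ONNX Runtime check of such a model; the input names 'src' and 'bgr' are taken from the OpenVINO conversion command in a later issue on this page, and the 1080p shape matches the HD variant:

    import numpy as np
    import onnxruntime as ort

    # Load the exported model on CPU and run one dummy 1080p pass.
    sess = ort.InferenceSession('onnx_mobilenetv2_hd.onnx',
                                providers=['CPUExecutionProvider'])
    src = np.random.rand(1, 3, 1080, 1920).astype(np.float32)
    bgr = np.random.rand(1, 3, 1080, 1920).astype(np.float32)
    pha, fgr = sess.run(None, {'src': src, 'bgr': bgr})[:2]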

How can I optimize for CPU

Thanks for your awesome work! I hope to apply it on a CPU. How can I optimize it for CPU? Any suggestions? Thanks.

Running inference on a live webcam feed

How would I go about running the inference model live on a webcam feed (to be eventually piped into video conference apps, etc.)?

Is there a way to do this as an OBS filter? And what hardware acceleration would be required to get enough performance? macOS, for example, can't really do CUDA.

The end goal is to have this show up as a virtual camera that can be selected in whatever video ingestion platform you use

High CPU usage in Webcam inference demo

I've tried the webcam inference demo, and it runs at ~30fps at 640x360 resolution on my laptop's Nvidia GTX 1050, which is really neat! However, CPU usage is 60-80%, while GPU utilization according to the task manager is only 6-10%. Is that specific to how the Python demo works, i.e., would it not be CPU-intensive at all if properly used from C++ (TorchScript)? I really wonder why the GPU usage is that low. Thank you.

Cannot deal with background change

I took a picture with my phone to capture a background, then, from the same position, took another picture including my upper body. The demo you provided cannot produce a good result. The background has only small changes, which are very common in real applications, and it is not complicated. I hope you can make the method more applicable :)

Convert to ONNX

Hi, I'm very interested in your research.

Now I'm trying to convert the pth file to an onnx file using export_onnx.py.
But unfortunately, I got the error message below:

    onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) bound to different types (tensor(float) and tensor(double) in node (ScatterElements_420).

My environment is Windows 10, Python 3.7, ONNX 1.6.0.

If you know what the problem is, I would be really grateful for any help.

[Transfer to CPU version] Referencing the BackgroundMattingV2 Colab you posted

Hi, thanks for your awesome work!

Description

I have tried the code you posted in Colab. However, that code runs on a GPU, not a CPU, and unfortunately I do not have a GPU, so I wrote a CPU version based on your code.

Could you please help me to address this issue?

virtual-machine:~/BackgroundMattingV2$ python3 test.py
Traceback (most recent call last):
  File "test.py", line 25, in <module>
    pha, fgr = model(src, bgr)[:2]
TypeError: 'collections.OrderedDict' object is not callable

Attachment code (python3):

import torch
from torchvision.transforms.functional import to_tensor, to_pil_image
from PIL import Image

# model = torch.jit.load('model/pytorch_resnet50.pth').cuda().eval() 
"""
RuntimeError: Attempting to deserialize object on a CUDA device but 
torch.cuda.is_available() is False. If you are running on a CPU-only machine, 
please use torch.load with map_location=torch.device('cpu') to map your 
storages to the CPU.
"""

model = torch.load('model/pytorch_resnet50.pth',map_location ='cpu')

src = Image.open('images/img/12.png')
bgr = Image.open('images/bgr/12.png')

src = to_tensor(src)
bgr = to_tensor(bgr)

if src.size(1) <= 2048 and src.size(2) <= 2048:
  model.backbone_scale = 1/4
  model.refine_sample_pixels = 80_000
else:
  model.backbone_scale = 1/8
  model.refine_sample_pixels = 320_000

pha, fgr = model(src, bgr)[:2]

com = pha * fgr + (1 - pha) * torch.tensor([120/255, 255/255, 155/255], device='cuda').view(1, 3, 1, 1)

to_pil_image(com[0].cpu())

to_pil_image(pha[0].cpu()).save('pha.png')
to_pil_image(fgr[0].cpu()).save('fgr.png')
to_pil_image(com[0].cpu()).save('com.png')

Looking forward to your reply!

Best wishes,
@Charmve
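For reference, the TypeError above occurs because torch.load on a plain .pth checkpoint returns an OrderedDict of weights rather than a callable module. A minimal CPU-side sketch, with constructor arguments copied from the usage snippet earlier on this page:

    import torch
    from model import MattingRefine  # from this repository

    # Build the network first, then load the state dict onto the CPU.
    model = MattingRefine(backbone='resnet50',
                          backbone_scale=0.25,
                          refine_mode='sampling',
                          refine_sample_pixels=80_000)
    model.load_state_dict(torch.load('model/pytorch_resnet50.pth', map_location='cpu'))
    model = model.eval()

    # Alternatively, the TorchScript export bundles the architecture:
    # model = torch.jit.load('model/torchscript_resnet50_fp32.pth', map_location='cpu').eval()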

issue after convert to onnx and openvino

Please check where the problem is. Thanks.
1. Export the ONNX model:

    python export_onnx.py \
        --model-type mattingbase \
        --model-checkpoint "./pytorch_mobilenetv2.pth" \
        --model-backbone mobilenetv2 \
        --model-backbone-scale 0.25 \
        --model-refine-mode sampling \
        --model-refine-sample-pixels 80000 \
        --model-refine-patch-crop-method roi_align \
        --model-refine-patch-replace-method scatter_element \
        --onnx-opset-version 11 \
        --onnx-constant-folding \
        --precision float32 \
        --output "model.onnx" \
        --validate

2. Optimize the model using OpenVINO:

    python /opt/intel/openvino_2021/deployment_tools/model_optimizer/mo_onnx.py --input_model onnx_mobilenetv2_hd.onnx --input_shape [1,3,1080,1920],[1,3,1080,1920] --input src,bgr

3. Run inference using OpenVINO.

The result is as shown:
(image attached: matting result)

[mov,mp4,m4a,3gp,3g2,mj2 @ 0xc5867600] moov atom not found

I uploaded a video file and a background image and tried BackgroundMattingV2-VideoMatting.ipynb, but it gives me the following error:

    [mov,mp4,m4a,3gp,3g2,mj2 @ 0xc5867600] moov atom not found
    VIDIOC_REQBUFS: Inappropriate ioctl for device
    0% 0/1 [00:00<?, ?it/s]Traceback (most recent call last):
      File "inference_video.py", line 178, in <module>
        for src, bgr in tqdm(DataLoader(dataset, batch_size=1, pin_memory=True)):
      File "/usr/local/lib/python3.6/dist-packages/tqdm/std.py", line 1104, in __iter__
        for obj in iterable:
      File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 435, in __next__
        data = self._next_data()
      File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 475, in _next_data
        data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
      File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/content/BackgroundMattingV2/dataset/zip.py", line 17, in __getitem__
        x = tuple(d[idx % len(d)] for d in self.datasets)
      File "/content/BackgroundMattingV2/dataset/zip.py", line 17, in <genexpr>
        x = tuple(d[idx % len(d)] for d in self.datasets)
    ZeroDivisionError: integer division or modulo by zero
    0% 0/1 [00:00<?, ?it/s]

How can I resolve it?

Thanks!

License?

Please include a license file in this repo.

How can the demo run with CPU?

hi, my MacBook Pro's OS is Big Sur, so there are no matching CUDA drivers.
When I run the demo inference_webcam.py, I get an error like this:

    line 137, in <module>
        model = model.cuda().eval()
    AssertionError: Torch not compiled with CUDA enabled

How can the demo run on the CPU?
thanks.
