xuebinqin / u-2-net

The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."

License: Apache License 2.0

computer-vision deep-learning image-background-removal image-processing image-segmentation u-2-net u2net

u-2-net's Issues

People segmentation quality, pretrained model vs other datasets, and comparison

Congratulations on your amazing work with u2net!

Quick question: I'm trying to use u2net for segmentation of people. I tried your pre-trained model, the one available at https://github.com/NathanUA/U-2-Net, and it's pretty good, but not as good at segmenting people as, for example, DeepLabV3. However, I love u2net because it's faster and uses less memory. So now I'm trying to train u2net with the 64k images of people in the COCO dataset. My questions are:

  • What was your pretrained model trained on? If it was already trained on COCO, then there is no point in continuing down that route. But if it was trained differently, then it may be worth training it with the 64k images of people in the COCO dataset.

  • Do you think that, with the right dataset, the u2net model could achieve a people-segmentation IoU similar to the level reached by DeepLabV3?

Thank you very much, and congrats again.

index error

I tried to train on my own image set but got the following error.
My folder setup: train_images has two folders, 'images' and 'mask'.
When I ran the script it showed the correct number of images, but then failed with the following error:

Traceback (most recent call last):
File "u2net_train.py", line 143, in
loss2, loss = muti_bce_loss_fusion(d0, d1, d2, d3, d4, d5, d6, labels_v)
File "u2net_train.py", line 42, in muti_bce_loss_fusion
print("l0: %3f, l1: %3f, l2: %3f, l3: %3f, l4: %3f, l5: %3f, l6: %3f\n"%(loss0.data[0],loss1.data[0],loss2.data[0],loss3.data[0],loss4.data[0],loss5.data[0],loss6.data[0]))
IndexError: invalid index of a 0-dim tensor. Use tensor.item() in Python or tensor.item<T>() in C++ to convert a 0-dim tensor to a number

IndexError: invalid index of a 0-dim tensor

python3 u2net_train.py

Traceback (most recent call last):
File "u2net_train.py", line 140, in
loss2, loss = muti_bce_loss_fusion(d0, d1, d2, d3, d4, d5, d6, labels_v)
File "u2net_train.py", line 40, in muti_bce_loss_fusion
print("l0: %3f, l1: %3f, l2: %3f, l3: %3f, l4: %3f, l5: %3f, l6: %3f\n"%(loss0.data[0],loss1.data[0],loss2.data[0],loss3.data[0],loss4.data[0],loss5.data[0],loss6.data[0]))
IndexError: invalid index of a 0-dim tensor. Use tensor.item() in Python or tensor.item<T>() in C++ to convert a 0-dim tensor to a number
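
This traceback (and the one in the issue above) points at the same thing: indexing a 0-dim tensor with tensor.data[0] was removed in newer PyTorch versions, and tensor.item() should be used instead. A minimal sketch of the fixed function, assuming bce_loss is the nn.BCELoss instance defined in u2net_train.py:

    def muti_bce_loss_fusion(d0, d1, d2, d3, d4, d5, d6, labels_v):
        loss0 = bce_loss(d0, labels_v)
        loss1 = bce_loss(d1, labels_v)
        loss2 = bce_loss(d2, labels_v)
        loss3 = bce_loss(d3, labels_v)
        loss4 = bce_loss(d4, labels_v)
        loss5 = bce_loss(d5, labels_v)
        loss6 = bce_loss(d6, labels_v)
        loss = loss0 + loss1 + loss2 + loss3 + loss4 + loss5 + loss6
        # .item() replaces the removed 0-dim indexing loss0.data[0]
        print("l0: %3f, l1: %3f, l2: %3f, l3: %3f, l4: %3f, l5: %3f, l6: %3f\n" % (
            loss0.item(), loss1.item(), loss2.item(), loss3.item(),
            loss4.item(), loss5.item(), loss6.item()))
        return loss0, loss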

Unexpected item included in the final mask

Congratulations on your amazing work with u2net!
Recently I tried your net on a portrait segmentation task. I trained it from scratch on a portrait segmentation dataset containing no other objects, and it performed very well. However, I find that sometimes objects such as chairs and street nameplates are also included in the mask, which confuses me.
Since I trained it from scratch on that dataset, it can be seen as a semantic segmentation task, right? Why are other objects included?
Thank you!

Can I specify object to be segmented?

Hello, thank you for this work.

This issue is illustrated by the two attached example frames (1149 and 1150).

Is there a way to choose which object is segmented? Or how do I keep the guitar across the sequence of images?

Thank you.

Is there a docker image?

Hi

It would be great to have a docker image for this so people less experienced with Python (me 😛 ) can try it out.

Or maybe just point out in the README an already-made Docker image that would work.

Stalled in reading data

Hi all

I found that it always stalls while reading data: the program gets stuck at for i_test, data_test in enumerate(test_salobj_dataloader). After I press Ctrl+C it prints the following:

Traceback (most recent call last):
  File "u2net_test.py", line 119, in <module>
    main()
  File "u2net_test.py", line 93, in main
    for i_test, data_test in enumerate(test_salobj_dataloader):
  File "/local/mnt/workspace/ruodcui/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
    data = self._next_data()
  File "/local/mnt/workspace/ruodcui/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 974, in _next_data
    idx, data = self._get_data()
  File "/local/mnt/workspace/ruodcui/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 941, in _get_data
    success, data = self._try_get_data()
  File "/local/mnt/workspace/ruodcui/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 779, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/local/mnt/workspace/ruodcui/anaconda3/lib/python3.7/multiprocessing/queues.py", line 104, in get
    if not self._poll(timeout):
  File "/local/mnt/workspace/ruodcui/anaconda3/lib/python3.7/multiprocessing/connection.py", line 257, in poll
    return self._poll(timeout)
  File "/local/mnt/workspace/ruodcui/anaconda3/lib/python3.7/multiprocessing/connection.py", line 414, in _poll
    r = wait([self], timeout)
  File "/local/mnt/workspace/ruodcui/anaconda3/lib/python3.7/multiprocessing/connection.py", line 920, in wait
    ready = selector.select(timeout)
  File "/local/mnt/workspace/ruodcui/anaconda3/lib/python3.7/selectors.py", line 415, in select
    fd_event_list = self._selector.poll(timeout)
KeyboardInterrupt

Any ideas?
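
A workaround that often helps when the main process blocks on the DataLoader's worker queue (which is what the traceback shows) is to disable worker processes; this is usually an environment/shared-memory issue rather than something in this repository. A minimal sketch, assuming the test loader is built as in u2net_test.py:

    from torch.utils.data import DataLoader

    # num_workers=0 loads batches in the main process and avoids the
    # multiprocessing queue the traceback is blocked on
    test_salobj_dataloader = DataLoader(test_salobj_dataset,
                                        batch_size=1,
                                        shuffle=False,
                                        num_workers=0)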

Using with torch.jit.trace and C++?

I started a discussion here https://discuss.pytorch.org/t/debugging-runtime-error-module-forward-inputs-libtorch-1-4/82415

I modified u2net_test.py and used torch.jit.trace to save a module

traced_script_module = torch.jit.trace(net, inputs_test)
traced_script_module.save("traced_model.pt")
print(inputs_test.size()) # shows (1, 3, 320, 320)

Then in C++:

auto module = torch::jit::load("traced_model.pt");
torchinputs.clear();
torchinputs.push_back(torch::ones({1, 3, 320, 320 }, torch::kCUDA).to(at::kFloat)); // because python was torch.FloatTensor
module.forward(torchinputs); // error

The error:

 Unhandled exception at 0x00007FFFD8FFA799 in TouchDesigner.exe: Microsoft C++ exception: std::runtime_error at memory location 0x000000EA677F1B30. occurred


The error is at https://github.com/pytorch/pytorch/blob/4c0bf93a0e61c32fd0432d8e9b6deb302ca90f1e/torch/csrc/jit/api/module.h#L112, which says inputs has size 0. I don't know whether that's the cause of the exception or a result of it.

Do you have advice about running U-2-Net in C++? Thank you.
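
For what it's worth, a quick Python-side sanity check of the traced module before moving to C++ (a sketch; the file name and input size are taken from the snippet above):

    import torch

    module = torch.jit.load("traced_model.pt")
    example = torch.ones(1, 3, 320, 320)
    if torch.cuda.is_available():
        example = example.cuda()          # the C++ side also builds a CUDA tensor
    with torch.no_grad():
        outputs = module(example)         # returns the seven saliency maps d0 ... d6
    print([o.shape for o in outputs])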

U-2-Net for binary segmentation

Hey @NathanUA,

just to share my experience here, using the small U-2-Net architecture:

  • I'm comparing the results to a baseline model (DeepLabV3 with a ResNet101 backbone, ~450 MB), which achieved 82.x mIoU after 500 epochs on my rather small benchmarking dataset (license-plate binary segmentation).
  • The small U-2-Net model (~5 MB) achieved 78.x mIoU after roughly 600 epochs.

I did the following modifications to the model in order to fine-tune my results:

  • I introduced a distillation loss between each nested U-Net and used temperature annealing on the sigmoid. Based on the assumption that the more nested U-Nets have more computational power, we can define a second loss specifying the BCE between adjacent layers. The model converges much faster with this approach; however, the three new hyperparameters ("temperature", alpha and a scaling factor for the loss) seem to be quite sensitive. (A rough sketch of the idea follows this list.)
  • I changed to SGD with a poly learning-rate schedule starting at 0.01.
  • I'm currently exploring options to further prune the network or to use an early exit at inference time to reduce inference time.
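
A rough sketch of that distillation idea (my own reconstruction, not the exact code used above; it assumes access to the pre-sigmoid side logits and treats the deeper neighbouring side output as the teacher):

    import torch
    import torch.nn.functional as F

    def side_distillation_loss(side_logits, T=2.0, scale=0.5):
        # side_logits: list of pre-sigmoid side outputs, ordered from the
        # assumed "student" end to the assumed "teacher" end of the decoder
        loss = 0.0
        for student, teacher in zip(side_logits[:-1], side_logits[1:]):
            soft_teacher = torch.sigmoid(teacher.detach() / T)   # temperature-softened target
            soft_student = torch.sigmoid(student / T)
            loss = loss + F.binary_cross_entropy(soft_student, soft_teacher)
        return scale * loss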

I just wanted to share my experience here, maybe it can be helpful for this repository.

Best,
Dennis

Design Questions

Thanks for your amazing work. I learnt a lot from your two papers. I have a few questions:

  1. Why have you only used the cross-entropy loss and not also the SSIM and IoU losses as in BASNet? How are the advantages of those losses, which were outlined with detailed analysis in BASNet, made up for with only CE here?

  2. Why did you set all the deep-supervision weights to 1? It is common to set them to values between 0.2 and 0.8 so the model focuses most on the final output. (A sketch of a weighted variant follows this list.)

  3. There does not seem to be an LR scheduler, which is a default today. May I know why you did not use one?

  4. How much of a difference did taking all the predictions and concatenating them to predict the final map make, versus just taking the top-most prediction? Since a 1x1 conv is used, we are just taking a linear combination of the previous predictions, and it seems apparent that the last predictions will be the most accurate. Did you look at the weights learnt by the 1x1 conv to see what was going on?
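
To illustrate question 2 (this is my own sketch, not the authors' method; the repository's muti_bce_loss_fusion weights every side output equally, and the weight values below are hypothetical):

    import torch.nn as nn

    bce_loss = nn.BCELoss()   # mean reduction, as in the repository's size_average=True

    def weighted_bce_loss_fusion(d0, d1, d2, d3, d4, d5, d6, labels_v,
                                 side_weight=0.4, fuse_weight=1.0):
        loss0 = bce_loss(d0, labels_v)
        side_losses = [bce_loss(d, labels_v) for d in (d1, d2, d3, d4, d5, d6)]
        # emphasise the fused output instead of weighting everything by 1
        loss = fuse_weight * loss0 + side_weight * sum(side_losses)
        return loss0, loss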

Best way to fine-tune the model

Hi everyone!

I'd be very grateful if somebody shared their fine-tuning experience.
How many pictures in the dataset are enough?
Which hyperparameters fit best?
Which layers are better to freeze, and are there any benefits to freezing them? (See the sketch after these questions.)
How long does it take to train such a model?
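
For the freezing question, a minimal sketch of what I mean (the attribute names are my assumption; the encoder stages in model/u2net.py are assumed to be called stage1 ... stage6, and the checkpoint path mirrors the repository's scripts):

    import torch
    from model import U2NET

    net = U2NET(3, 1)
    net.load_state_dict(torch.load("saved_models/u2net/u2net.pth"))

    # freeze the early encoder stages, fine-tune the rest
    for name, param in net.named_parameters():
        if name.startswith(("stage1", "stage2", "stage3", "stage4")):
            param.requires_grad = False

    optimizer = torch.optim.Adam(
        filter(lambda p: p.requires_grad, net.parameters()), lr=1e-4)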

Currently, I'm trying to improve segmentation of flowers (the stem of the flower is often ignored). I collected and labeled 200 images and applied all the augmentation techniques possible, but I haven't found a training setup that improves the model.

I'd appreciate any thoughts on improving it, if you have any :)

Loss looks strange on training

During training, the validation loss bounces around, which looks very strange.
Here is a plot with the training and validation loss curves:
[training/validation loss plot]

About the network output

Hello, thank you for open-sourcing your research project. I ran into a problem: given the binary map produced at prediction time, how do I cut that object out of the original image? Does the network output contour coordinates?
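
Not from the repository, just a sketch of the usual approach: the network outputs a probability map rather than contour coordinates, so the predicted mask can be used directly as an alpha channel to cut the object out (file names below are placeholders):

    from PIL import Image

    image = Image.open("input.jpg").convert("RGB")
    mask = Image.open("predicted_mask.png").convert("L").resize(image.size)

    cutout = image.copy()
    cutout.putalpha(mask)          # background becomes transparent
    cutout.save("cutout.png")

    # If contour coordinates are needed, they can be extracted from the
    # binarized mask, e.g. with cv2.findContours on
    # (numpy.array(mask) > 127).astype(numpy.uint8).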

How to increase model capacity for training on a larger dataset?

First of all, thanks for the amazing work on U-2-Net. I am trying to train the model from scratch on my own dataset of 60k images, which is larger than yours. I would like to know how I can increase the model capacity so it can be trained on such a dataset.

I have considered replacing the standard REBNCONV blocks with residual blocks, as suggested in another issue. What other options could I try? I understand that I need to make the architecture deeper; does this mean I should build RSU-8 or RSU-9 blocks by adding more convolution layers?

prediction confidence score?

Hi Nathan,

Such a great job, with much better results than the other nets I played with (including my experiments training a plain U-Net from scratch). Thank you for making it available.

How would you calculate some kind of "prediction confidence" score?
It could be used during inference to flag predictions which require human review.

Thinking on my feet, I was considering experimenting with these naive approaches (a small sketch of the first one follows the list):

  1. Calculate it based on how far the predicted pixels deviate from 0 and 1. Does a prediction closer to 0.5 for a pixel indicate less confidence in whether the pixel is foreground or background?
  2. Calculate losses on a set of input images with ground truth, then use them to train a network that predicts the loss on an arbitrary image.
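
A minimal sketch of approach 1 (just my assumption of what it would look like, not something from the repository):

    def prediction_confidence(pred):
        # pred: saliency probability map in [0, 1], e.g. the fused d0 output.
        # Returns 1.0 when every pixel is exactly 0 or 1 and 0.0 when
        # everything sits at 0.5.
        return (2.0 * (pred - 0.5).abs()).mean().item()

    # e.g. flag predictions with prediction_confidence(d0) < 0.8 for human review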

U-2-Net model differs from your paper description

https://github.com/NathanUA/U-2-Net/blob/master/model/u2net.py

    d1 = self.side1(hx1d)
    
    d2 = self.side2(hx2d)
    d2 = _upsample_like(d2,d1)

    d3 = self.side3(hx3d)
    d3 = _upsample_like(d3,d1)

    d4 = self.side4(hx4d)
    d4 = _upsample_like(d4,d1)

    d5 = self.side5(hx5d)
    d5 = _upsample_like(d5,d1)

    d6 = self.side6(hx6)
    d6 = _upsample_like(d6,d1)

    d0 = self.outconv(torch.cat((d1,d2,d3,d4,d5,d6),1))

    return F.sigmoid(d0), F.sigmoid(d1), F.sigmoid(d2), F.sigmoid(d3), F.sigmoid(d4), F.sigmoid(d5), F.sigmoid(d6)

This generates the six side-output saliency probability maps from stages En_6, De_5, De_4, De_3, De_2 and De_1 with a 3x3 convolution layer but without a sigmoid function before the fusion.
However, your paper says the six side-output saliency probability maps are generated from stages En_6, De_5, De_4, De_3, De_2 and De_1 by a 3x3 convolution layer and a sigmoid function.

Why the difference?

RuntimeError: expected dtype Half but got dtype Long

I am trying to use this model for binary segmentation.

When I pass the mask as a tensor to muti_bce_loss_fusion, I get this error:

    547 def muti_bce_loss_fusion(d0, d1, d2, d3, d4, d5, d6, labels_v):
    548     print(d0.shape)
--> 549     loss0 = bce_loss(d0, labels_v)
    550     loss1 = bce_loss(d1, labels_v)
    551     loss2 = bce_loss(d2, labels_v)

~/anaconda3/envs/seg/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    556             result = self._slow_forward(*input, **kwargs)
    557         else:
--> 558             result = self.forward(*input, **kwargs)
    559         for hook in self._forward_hooks.values():
    560             hook_result = hook(self, input, result)

~/anaconda3/envs/seg/lib/python3.7/site-packages/torch/nn/modules/loss.py in forward(self, input, target)
    518 
    519     def forward(self, input, target):
--> 520         return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
    521 
    522 

~/anaconda3/envs/seg/lib/python3.7/site-packages/torch/nn/functional.py in binary_cross_entropy(input, target, weight, size_average, reduce, reduction)
   2415 
   2416     return torch._C._nn.binary_cross_entropy(
-> 2417         input, target, weight, reduction_enum)
   2418 
   2419 

RuntimeError: expected dtype Half but got dtype Long

What is the format of the model's output? What is the expected format for the labels?

How can I use this model for binary segmentation?
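
A sketch of what usually resolves this error (based on the message rather than on the repository): the model's seven outputs are sigmoid probability maps, and nn.BCELoss expects the target to be a floating-point tensor of the same shape and dtype, so an integer (Long) mask has to be converted first.

    import torch

    # labels: integer mask tensor with values in {0, 1}, shape (N, H, W)
    labels_v = labels.unsqueeze(1).float()     # (N, 1, H, W), float target for BCELoss
    if torch.cuda.is_available():
        labels_v = labels_v.cuda()
    # with half-precision training the target may additionally need .half()
    # so that it matches the dtype of d0 ... d6
    loss2, loss = muti_bce_loss_fusion(d0, d1, d2, d3, d4, d5, d6, labels_v)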

About inference speed

Thanks for your great job!
From the paper I know that U2-Net runs at 30 FPS with an input size of 320×320×3 on a 1080Ti, and U2-Net+ (4.7 MB) runs at 40 FPS, but on which GPU was that measured? Also a 1080Ti?

Architecture Insight

I find it very interesting how the model is able to pick up tiny gaps in salient objects as well as segment delicate hairs, as in the examples below:
[two example result images]

I am surprised that it can do this with such a low 320x320 input resolution. While the paper motivates the architecture by saying it allows the model to repeatedly get a global view, it doesn't quite explain how this level of detail can be reached at that resolution. I don't quite understand how it is able to detect minor gaps within the object of interest and trace them so well. It provides a finer mask for human hair than even networks trained on humans, which also fail to pick up other minute details. I would appreciate it if the author could share his thoughts and some more insight into the working of the model, because I have never seen a segmentation network able to segment such details. Any comments would be much appreciated.

About RescaleT

Hello Xuebin Qin,
Thank you for sharing; this is really great work. I have a small question I'd like to ask: why do you resize images directly to (320, 320) for training and testing instead of keeping the aspect ratio? Is this related to the training data, or is it that, if the aspect ratio were kept, RandomCrop might damage the completeness of the salient object in the image? Or is there some other reason? I want to train on my own data; my application is 1920x1080 video, and I have some training data at that resolution. How do you think the train and test sizes should best be adjusted? Thank you for your advice!

Epoch size

Hi Nathan,
Your results are fantastic and thank you for sharing the code, but it would be really appreciated if you could kindly answer a few queries I have. I would like to know the following:

  1. What was the maximum number of epochs you trained the model for on the DUTS dataset, and how did you get the tar loss down to 0.01?

How to capture more fine details

First of all, great work! The model is able to capture some fine details such as human hair. However, it does not capture the majority of the finer details, such as little holes and individual strands. I tried adding the IoU loss function, and the model became very confident at predicting the object edges. However, the little holes/gaps in human hair are still not captured, and a lumped area of solid foreground is predicted instead.

Do you have any suggestion as to modifying the model to capture more fine details of the objects?

How to change Input/Output image dimension from 320x320 to 640x640

Dear Nathan,
I hope you are doing well. Your results are really stunning; thank you for sharing the project. It would be deeply appreciated if you could kindly answer the following question for me.

I want to change the model's input image and output prediction size from 320x320 to 640x640. Can you please guide me on how to get this done?

Thanks a lot

Kind regards,
Kamal Kanta Maity
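
For reference, my understanding (an assumption on my part, since the network is fully convolutional) is that the input size is controlled by the data pipeline rather than by the model. A sketch based on the transform chain used in u2net_train.py, keeping the original RescaleT(320)/RandomCrop(288) ratio:

    from torchvision import transforms
    from data_loader import RescaleT, RandomCrop, ToTensorLab, SalObjDataset

    salobj_dataset = SalObjDataset(
        img_name_list=tra_img_name_list,
        lbl_name_list=tra_lbl_name_list,
        transform=transforms.Compose([
            RescaleT(640),        # was RescaleT(320)
            RandomCrop(576),      # was RandomCrop(288)
            ToTensorLab(flag=0)]))
    # u2net_test.py only applies RescaleT, so changing it to RescaleT(640) there
    # changes the inference resolution; GPU memory use grows roughly 4x.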

Loss & accuracy

I'm trying to retrain your model for our specific use case. I'm training with images augmented from a 30k set, and I also added accuracy calculations and validation.

The loss and accuracy seem to stall no matter how I change the learning rate.
What would you recommend? Should I just train longer? Should I try to "freeze" or lower the LR on part of the layers (which layers? all encoders?)? Or is that as far as it can get?
Have you experimented with different LR schedules (cyclic, etc.)?

I ran these with 120k training images (50 epochs, 200 iterations each, batch size 12). Validation: 600 images after each epoch.

Training from scratch: [loss/accuracy curves]

Training from your pre-trained model (173.6 MB), LR=0.001 (as yours): [loss/accuracy curves]

LR reduced to 0.0001 (on the pre-trained model): [loss/accuracy curves]

Higher resolutions

First of all, thank you for the excellent work.

I tried to run inference on some higher-resolution images (around 2000 x 1500) and the generated mask looks blurry at the edges.
Do you think it's because of the resolution of the images used for training?
Do you think that training the network with higher-resolution images would give better results?
Do you know whether a dataset similar to DUTS exists with higher-resolution images?

Thank you!

evaluation code

Hi, thanks for your great work.
Could you please share your evaluation code?

Wrong results with simple test

Hello,

As a test, I trained a model with only 1 image and 10 epochs and then tested it on that same image, but got bad results in the end. Do you have any explanation?

Thank you

CRF post processing

Hi,

First of all I want to say thanks a lot for your work. It's really one of the best models for image segmentation I have come across so far. The output may be slightly behind a few others, but there is no comparison in speed; this model is super fast.

I am trying to train this model on human images and am waiting for the output. However, I'm curious about CRF post-processing.

When using the pre-trained model, the output edges are not that sharp, so I'm thinking post-processing may help with that.

Can you please suggest how to do that?
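
A sketch of the usual dense-CRF refinement (not part of this repository; it assumes the pydensecrf package, and that prob is the model's saliency map resized to the original image size):

    import numpy as np
    import pydensecrf.densecrf as dcrf
    from pydensecrf.utils import unary_from_softmax

    def crf_refine(image, prob, iters=5):
        # image: HxWx3 uint8 RGB array, prob: HxW foreground probabilities in [0, 1]
        h, w = prob.shape
        softmax = np.stack([1.0 - prob, prob]).astype(np.float32)   # (2, H, W): bg, fg
        d = dcrf.DenseCRF2D(w, h, 2)
        d.setUnaryEnergy(unary_from_softmax(softmax))
        d.addPairwiseGaussian(sxy=3, compat=3)
        d.addPairwiseBilateral(sxy=60, srgb=10,
                               rgbim=np.ascontiguousarray(image), compat=5)
        Q = np.array(d.inference(iters)).reshape(2, h, w)
        return Q[1]   # refined foreground probability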

cannot download weights

I cannot download the file from Google Drive.
Could you provide a Baidu download link instead?
Thanks a lot!

Reproducing results

I am trying to reproduce your results but am seeing some uninspiring signs at the start. I start with your model and all the settings as stated in the paper, except that I use 15% of the training data as validation every epoch and my batch size is 8. The validation loss stops decreasing after 50 epochs or so, and a noticeable gap emerges between training and validation. I trained for 40 more epochs but the validation loss did not fall any lower; it is almost twice the training loss.

The model seems to be overfitting to me. A lower batch size than yours should cause more regularisation, so that should not be the issue.

Can you please give me some advice on how to interpret this and whether I should keep going? I know I am not using 100% of the data like you, but 85% should be suboptimal yet similar. Can you share your training curves or anything of the sort?

training data?

I'm interested in retraining the model from scratch. It looks like the code expects a train_data folder, which doesn't exist. The README mentions a bunch of datasets that you've trained on, but it seems like the code expects a different format than the one available, e.g., here.

Can you please confirm that the expected input is JPG images, with masks as PNGs encoded in uint8 where the entire object takes the value 255 and all other pixels are 0?

Inconsistency between paper and code

Hi,

First, thank you so much for the great work! I found a mismatch between your paper and the released code while trying to integrate the code into my own project:

It is in the part "3.2. Architecture of U2-Net" in the paper where you wrote:
"our U2-Net first generates six side output saliency probability maps S(6), S(5), S(4), S(3), S(2), S(1) from stages En 6, De 5, De 4, De 3, De 2 and De 1 by a 3 x 3 convolution layer and a sigmoid function. Then, it upsamples these saliency maps to the input image size and fuses them with a concatenation operation followed by a 1 x 1 convolution layer and a sigmoid function to generate the final saliency probability map S(fuse)."

But in your code, it seems that you concatenate the upscaled side saliency maps before passing them through the sigmoid function:

    d0 = self.outconv(torch.cat((d1,d2,d3,d4,d5,d6),1))

    return F.sigmoid(d0), F.sigmoid(d1), F.sigmoid(d2), F.sigmoid(d3), F.sigmoid(d4), F.sigmoid(d5), F.sigmoid(d6)

Could you please help clarify what is the correct order? Thank you.

Problems related to the training set

First, I appreciate your excellent work! I need a little guidance on the training set.
In BASNet you detailed the composition of the training set; did you use the same training set in this work?

Fine tune the existing Model

Hi @NathanUA,

I want to fine-tune the existing model (u2net). I understand that we can resume training simply as discussed in #33:

if(model_name=='u2net'):
    net = U2NET(3, 1)
elif(model_name=='u2netp'):
    net = U2NETP(3,1)
net.load_state_dict(torch.load(saved_model_dir))

if torch.cuda.is_available():
    net.cuda()

In addition, to resume training from exactly where it stopped, one usually needs to save and load the optimizer state as well (especially for Adam).

So my question is: in addition to the weights, how can I load the optimizer state as well?
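
A minimal sketch of what I have in mind (standard PyTorch checkpointing, not something the repository currently does; the file name and saved fields are my own choice):

    import torch

    # saving: keep model, optimizer and iteration counter together
    torch.save({
        "model": net.state_dict(),
        "optimizer": optimizer.state_dict(),
        "ite_num": ite_num,
    }, "saved_models/u2net/u2net_ckpt.pth")

    # resuming:
    checkpoint = torch.load("saved_models/u2net/u2net_ckpt.pth")
    net.load_state_dict(checkpoint["model"])
    optimizer.load_state_dict(checkpoint["optimizer"])
    ite_num = checkpoint["ite_num"]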

How do I need to change the model to get it to learn depth information?

Hi,
I wanted to know how I should change the model configuration for it to learn depth information too. I have read that adding atrous (dilated) convolutions may help the model learn depth information. How can I do that?

For the RSU block below, how can I add more dilated convolutions? (A possible modification is sketched after the code.)

class RSU6(nn.Module):#UNet06DRES(nn.Module):

    def __init__(self, in_ch=3, mid_ch=12, out_ch=3):
        super(RSU6,self).__init__()

        self.rebnconvin = REBNCONV(in_ch,out_ch,dirate=1)

        self.rebnconv1 = REBNCONV(out_ch,mid_ch,dirate=1)
        self.pool1 = nn.MaxPool2d(2,stride=2,ceil_mode=True)

        self.rebnconv2 = REBNCONV(mid_ch,mid_ch,dirate=1)
        self.pool2 = nn.MaxPool2d(2,stride=2,ceil_mode=True)

        self.rebnconv3 = REBNCONV(mid_ch,mid_ch,dirate=1)
        self.pool3 = nn.MaxPool2d(2,stride=2,ceil_mode=True)

        self.rebnconv4 = REBNCONV(mid_ch,mid_ch,dirate=1)
        self.pool4 = nn.MaxPool2d(2,stride=2,ceil_mode=True)

        self.rebnconv5 = REBNCONV(mid_ch,mid_ch,dirate=1)

        self.rebnconv6 = REBNCONV(mid_ch,mid_ch,dirate=2)

        self.rebnconv5d = REBNCONV(mid_ch*2,mid_ch,dirate=1)
        self.rebnconv4d = REBNCONV(mid_ch*2,mid_ch,dirate=1)
        self.rebnconv3d = REBNCONV(mid_ch*2,mid_ch,dirate=1)
        self.rebnconv2d = REBNCONV(mid_ch*2,mid_ch,dirate=1)
        self.rebnconv1d = REBNCONV(mid_ch*2,out_ch,dirate=1)
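
My own assumption, not an answer from the authors: REBNCONV's dirate argument is the dilation rate of its 3x3 convolution, so one way to add more dilated convolutions is to increase dirate on the innermost layers of the block, for example:

        self.rebnconv5 = REBNCONV(mid_ch,mid_ch,dirate=2)   # was dirate=1
        self.rebnconv6 = REBNCONV(mid_ch,mid_ch,dirate=4)   # was dirate=2

A larger dirate enlarges the receptive field without extra pooling; the decoder layers rebnconv5d ... rebnconv1d could be adjusted symmetrically.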

input size and crop

Thanks a lot for your awesome-performing model!
I'm wondering about scaling and random crop: for training you first scale and then crop to 288x288, so the tensor has size 288. What role does scaling play here, and why do you talk about 320x320 as the input size instead of 288x288?

RescaleT(320), 
RandomCrop(288),

With your latest model update it looks to me as if upscaling supports different aspect ratios. Is only square input supported, or does e.g. 640x480 work as well?

Input Size

Hi Team,
Great work!
I have a doubt: is the input size fixed at 320x320?

RuntimeError - CPU only

I can't seem to get it to work due to:
"RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location='cpu' to map your storages to the CPU."

However, I do not see where I should use torch.load with map_location='cpu'.

full error:
Traceback (most recent call last):
File "u2net_test.py", line 116, in
main()
File "u2net_test.py", line 86, in main
net.load_state_dict(torch.load(model_dir))
File "./U2NET/venv/lib/python3.6/site-packages/torch/serialization.py", line 529, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "./U2NET/venv/lib/python3.6/site-packages/torch/serialization.py", line 702, in _legacy_load
result = unpickler.load()
File "./U2NET/venv/lib/python3.6/site-packages/torch/serialization.py", line 665, in persistent_load
deserialized_objects[root_key] = restore_location(obj, location)
File "./U2NET/venv/lib/python3.6/site-packages/torch/serialization.py", line 156, in default_restore_location
result = fn(storage, location)
File "./U2NET/venv/lib/python3.6/site-packages/torch/serialization.py", line 132, in _cuda_deserialize
device = validate_cuda_device(location)
File "./U2NET/venv/lib/python3.6/site-packages/torch/serialization.py", line 116, in validate_cuda_device
raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
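
A sketch of the usual fix, based on the traceback above, which points at the torch.load call in u2net_test.py's main():

    import torch

    # replace net.load_state_dict(torch.load(model_dir)) with a CPU-mapped load
    net.load_state_dict(torch.load(model_dir, map_location=torch.device('cpu')))
    # if any .cuda() calls in the script are not guarded by
    # torch.cuda.is_available(), they also need a CPU fallback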

Technical Issue On Resuming Training

Hello,

I trained my model on DUTS for 300 iterations and saved the model state together with the Adam state. Then I removed the DUTS images and added new images so that the model could continue training on them, meaning the model would end up trained on both DUTS and the new images.

But unfortunately, it seems the model overrides what it learned from the previous images and starts training on the new images from scratch, so when I load any DUTS image the results are not good at all.

Is this the intended behavior, or am I supposed to mix all the images, old and new, in one folder and continue training?

RuntimeWarning: invalid value encountered in true_divide

Your work is so great, thank you for sharing your code!

I tried to run inference on some images using your model and your code.
Almost everything is good, but with some images I receive this warning:
data_loader.py:197: RuntimeWarning: invalid value encountered in true_divide
image = image/np.max(image)
For example with this image: https://drive.google.com/file/d/1iFTb29lu3cWQzrMMdMB3y03Fcoqd7Gkg/view?usp=sharing

I do not know why this happens, i.e. what in data_loader.py triggers the warning. Could the warning affect the quality of the result?
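
A guess plus a sketch, not an official fix: the warning means np.max(image) evaluates to 0 (or NaN) for that image, so the division in data_loader.py produces invalid values, which can indeed corrupt the result. A guard around the normalization could look like:

    import numpy as np

    max_val = np.max(image)
    if max_val > 0:
        image = image / max_val
    else:
        image = image.astype(np.float64)   # leave an all-zero image unscaled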
