chengchunhsu / wsis_bbtp

Implementation of NeurIPS 2019 paper "Weakly Supervised Instance Segmentation using the Bounding Box Tightness Prior"

License: MIT License

Dockerfile 0.10% Python 13.03% C++ 83.20% Cuda 0.90% Shell 0.05% MATLAB 1.14% CMake 0.22% C 1.37%


wsis_bbtp's Issues

Issues with IoU Computation in the Eval Code

Overlap(j,i) = sum(TempProposal & TempGTInst) / sum(TempProposal | TempGTInst);

The code computes the intersection over union (IoU) between the ground-truth mask and the predicted mask.

There are two severe issues in the above line.

  1. Both TempProposal and TempGTInst are matrices (2D tensors), so sum(TempProposal & TempGTInst) and sum(TempProposal | TempGTInst) are vectors (1D tensors) holding the intersection and union counts of each column of the segmentation images. In MATLAB, applying the / operator to two vectors (say, vector a divided by vector b) solves a least-squares problem: it returns the scalar r that minimizes |a - b * r|. This r is mistakenly used as the IoU. I see no relation between r and the IoU; I checked several r values during evaluation, and none of them equals the correct IoU.

  2. When debugging this line, I found that the intersection excludes the ignored pixels but the union does not (see the corrected sketch below).
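For reference, a corrected computation might look like the NumPy sketch below (the eval code itself is MATLAB, where the one-line fix is to flatten both masks first, e.g. sum(TempProposal(:) & TempGTInst(:)) / sum(TempProposal(:) | TempGTInst(:)), after removing ignored pixels from both):

import numpy as np

# Hedged NumPy sketch of the corrected IoU: flatten both masks so intersection
# and union become scalar counts, and drop ignored pixels from both terms,
# not only from the intersection.
def mask_iou(proposal, gt, ignore=None):
    proposal = np.asarray(proposal, dtype=bool).ravel()
    gt = np.asarray(gt, dtype=bool).ravel()
    if ignore is not None:
        valid = ~np.asarray(ignore, dtype=bool).ravel()
        proposal, gt = proposal[valid], gt[valid]
    union = np.count_nonzero(proposal | gt)
    return np.count_nonzero(proposal & gt) / union if union > 0 else 0.0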

questions with the implementation of the pairwise term

Hi @chengchunhsu, thanks for your implementation. When I test the model, the masks it generates look strange; there are stray points in the mask, like this:
[screenshot: predicted mask with stray points]

I also found that in Sec. 3.3 you write that
[screenshot: pairwise term definition from Sec. 3.3]
but in your implementation here, it seems that you multiply each pairwise term by its score value. Maybe this difference causes the mask problem? Please check it.

Thanks, looking forward to your reply
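To make the reported difference concrete, here is a minimal PyTorch sketch of the two variants (both forms are illustrative assumptions, not the repo's exact loss):

import torch

# S is a score map with values in [0, 1]; only horizontal neighbor pairs are
# shown for brevity.
def pairwise_unweighted(S: torch.Tensor) -> torch.Tensor:
    # plain smoothness term: penalize disagreement between neighboring scores
    return (S[:, :-1] - S[:, 1:]).abs().mean()

def pairwise_score_weighted(S: torch.Tensor) -> torch.Tensor:
    # same term, but each pair is additionally multiplied by a score value
    # (here the max of the two neighbors, an assumption), which is the
    # difference the issue reports seeing in the implementation
    w = torch.maximum(S[:, :-1], S[:, 1:])
    return (w * (S[:, :-1] - S[:, 1:]).abs()).mean()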

Where do you apply DenseCRF to the score map?

Hello, thank you for sharing your nice project.

In Sec. 3.4, you write about how to get the instance masks, but I do not know where you apply the DenseCRF...

For example,

we first generate the score map for the detected box using the trained segmentation branch.
Then, the predicted score map of the box is pasted to a map Ŝ of the same size as the input image according to the box’s location.

corresponds to the score map generation and pasting parts of the code.
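As a point of reference, that paste step can be sketched like this (shapes and names are assumptions, not the repo's code):

import torch
import torch.nn.functional as F

# Hedged sketch: resize the per-box score map to the box size and write it
# into a full-image map S_hat at the box location.
def paste_score_map(score_map, box, image_size):
    H, W = image_size
    x1, y1, x2, y2 = [int(round(v)) for v in box]
    S_hat = torch.zeros(H, W)
    resized = F.interpolate(score_map[None, None], size=(y2 - y1, x2 - x1),
                            mode="bilinear", align_corners=False)[0, 0]
    S_hat[y1:y2, x1:x2] = resized
    return S_hat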

And the values predicted by the model are returned in this part.

All of the above code runs in this inference part.

Finally, the encoded masks are output in the encode part.

However, I do not know the DenseCRF part...

For setting up DenseCRF, we employ the map Ŝ as the unary term and use the color and pixel location differences with the bilateral kernel to construct the pairwise term.
After optimization using mean field approximation, DenseCRF produces the final instance mask.

Please teach us, thank you.
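For reference, the setup quoted above is commonly implemented with pydensecrf along the following lines (a hedged sketch under that tooling assumption; the paper does not say which CRF library was used):

import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

# S_hat: pasted score map in [0, 1], shape (H, W); image: uint8 RGB, (H, W, 3).
def densecrf_refine(S_hat, image, n_iters=5):
    H, W = S_hat.shape
    probs = np.stack([1.0 - S_hat, S_hat]).astype(np.float32)  # background/foreground
    d = dcrf.DenseCRF2D(W, H, 2)
    d.setUnaryEnergy(unary_from_softmax(probs))            # unary term from S_hat
    d.addPairwiseGaussian(sxy=3, compat=3)                 # location-only smoothing
    d.addPairwiseBilateral(sxy=80, srgb=13,                # color + location (bilateral)
                           rgbim=np.ascontiguousarray(image), compat=10)
    Q = d.inference(n_iters)                               # mean-field approximation
    return np.argmax(Q, axis=0).reshape(H, W)              # final instance mask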

Training starts and ends immediately

Hi, I git cloned the repo and started training, but it ended immediately.

2020-04-27 11:30:23,193 maskrcnn_benchmark.utils.model_serialization INFO: module.backbone.body.stem.bn1.weight loaded from bn1.weight of shape (64,)
2020-04-27 11:30:23,193 maskrcnn_benchmark.utils.model_serialization INFO: module.backbone.body.stem.conv1.weight loaded from conv1.weight of shape (64, 3, 7, 7)
2020-04-27 11:30:23,288 maskrcnn_benchmark.data.build WARNING: When using more than one image per GPU you may encounter an out-of-memory (OOM) error if your GPU does not have sufficient memory. If this happens, you can reduce SOLVER.IMS_PER_BATCH (for training) or TEST.IMS_PER_BATCH (for inference). For training, you must also adjust the learning rate and schedule length according to the linear scaling rule. See for example: https://github.com/facebookresearch/Detectron/blob/master/configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml#L14
loading annotations into memory...
loading annotations into memory...
loading annotations into memory...
loading annotations into memory...
Done (t=0.22s)
creating index...
Done (t=0.22s)
creating index...
Done (t=0.22s)
creating index...
index created!
index created!
index created!
Done (t=0.19s)
creating index...
index created!
2020-04-27 11:30:23,782 maskrcnn_benchmark.trainer INFO: Start training

Nothing else happens: no errors, no training. Are there any solutions?
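One hedged guess worth checking: maskrcnn_benchmark resumes training from the iteration stored in the most recent checkpoint found in OUTPUT_DIR, and if that iteration already reaches SOLVER.MAX_ITER, the training loop exits immediately without any error. A quick diagnostic (the checkpoint path is a placeholder):

import torch

# Inspect the iteration saved in a possibly stale checkpoint.
ckpt = torch.load("OUTPUT_DIR/model_final.pth", map_location="cpu")  # placeholder path
print(ckpt.get("iteration"))  # if >= SOLVER.MAX_ITER, clear old checkpoints or raise MAX_ITER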

Colab Interface?

Hi, @chengchunhsu! Is there a version of the pipeline that can be run on Colab? And is there any example code covering the whole process, from training to inference, that shows how to retrain the model on custom data?

Implementation issue

Hello, thank you for sharing your nice project.
In Sec. 3.3, you write that you use the maximum score value to estimate the probability of the bag being positive, as follows:

where P(b) = max_{p ∈ b} Ŝ(p) is the estimated probability of the bag being positive, and Ŝ(p) is the score value of the map Ŝ at position p.

I have checked loss.py, but I can't find the code that picks out the maximum value.
Please teach us, thank you!
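For context, a minimal sketch of that positive-bag term as Sec. 3.3 states it (purely illustrative; not a claim about where it lives in loss.py):

import torch

# A bag is a set of score values from the map S_hat; the bag probability is
# the maximum score in the bag, P(b) = max_{p in b} S_hat(p).
def mil_positive_loss(bag_scores: torch.Tensor) -> torch.Tensor:
    # bag_scores: (num_bags, bag_size), values in [0, 1]
    p_bag = bag_scores.max(dim=1).values
    return -torch.log(p_bag.clamp(min=1e-6)).mean()  # positive bags should score near 1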

Can you share your 'voc_2012_val_cocostyle' file?

As far as I know, a regular voc_cocostyle file only contains bbox annotations [x1, y1, x2, y2], and its segmentation annotations are just box-corner polygons in [x1,y1, x2,y1, x2,y2, x1,y2] style, which cannot serve as ground truth for precisely computing your segmentation mAP.
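For illustration, such a box-derived polygon looks like this (values hypothetical); it traces only the four box corners and carries no real mask information:

# Hypothetical box-only "segmentation" entry in COCO style.
x1, y1, x2, y2 = 10.0, 20.0, 50.0, 80.0
segmentation = [[x1, y1, x2, y1, x2, y2, x1, y2]]  # just the box corners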

So do you use a 'voc_2012_val_cocostyle' file with precise segmentation annotations?

And would you like to share your file?

Thanks!

How to download and prepare the MS-COCO dataset?

Hello,

thank you for releasing your great work!

  1. I was wondering which MS-COCO dataset to download (the 2016, 2017, or 2018 detection dataset?) and how to prepare it so that it contains the 99,310 selected images. Is there any code that automatically downloads and prepares the MS-COCO detection dataset? (See the download sketch after this list.)

  2. Furthermore, when using MS-COCO as additional training data, is the network first pretrained on MS-COCO and then finetuned on PASCAL VOC (as in SDI [17])? Or is the network trained just once on a combined dataset (COCO + PASCAL VOC)?

Thank you in advance :)
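For anyone else landing here, a hedged sketch for fetching the commonly used 2017 detection release from the official cocodataset.org mirrors (which release the paper actually used is exactly the open question above, and the 99,310-image subset still has to be selected separately, e.g. by filtering the annotations):

import urllib.request

# Fetch MS-COCO 2017 images and annotations from the official mirrors.
for name in ["train2017.zip", "val2017.zip"]:
    urllib.request.urlretrieve(f"http://images.cocodataset.org/zips/{name}", name)
urllib.request.urlretrieve(
    "http://images.cocodataset.org/annotations/annotations_trainval2017.zip",
    "annotations_trainval2017.zip",
)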

questions with respect to the negative samples.

Hi @chengchunhsu, thanks for your implementation. Actually, I have a concern about how the MIL loss is computed for the negative samples. As stated in the original paper, the number of negative samples should equal the number of positive samples. However, in the code implementation there is no such balancing mechanism.

Also, I am concerned about the way the negative samples are drawn. It seems they are sampled from the negative proposals that have a low IoU with the ground-truth bbox; don't some of those proposals still have a high overlap with the pixels inside the bbox (the positive samples)?

Thanks,
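To make the first concern concrete, a minimal sketch of the 1:1 balance the paper describes (an assumed form, not the repo's code):

import torch

# Subsample the negative bags so their count matches the positive bags.
def balance_negatives(neg_scores: torch.Tensor, num_pos: int) -> torch.Tensor:
    idx = torch.randperm(neg_scores.shape[0])[:num_pos]
    return neg_scores[idx]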

How to change the dataset paths

I tried to change the path to the "voc_2012_aug_train_cocostyle" dataset by adding the new data path to WSIS_BBTP/maskrcnn_benchmark/configs/paths_catalog.py. But when I do this and then run bash train_voc_aug.sh, I end up getting the following error.

[screenshot: error traceback]

Any idea how to solve this? Please help!
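For reference, a hedged sketch of what a COCO-style catalog entry usually looks like in maskrcnn_benchmark-derived code (key names follow upstream DatasetCatalog; the paths are placeholders and are typically joined with DatasetCatalog.DATA_DIR, so double-prefixing the data root is a common cause of such errors):

# Hypothetical DatasetCatalog entry; align the keys with the existing entries
# in paths_catalog.py and make sure DATA_DIR + these paths point at real files.
DATASETS = {
    "voc_2012_aug_train_cocostyle": {
        "img_dir": "voc/VOC2012/JPEGImages",                              # placeholder
        "ann_file": "voc/annotations/voc_2012_aug_train_cocostyle.json",  # placeholder
    },
}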

Implementation issue

I used the config [e2e_mask_rcnn_R_101_FPN_4x_voc_coco_aug_cocostyle] to retrain, and it seems the model is not trained correctly with the two new mask losses: after training, the predicted masks are just rectangles. I'm wondering whether the implementation is the same as the one in your paper?
[screenshots: rectangular predicted masks]
