chengchunhsu / wsis_bbtp
Implementation of NeurIPS 2019 paper "Weakly Supervised Instance Segmentation using the Bounding Box Tightness Prior"
License: MIT License
It has already been a month since the repo was built.
WSIS_BBTP/matlab/EvalVOCInstSeg.m
Line 75 in 03cb87b
The code computes the intersection over union (IoU) between the ground-truth mask and the predicted mask.
There are two severe issues in the above line.
First, both TempProposal and TempGTInst are matrices (2D tensors), so sum(TempProposal & TempGTInst) and sum(TempProposal | TempGTInst) are vectors (1D tensors) representing the column-wise intersection and union of the segmentation images. In MATLAB, applying the / operator to two vectors (say, vector a divided by vector b) performs a least-squares solve: it outputs the scalar r that minimizes |a - b * r|. This r is mistakenly used as the IoU. I see no relation between r and the IoU; I checked several r values during evaluation, and none of them equals the correct IoU value.
Second, when debugging this line, I found that the intersection excludes the ignored pixels but the union does not.
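For reference, the intended computation can be sketched in NumPy (an illustrative sketch only; the function and variable names are hypothetical, not the repo's code): flatten both masks, exclude ignored pixels from both the intersection and the union, and divide the two scalar counts.

```python
import numpy as np

def mask_iou(proposal, gt, ignore=None):
    """IoU between two boolean masks, excluding ignored pixels
    from both the intersection and the union."""
    proposal = proposal.astype(bool).ravel()
    gt = gt.astype(bool).ravel()
    valid = np.ones_like(proposal) if ignore is None else ~ignore.astype(bool).ravel()
    inter = np.sum(proposal & gt & valid)   # scalar count, not a vector
    union = np.sum((proposal | gt) & valid) # same ignore handling as inter
    return inter / union if union > 0 else 0.0

a = np.array([[1, 1], [0, 0]])
b = np.array([[1, 0], [1, 0]])
print(mask_iou(a, b))  # 1 overlapping pixel / 3 union pixels
```

Reducing to scalar counts before the division avoids MATLAB-style vector division semantics entirely.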
Hi @chengchunhsu, thanks for your implementation. When I test the model, I found that the mask generated by the model is strange; there are some bad points in the mask, like this:
And I found that, in Sec 3.3, you write that
but in your implementation here, it seems that you multiply each pairwise term by its score value. Maybe this difference causes the mask problem? Please check it.
Thanks, looking forward to your reply
Hello, thank you for sharing your nice project.
In Sec 3.4, you write about how to get the instance masks, but I do not know where you apply the DenseCRF...
For example,
we first generate the score map for the detected box using the trained segmentation branch.
Then, the predicted score map of the box is pasted to a map Sˆ of the same size as the input image according to the box’s location.
This corresponds to the first score map and the paste part.
And the values predicted by the model are returned in this part.
All the above codes are run in this inference part.
Finally, the encoded masks are output in the encode part.
However, I do not know the DenseCRF part...
For setting up DenseCRF, we employ the map Sˆ as the unary term and use the color and pixel location differences with the bilateral kernel to construct the pairwise term.
After optimization using mean field approximation, DenseCRF produces the final instance mask
Please teach us, thank you.
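For what it's worth, the pasting step quoted above from Sec 3.4 can be sketched roughly like this (a hypothetical NumPy illustration, not the repo's actual code; it assumes the box score map already matches the box size):

```python
import numpy as np

def paste_score_map(box_scores, box, image_shape):
    """Paste a per-box score map into an image-sized map S_hat
    at the box's location (box = (x1, y1, x2, y2), exclusive end)."""
    s_hat = np.zeros(image_shape, dtype=np.float32)
    x1, y1, x2, y2 = box
    assert box_scores.shape == (y2 - y1, x2 - x1)
    s_hat[y1:y2, x1:x2] = box_scores  # outside the box, S_hat stays zero
    return s_hat

scores = np.full((2, 3), 0.9, dtype=np.float32)  # toy 2x3 box score map
s_hat = paste_score_map(scores, box=(1, 1, 4, 3), image_shape=(5, 5))
```

The resulting S_hat is what the paper then feeds to DenseCRF as the unary term.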
Hi, I have git cloned the repo and started training, but it ended immediately.
2020-04-27 11:30:23,193 maskrcnn_benchmark.utils.model_serialization INFO: module.backbone.body.stem.bn1.weight loaded from bn1.weight of shape (64,)
2020-04-27 11:30:23,193 maskrcnn_benchmark.utils.model_serialization INFO: module.backbone.body.stem.conv1.weight loaded from conv1.weight of shape (64, 3, 7, 7)
2020-04-27 11:30:23,288 maskrcnn_benchmark.data.build WARNING: When using more than one image per GPU you may encounter an out-of-memory (OOM) error if your GPU does not have sufficient memory. If this happens, you can reduce SOLVER.IMS_PER_BATCH (for training) or TEST.IMS_PER_BATCH (for inference). For training, you must also adjust the learning rate and schedule length according to the linear scaling rule. See for example: https://github.com/facebookresearch/Detectron/blob/master/configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml#L14
loading annotations into memory...
loading annotations into memory...
loading annotations into memory...
loading annotations into memory...
Done (t=0.22s)
creating index...
Done (t=0.22s)
creating index...
Done (t=0.22s)
creating index...
index created!
index created!
index created!
Done (t=0.19s)
creating index...
index created!
2020-04-27 11:30:23,782 maskrcnn_benchmark.trainer INFO: Start training
Nothing after that. No errors, no training. Are there any solutions?
Hi @chengchunhsu! Is there a version of the pipeline that can be run on Colab? And is there any example code, from training through to inference, showing how to retrain the model on custom data?
Hello, thank you for sharing your nice project.
In Sec 3.3, you write that you use the maximum score value to estimate the probability of the bag being positive, as follows:
where P(b) = max_{p ∈ b} Ŝ(p) is the estimated probability of the bag being positive, and Ŝ(p) is the score value of the map Ŝ at position p.
I have checked loss.py, but I can't find the code that picks the maximum value.
please teach us, thank you!
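For context, the max-pooling MIL objective described above can be sketched as follows (a hypothetical NumPy illustration of the idea, not the repo's loss.py): each bag's probability of being positive is the maximum score over its pixels.

```python
import numpy as np

def mil_bag_loss(score_map, pos_bags, neg_bags, eps=1e-6):
    """Sketch of a max-pooling MIL loss: P(b) = max over pixels in bag b.
    Each bag is a list of (y, x) pixel positions."""
    loss = 0.0
    for bag in pos_bags:  # positive bags should contain at least one high score
        p = max(score_map[y, x] for y, x in bag)
        loss -= np.log(p + eps)
    for bag in neg_bags:  # negative bags should contain only low scores
        p = max(score_map[y, x] for y, x in bag)
        loss -= np.log(1.0 - p + eps)
    return loss / max(len(pos_bags) + len(neg_bags), 1)

score_map = np.array([[0.9, 0.1], [0.2, 0.05]])
loss = mil_bag_loss(score_map, pos_bags=[[(0, 0), (0, 1)]], neg_bags=[[(1, 0), (1, 1)]])
```

In a framework implementation, the max over each bag is typically a max-pooling op so gradients flow only through the maximal pixel.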
As far as I know, the regular voc_cocostyle file only contains bbox annotations [x1, y1, x2, y2], and its segmentation annotations are box-shaped polygons in [x1, y1, x2, y1, x2, y2, x1, y2] style, which cannot be used to precisely calculate your segmentation mAP as ground truth.
So do you use 'voc_2012_val_cocostyle' with precise segmentation annotations?
And would you like to share your file?
Thanks!
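To illustrate the box-shaped polygon format mentioned above (a generic COCO-style sketch; the field values are made up): such a "segmentation" carries no mask information beyond the bbox itself.

```python
def bbox_to_rect_polygon(x1, y1, x2, y2):
    """Turn a box into the degenerate 4-corner COCO segmentation polygon."""
    return [x1, y1, x2, y1, x2, y2, x1, y2]

# A COCO-style annotation whose "segmentation" is just the box outline.
ann = {
    "bbox": [10, 20, 30, 40],  # COCO bbox format is [x, y, width, height]
    "segmentation": [bbox_to_rect_polygon(10, 20, 10 + 30, 20 + 40)],
}
print(ann["segmentation"])  # [[10, 20, 40, 20, 40, 60, 10, 60]]
```

Evaluating segmentation mAP against such rectangles would just reproduce box IoU, hence the question about a version with real mask polygons.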
Hello,
thank you for releasing your great work!
I was wondering which MS-COCO dataset to download (the Detection 2016, 2017, or 2018 dataset?) and how to prepare the dataset so that it contains the 99,310 selected images. Is there any code that automatically downloads and prepares the MS-COCO detection dataset?
Furthermore, when using MS-COCO as additional training data, is the network first pretrained on MS-COCO and then finetuned on PASCAL VOC (as in SDI [17])? Or is the network trained just once on a combined dataset (COCO + PASCAL VOC)?
Thank you in advance :)
@chengchunhsu hi, can this code be trained without distributed mode?
Hi @chengchunhsu, thanks for your implementation. Actually, I have a concern about computing the MIL loss for the negative samples. As stated in the original paper, the number of negative samples equals the number of positive samples. However, in the code implementation, there is no such balancing mechanism.
Also, I am concerned about the way the negative samples are drawn. It seems they are sampled from the negative proposals that have a low IoU with the ground-truth bbox; don't some of those proposals still have a high overlap with the pixels inside the bbox (the positive samples)?
Thanks,
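The balancing mechanism the paper describes could look something like the following (a hypothetical NumPy sketch of equal-count subsampling, not the repo's sampler):

```python
import numpy as np

def balance_negatives(pos_indices, neg_indices, rng=None):
    """Subsample negatives so their count matches the positives."""
    rng = rng or np.random.default_rng(0)
    n = min(len(pos_indices), len(neg_indices))
    sampled = rng.choice(neg_indices, size=n, replace=False)  # no duplicates
    return np.asarray(pos_indices), sampled

pos, neg = balance_negatives([1, 4, 7], [0, 2, 3, 5, 6, 8, 9])
```

Without such a step, a surplus of negative bags would dominate the averaged MIL loss.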
I tried to change the path to the "voc_2012_aug_train_cocostyle" dataset by adding the new data path to WSIS_BBTP/maskrcnn_benchmark/configs/paths_catalog.py. But when I do this and then run bash train_voc_aug.sh, I end up getting the following error.
Any idea how to solve this? Please help!
I used this config [e2e_mask_rcnn_R_101_FPN_4x_voc_coco_aug_cocostyle] to retrain, and it seems the model is not trained correctly with the two new mask losses. After training, the resulting masks are just rectangles. I'm wondering if the implementation is the same as the one in your paper?