
halbielee commented on May 28, 2024

Thank you for your interest in our work!
I found that there was controversy until the code was released.
Now the code is open :)

Here I leave my opinion.

@TyroneLi @stickyfiner

I agree that using a saliency map for weakly supervised semantic segmentation (WSSS) can feel unfair. However, as @bityangke said, many WSSS works use a saliency map to obtain more accurate pseudo-masks. Although most of them use it only as a background cue for the pseudo-mask, that does not change the fact that they use a saliency map.

Then, you may think of our method as just a way to make good use of the saliency map.

However, in the paper we identify three challenges of WSSS (sparse object coverage, inaccurate object boundaries, and the co-occurrence problem) that existing works could not solve all at once. Even the methods that use the saliency map in the training phase do not solve these problems.

To address them, we focused on the complementary relationship between the localization map (CAM) and the saliency map: the localization map can distinguish different objects but does not separate their boundaries, whereas the saliency map provides rich boundary information but does not reveal object identity. We devised a way to use both sources of information.
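
To make that complementarity concrete, here is a minimal Python sketch. It is not the joint training strategy from the paper; it only illustrates the simpler idea mentioned above of combining the two cues into a pseudo-mask, where the saliency map decides where the foreground is and the localization maps decide which class it belongs to. The function name `pseudo_mask`, the threshold `sal_thresh`, and the toy values are all assumptions made for illustration.

```python
def pseudo_mask(cams, saliency, sal_thresh=0.5):
    """Combine per-class localization maps with a class-agnostic saliency map.

    cams:     dict mapping a class id (int >= 1) to an H x W grid of scores
    saliency: H x W grid of foreground scores in [0, 1]
    Returns an H x W grid of labels, where 0 means background.
    """
    height, width = len(saliency), len(saliency[0])
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # The saliency map decides *where* the foreground is ...
            if saliency[y][x] < sal_thresh:
                continue  # pixel stays background
            # ... and the localization maps decide *which* class it is.
            mask[y][x] = max(cams, key=lambda c: cams[c][y][x])
    return mask


# Toy 2 x 4 example with two hypothetical classes (ids 1 and 2).
cams = {
    1: [[0.9, 0.8, 0.3, 0.1], [0.7, 0.6, 0.2, 0.1]],
    2: [[0.1, 0.2, 0.6, 0.4], [0.1, 0.3, 0.7, 0.5]],
}
saliency = [[1.0, 0.9, 0.8, 0.1], [0.9, 0.2, 0.9, 0.0]]
mask = pseudo_mask(cams, saliency)
print(mask)  # [[1, 1, 2, 0], [1, 0, 2, 0]]
```

Neither input alone gives this result: the saliency grid carries no class ids, and the localization maps give no foreground boundary without a threshold of their own.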

We show with extensive experiments that our method is effective at alleviating the three challenges.
Additionally, the performance of our method is not simply bounded by the quality of the saliency map. We found that our method lets the localization map and the saliency map work in synergy: the noisy and missing information in each one is complemented by the other through our joint training strategy.

So, we think that our method is more than just using the saliency map in the training process.

@bityangke
Thank you for the sound discussion and your thoughtful comments.

Please see our paper for more details!
We also provide supplementary material with more experiments.
Paper link

from eps.

(username missing) commented on May 28, 2024

I also have the same question.

bityangke commented on May 28, 2024

Plenty of papers have used saliency as supervision. You should read these papers carefully to find out why.
I think your words are very impolite.

(username missing) commented on May 28, 2024

@bityangke They only use the saliency map as a background cue, which cannot be involved in supervised training. Please provide some of the papers you mentioned. Thanks.

bityangke commented on May 28, 2024

Why can't saliency be used as supervision? If we have already computed the saliency, why not use it as supervision?
Whether it is used as a background cue or as supervision, it is the same.
Both use the computed saliency map in tuning the model.

TyroneLi commented on May 28, 2024

@bityangke They only use the saliency map as a background cue, which cannot be involved in supervised training. Please provide some of the papers you mentioned. Thanks.

I cannot agree more with you! By the way, using the saliency map as supervision is not elegant, because I cannot see any interesting insight in it.

TyroneLi commented on May 28, 2024

Actually, I still hold my own opinion, as most people do. I agree that most WSSS papers adopt a saliency map to estimate background cues; however, they only use it as final post-processing, in a way similar to CRF. If someone uses a saliency map to supervise training, that will inevitably provide 'explicit full mask labeling' to the network. One could adopt any SOTA saliency method to obtain accurate saliency maps for the VOC benchmark. So could you tell us what the difference is between human-labeled VOC masks and saliency-map masks? Can you list any other works that use a saliency map as training supervision? You could leverage other strategies to alleviate the issues that exist in WSSS (sparse object coverage, inaccurate object boundaries, and the co-occurrence problem), but you cannot introduce saliency maps as supervision. I think this is like mixing the validation set into the training set; your reported results are truly not fair. The first and most important starting point of your work is not convincing.

halbielee commented on May 28, 2024

Dear @TyroneLi

  1. As you said, if someone used a better-performing saliency map when training the network, the network could predict or generate better localization maps (CAM). But the same is true when the saliency map is used as a background cue. In fact, the WSSS papers that use a saliency map as a background cue do not all adopt the same saliency map, and there are considerable performance gaps between the saliency detectors. We took this point into account and conducted our experiments with the saliency detectors used in OAA.

  2. A saliency map can be explicit supervision for the segmentation task, but it is not perfect as ground truth for that task. We call this kind of supervision "weak supervision". The saliency map makes no distinction between classes; it only has foreground and background. In addition, the foreground does not coincide with the target classes in the dataset. Finally, the saliency map is itself noisy, so it is not appropriate to use it directly.

  3. "Joint learning of saliency detection and weakly supervised semantic segmentation" and "Saliency guided self-attention network for weakly and semi-supervised semantic segmentation" use the saliency map as training supervision.

Since the saliency map is stronger supervision than an image-level label, you might feel uncomfortable about using it for a weakly supervised semantic segmentation task that is based on image-level labels. However, we simply found that the saliency map can be used as additional supervision for WSSS and that our method can resolve the three problems simultaneously. I think this is a tiny step toward better research. That's all. Next time, one of us or other researchers might solve the problems without using stronger supervision such as a saliency map. That can be another step.

I will consider your opinions and advice carefully, and I hope to do better research that convinces more researchers.

- Seungho Lee

Thank you for your opinions as well, @bityangke @stickyfiner.
