
Non-binary masks for CAGS dataset (npfl114 issue, closed)

t-ded commented on July 19, 2024
Non-binary masks for CAGS dataset

from npfl114.

Comments (3)

foxik commented on July 19, 2024

Hi,

yes, the mask values are in the 0-255 range, but not necessarily just 0 and 255 -- on the boundaries, there are various shades of gray.

For evaluation, we do binarize the mask (it is required by the metric we use); however, for training the masks do not necessarily need to be binarized -- BinaryCrossentropy can handle any gold distribution. However, they need to be in the 0-1 range.
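A minimal sketch of the point above, in plain Python rather than TensorFlow so it stays self-contained. The helper names and the sample pixel values are hypothetical; only the 0-255 byte range and the 0-1 requirement come from the discussion. It scales a byte mask to 0-1 floats and evaluates the binary cross-entropy on the resulting soft targets.

```python
import math

def to_unit_floats(mask_bytes):
    """Scale byte mask values from the 0-255 range to floats in the 0-1 range."""
    return [v / 255.0 for v in mask_bytes]

def pixel_bce(gold, pred, eps=1e-7):
    """Binary cross-entropy of one pixel; gold is meant to lie in [0, 1]."""
    pred = min(max(pred, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(gold * math.log(pred) + (1.0 - gold) * math.log(1.0 - pred))

def mean_bce(gold, pred):
    """Mean binary cross-entropy over all pixels."""
    return sum(pixel_bce(t, p) for t, p in zip(gold, pred)) / len(gold)

# Hypothetical pixels: background, two gray boundary shades, foreground.
raw_mask = bytes([0, 64, 200, 255])
gold = to_unit_floats(raw_mask)
pred = [0.1, 0.3, 0.7, 0.9]

loss = mean_bce(gold, pred)               # well-defined for soft targets
unscaled_loss = pixel_bce(200, 0.99)      # raw byte as target: loss goes negative
```

With an unscaled target such as 200, the loss is no longer bounded below, which is why the range matters while binarization does not.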

I do not think the masks should be binarized in the dataset -- these are actually the gold masks, so distributing them "as is" seems to be the best option. Previously (=in previous years), I converted the images and masks to floats in the 0-1 range automatically during loading of the dataset, so there were fewer problems; however, it requires four times as much memory as keeping the images as bytes, so this year I decided not to do it automatically (to teach people to work with a realistic data format) -- that way you can do it only for the images being prepared for training (so the shuffle buffer, for example, can store byte images).
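The factor of four can be checked with a small sketch using the standard `array` module, assuming the floats are 32-bit; the 224x224 single-channel shape is a hypothetical example, not something stated above.

```python
from array import array

# Hypothetical 224x224 single-channel mask, all zeros for illustration.
num_pixels = 224 * 224
mask_bytes = array("B", bytes(num_pixels))                   # unsigned 8-bit values
mask_floats = array("f", (v / 255.0 for v in mask_bytes))    # 32-bit floats in 0-1

bytes_size = len(mask_bytes) * mask_bytes.itemsize
floats_size = len(mask_floats) * mask_floats.itemsize
ratio = floats_size // bytes_size
```

In an input pipeline this suggests keeping the byte representation for anything that is buffered, and converting to floats only at the last step before the model.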

The MaskIoUMetric works as follows:

  • if given floats, it expects them to be in 0-1 range;
  • if given integers, it expects that the full non-negative range of the integer type is used to represent the values. So if tf.uint8 is used, the values are expected to be in the 0-255 range; if tf.int8, then in the 0-127 range; if tf.uint16, then in the 0-65535 range.

This guarantees that both the gold uint8 masks and 0-1 float masks work. This conversion is performed by the tf.image.convert_image_dtype function.
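The two rules above can be sketched in plain Python as follows. This is only an illustration of the described scaling, not the actual MaskIoUMetric or tf.image.convert_image_dtype code; `INT_MAX` and `to_unit_range` are hypothetical names.

```python
# Largest non-negative value of each integer type mentioned above.
INT_MAX = {"uint8": 255, "int8": 127, "uint16": 65535}

def to_unit_range(values, dtype):
    """Map mask values to floats in the 0-1 range.

    Floats are assumed to be in 0-1 already and are passed through;
    integers are divided by the maximum value of their type.
    """
    if dtype.startswith("float"):
        return [float(v) for v in values]
    scale = INT_MAX[dtype]
    return [v / scale for v in values]

uint8_mask = to_unit_range([0, 255], "uint8")
int8_mask = to_unit_range([0, 127], "int8")
uint16_mask = to_unit_range([0, 65535], "uint16")
float_mask = to_unit_range([0.0, 0.25, 1.0], "float32")
```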

But if you binarized the mask to 0/1 with an integral type, then the IoU was not computed correctly, because a uint8 value of 1 is interpreted as 1/255 -- either use floats, or use 0/255 for bytes. (But because you need the 0-1 range for training, the reasonable approach is to use 0-1 floats.)
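To illustrate the pitfall, here is a hedged sketch of an IoU computation that scales by the integer range before binarizing. The 0.5 threshold, the function names, and the convention for an empty union are assumptions made for the example; the real metric may differ in those details.

```python
def to_binary(values, max_value, threshold=0.5):
    """Scale values to 0-1 by max_value, then threshold (0.5 is an assumption)."""
    return [v / max_value >= threshold for v in values]

def iou(gold, pred):
    """Intersection over union of two boolean masks."""
    intersection = sum(g and p for g, p in zip(gold, pred))
    union = sum(g or p for g, p in zip(gold, pred))
    return intersection / union if union else 1.0  # empty-union convention assumed

gold = to_binary([0, 255, 255, 0], max_value=255)             # gold uint8 mask

pred_01_bytes = to_binary([0, 1, 1, 0], max_value=255)        # 0/1 stored as uint8
pred_0255_bytes = to_binary([0, 255, 255, 0], max_value=255)  # 0/255 stored as uint8
pred_floats = to_binary([0.0, 1.0, 1.0, 0.0], max_value=1.0)  # 0-1 floats

iou_01_bytes = iou(gold, pred_01_bytes)      # 1 becomes 1/255, so all background
iou_0255_bytes = iou(gold, pred_0255_bytes)
iou_floats = iou(gold, pred_floats)
```

The same prediction scores very differently depending only on how it is encoded, which matches the behaviour described above.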


I still think the current setting is the best one I know of (when requiring the images to be stored in the dataset as bytes), but if you have any idea what to improve, I will be happy to hear it. (Note that the problem is actually not the mask binarization but the range -- so just converting the masks in the dataset to be either 0 or 255 would not help you with the IoU problem.)


t-ded commented on July 19, 2024

Thank you for your explanations of all the issues; they helped clarify many things. I got confused by the following line in the tf.keras.losses.BinaryCrossentropy documentation:

image


foxik commented on July 19, 2024

Well, it is true that formally the BinaryCrossentropy operates on the Bernoulli distribution, where the true value can be either 0 or 1, so even if I originally wanted to say that the TF documentation is not exact, the documentation is probably fine.

However, the BinaryCrossentropy naturally generalizes (without any change) to the case where the true value is in the [0, 1] range. The interpretation can be that if the true value is 0.8, then you are seeing 0.8 of an example with true value 1 and 0.2 of an example with true value 0. The gradient formula (prediction - gold) works fine too; just the loss will not have a minimum of zero (because even if we minimize the KL divergence to 0, the cross-entropy still contains the entropy of the gold distribution, which is nonzero when the true value is neither 0 nor 1). Note that when label smoothing is used, the true values are also equal to neither 0 nor 1.
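Both claims can be checked numerically with a short plain-Python sketch. The function names are hypothetical, and the gradient is taken with respect to the logit of a sigmoid output, which is the setting in which the (prediction - gold) formula applies.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce_from_logit(gold, z):
    """Binary cross-entropy of a sigmoid prediction against a soft target."""
    p = sigmoid(z)
    return -(gold * math.log(p) + (1.0 - gold) * math.log(1.0 - p))

def entropy(gold):
    """Entropy of a Bernoulli distribution with parameter gold, 0 < gold < 1."""
    return -(gold * math.log(gold) + (1.0 - gold) * math.log(1.0 - gold))

gold = 0.8

# At the optimum the prediction equals the target, and the loss is the entropy.
z_opt = math.log(gold / (1.0 - gold))
min_loss = bce_from_logit(gold, z_opt)

# Away from the optimum, compare a central-difference gradient with p - gold.
z, h = 0.3, 1e-6
numeric_grad = (bce_from_logit(gold, z + h) - bce_from_logit(gold, z - h)) / (2 * h)
analytic_grad = sigmoid(z) - gold
```

Here `min_loss` matches `entropy(gold)` and is strictly positive, while the numeric and analytic gradients agree up to the finite-difference error.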

