sigmoid-classifier's Introduction

Sigmoid Classifier

The sigmoid classifier is a classifier that uses sigmoid as the activation function of its output layer.

A typical classification model has a softmax activation in the last layer and is trained with the categorical cross-entropy (CCE) loss function.

This combination works very well and we use it a lot.

So why use the sigmoid activation when there is a very good combination of softmax + CCE?

Unknown class

Most classification problems do not have an unknown class.

An unknown class is a class that does not correspond to any of the classes you want to classify.

It is also called no_obj or not_obj.

We found that classifiers using sigmoid showed better performance in classification problems with unknown classes.

The training label for the sigmoid classifier does not contain a separate class for unknown samples.

Instead, an unknown sample is trained with a label of zero for every class.

Let me give you an example.

The label for the softmax classifier is a one-hot vector, which has 1 at the index of the sample's class and 0 everywhere else.

class_2 (of 3 classes) = [0, 0, 1]

If you add an unknown class to this setup, it needs an extra output, and its label is as follows:

unknown (of 3 classes) = [0, 0, 0, 1]

On the other hand, the labels for the sigmoid classifier are as follows:

class_2 (of 3 classes) = [0, 0, 1]

unknown (of 3 classes) = [0, 0, 0]

Softmax cannot be used here because softmax outputs always sum to 1, so a label that contains no 1 can never be matched.

That is why softmax is replaced by sigmoid.
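The two labelling schemes above can be sketched in plain Python. The helper names (`softmax_label`, `sigmoid_label`, `cce`, `bce`) and the sample predictions are assumptions made for illustration and are not taken from this repository. The sketch also shows one way to see why an all-zero label needs a per-class loss: CCE only looks at positions where the label is 1, so it gives zero loss for any prediction on an all-zero label, while BCE still penalizes confident outputs.

```python
import math

def softmax_label(class_index, num_classes, unknown=False):
    """One-hot label with an extra last slot reserved for the unknown class."""
    label = [0] * (num_classes + 1)
    label[num_classes if unknown else class_index] = 1
    return label

def sigmoid_label(class_index, num_classes, unknown=False):
    """Label without an unknown slot: an unknown sample is all zeros."""
    label = [0] * num_classes
    if not unknown:
        label[class_index] = 1
    return label

def cce(probs, label, eps=1e-7):
    """Categorical cross-entropy: only positions where the label is 1 contribute."""
    return -sum(t * math.log(max(p, eps)) for p, t in zip(probs, label))

def bce(probs, label, eps=1e-7):
    """Binary cross-entropy averaged over classes: every position contributes."""
    total = 0.0
    for p, t in zip(probs, label):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(label)

unknown = sigmoid_label(None, 3, unknown=True)  # [0, 0, 0]
confident = [0.05, 0.05, 0.90]  # hypothetical output: wrongly sure of class_2
all_low = [0.05, 0.05, 0.05]    # hypothetical output: low for every class

# CCE cannot tell the two predictions apart on an all-zero label ...
cce_confident, cce_low = cce(confident, unknown), cce(all_low, unknown)
# ... while BCE penalizes the confident one more.
bce_confident, bce_low = bce(confident, unknown), bce(all_low, unknown)
```

With an all-zero label, both CCE values come out as zero, so nothing would push the outputs down. BCE treats every class as its own yes/no question and gives the confident prediction the larger loss.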

Perspective of probability

The output of a classifier using softmax is often interpreted as the probability that it is a corresponding class.

But I always doubted it. Does this really mean anything as a probability?

Suppose you have a softmax classification model that classifies five classes.

Because of the final softmax layer, the output values of this model always sum to 1.


Because of the exponential, larger raw outputs receive disproportionately more weight and smaller ones are pushed toward zero.
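A small sketch of this effect, assuming five hypothetical raw outputs (logits) of 1 to 5. It compares softmax with a plain normalization that divides each score by the sum. The `linear_share` helper is only a baseline for comparison and is not something a real model would use.

```python
import math

def softmax(logits):
    """Exponentiate and normalize so the outputs are positive and sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear_share(scores):
    """Plain normalization without exp (assumes all scores are positive)."""
    total = sum(scores)
    return [s / total for s in scores]

logits = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical raw outputs for five classes
soft = softmax(logits)
lin = linear_share(logits)

for z, s, l in zip(logits, soft, lin):
    print(f"logit={z:.1f}  softmax={s:.3f}  linear={l:.3f}")
```

For these values the largest logit gets about 0.64 of the total under softmax, against about 0.33 under plain normalization, and the smallest gets about 0.01 against about 0.07. Each step of 1 in the logit multiplies the softmax output by e.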

This is strange.

It's as if the values are being manipulated to make the prediction seem more certain.

On the other hand, I think the outputs of models trained with sigmoid and binary cross-entropy (BCE) are more reliable from a probabilistic perspective.

This is because each output is an independent per-class probability, exactly as in logistic regression.
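The independence of the sigmoid outputs can be checked with a short sketch. The logits below are made-up values. The point is only that changing one logit leaves the other sigmoid outputs untouched, while it shifts every softmax output, and that sigmoid can output a low value for every class at once.

```python
import math

def sigmoid(z):
    """Logistic function: maps one raw score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    """Exponentiate and normalize so the outputs sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

before = [2.0, -1.0, 0.5]  # hypothetical logits
after = [2.0, -1.0, 3.0]   # same logits, only the last one raised

sig_before = [sigmoid(z) for z in before]
sig_after = [sigmoid(z) for z in after]
soft_before = softmax(before)
soft_after = softmax(after)

# All logits strongly negative: sigmoid can say "none of these classes",
# softmax still has to distribute a total of 1.
none_logits = [-6.0, -6.0, -6.0]
sig_none = [sigmoid(z) for z in none_logits]
soft_none = softmax(none_logits)
```

Here `sig_before[0]` and `sig_after[0]` are identical, whereas `soft_after[0]` is smaller than `soft_before[0]` even though the first logit did not change. In the last case every sigmoid output is close to 0, while softmax assigns 1/3 to each class.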

Am I wrong?

I have made and tested many kinds of classification models.

The classification model using softmax seemed to be better trained.

This is because training accuracy and validation accuracy were more stable during training.

However, when tested on actual unseen data, the classification model using sigmoid gave me better results.

And the output of the model seemed more reliable from a probabilistic point of view.

Am I missing something?

Please feel free to share your opinion on this.

sigmoid-classifier's People

Contributors

inzapp, giukim
