Comments (3)

albahnsen avatar albahnsen commented on September 8, 2024

Hi.

If you assume a constant cost for each type of error, balancing the input dataset is equivalent to adjusting the decision threshold according to the costs.
However, if the costs are example-dependent, balancing the dataset does not give you optimal results.
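
To make the distinction concrete, here is a minimal sketch in plain Python, not code from the library. It assumes the cost matrix rows are ordered `[FP, FN, TP, TN]` and that `probs` are calibrated fraud probabilities; the function names `constant_cost_threshold` and `min_risk_predict` are hypothetical. With constant costs a single threshold is enough, whereas with example-dependent costs the decision has to be made row by row.

```python
def constant_cost_threshold(c_fp, c_fn, c_tp=0.0, c_tn=0.0):
    """Single probability threshold that minimises expected cost
    when every example shares the same four costs."""
    return (c_fp - c_tn) / ((c_fp - c_tn) + (c_fn - c_tp))


def min_risk_predict(probs, cost_mat):
    """Example-dependent decision: predict 1 when the expected cost of
    predicting 1 is lower than the expected cost of predicting 0.

    cost_mat rows are assumed to be [FP, FN, TP, TN]."""
    preds = []
    for p, (c_fp, c_fn, c_tp, c_tn) in zip(probs, cost_mat):
        cost_if_pos = p * c_tp + (1 - p) * c_fp  # expected cost of flagging
        cost_if_neg = p * c_fn + (1 - p) * c_tn  # expected cost of ignoring
        preds.append(1 if cost_if_pos < cost_if_neg else 0)
    return preds


# Same probability, different costs, different decisions.
threshold = constant_cost_threshold(c_fp=1.0, c_fn=9.0)
decisions = min_risk_predict([0.2, 0.2], [[1, 9, 0, 0], [1, 2, 0, 0]])
```

In the usage lines, both examples have the same estimated probability, but only the one with the larger false-negative cost is flagged, which a single global threshold (or a single resampling ratio) cannot reproduce.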

Edit: I've run some cross-validation over different values of C and max_iter, but the best savings score I can get is 0 (the worst is -12).
That looks quite suspicious.

from costsensitiveclassification.

S-C-H avatar S-C-H commented on September 8, 2024

Thanks for the response, @albahnsen! :)
Can I confirm that the columns of the cost matrix are, in order:
false positives, false negatives, true positives and true negatives?

When I print out the model history (the view of the iterations), it suggests that the cost per example for the best model is $0.805161. However, when I compute the savings score manually:
cost, cost_base, savings_p = savings_score(y_vec, train_predictions, cost_mat)

the cost per example is much higher (it equals the cost per alert), and the model predicts all fraud.
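
For reference, here is a minimal sketch of how such a three-value savings computation could work. It assumes the column order `[FP, FN, TP, TN]` and the usual definition of savings as one minus the model cost divided by the cheaper of the two trivial baselines (predict all 0 or predict all 1). The helper names `total_cost` and `savings_breakdown` are hypothetical and are not the library's functions.

```python
def total_cost(y_true, y_pred, cost_mat):
    """Sum the per-example cost; cost_mat rows are assumed [FP, FN, TP, TN]."""
    total = 0.0
    for y, p, (c_fp, c_fn, c_tp, c_tn) in zip(y_true, y_pred, cost_mat):
        if y == 1:
            total += c_tp if p == 1 else c_fn
        else:
            total += c_fp if p == 1 else c_tn
    return total


def savings_breakdown(y_true, y_pred, cost_mat):
    """Return (cost, cost_base, savings), where cost_base is the cheaper of
    'predict all 0' and 'predict all 1'. Assumes cost_base is non-zero."""
    n = len(y_true)
    cost = total_cost(y_true, y_pred, cost_mat)
    cost_base = min(
        total_cost(y_true, [0] * n, cost_mat),
        total_cost(y_true, [1] * n, cost_mat),
    )
    return cost, cost_base, 1.0 - cost / cost_base


# Toy data: one fraud in four, alert cost 5, missed-fraud cost 100.
y_toy = [0, 0, 0, 1]
cost_toy = [[5, 100, 5, 0]] * 4
all_fraud = savings_breakdown(y_toy, [1, 1, 1, 1], cost_toy)
perfect = savings_breakdown(y_toy, [0, 0, 0, 1], cost_toy)
```

Under this definition, a model that predicts all fraud has a cost per example equal to the alert cost, and its savings are exactly 0 whenever "predict all 1" is the cheaper baseline. That may be one reading of the numbers reported in this thread.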

I used C = 1.0 (no regularization) because I was suspicious about the loss function.

S-C-H avatar S-C-H commented on September 8, 2024

The cost matrix and loss function appear fine, so the problem is with the optimisation of the function.

The reason I suggested downsampling is that, whereas you had 0.5% true fraud in your example, my data is more like 0.05% or worse. =( The optimisation therefore tends to converge to predicting a single class, which is not ideal.
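
As an illustration of the downsampling idea, here is a minimal sketch using only the standard library. The helper `undersample_majority` and its `ratio` and `seed` parameters are hypothetical, not part of the library. The key detail is that the cost matrix rows are subsampled with the same indices as the features and labels, so each kept example retains its own costs.

```python
import random


def undersample_majority(X, y, cost_mat, ratio=1.0, seed=0):
    """Keep every positive example and a random subset of negatives,
    with roughly `ratio` negatives per positive. X, y and cost_mat are
    indexed together so the rows stay aligned."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    n_keep = min(len(neg), int(round(ratio * len(pos))))
    keep = sorted(pos + rng.sample(neg, n_keep))
    return (
        [X[i] for i in keep],
        [y[i] for i in keep],
        [cost_mat[i] for i in keep],
    )


# Toy data: 5 frauds in 1000 rows (0.5%); the feature is just the row index.
X_toy = [[i] for i in range(1000)]
y_imb = [1 if i % 200 == 0 else 0 for i in range(1000)]
cost_imb = [[5, 100 + i, 5, 0] for i in range(1000)]
X_s, y_s, cost_s = undersample_majority(X_toy, y_imb, cost_imb, ratio=1.0)
```

If this is used for training, the savings should still be measured on data with the original class distribution; otherwise the baseline cost no longer reflects the real problem.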
