
Comments (2)

cgnorthcutt commented on May 21, 2024

Yes, cleanlab does this automatically. If your classifier is appropriate for the data (that is, it reaches reasonably high cross-validation accuracy), cleanlab will include any incorrectly labeled image in the error set, regardless of its underlying true class.

This is easy to test: add an image of the letter A to the MNIST digit dataset and see whether cleanlab identifies it as an error.

If your dataset contains lots of birds, or if your model isn't very good (for example, a simple naive Bayes classifier on image data), the model won't be able to train well enough on cats and dogs to produce good predicted probabilities, and that will hurt cleanlab's ability to find the errors. For the most part, though, if you use a reasonable model that reaches high accuracy (> 90 percent) on the data when it is clean, and the data contains no more than 40 percent errors, cleanlab will find most of the birds in your cat / dog set.
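To make the intuition concrete, here is a minimal sketch of why an out-of-class image tends to surface as an error. It is not cleanlab's actual algorithm, which is more involved (recent versions expose it as `cleanlab.filter.find_label_issues`). The function name `flag_low_self_confidence` and the probability values are made up for illustration. The idea is that a model trained on cats and dogs cannot be confident about a bird, so the probability it assigns to the bird's given label falls well below what is typical for that label.

```python
from collections import defaultdict


def flag_low_self_confidence(labels, pred_probs):
    """Return indices whose self-confidence is below their class average.

    Hypothetical helper for illustration only; a simplified stand-in,
    not cleanlab's procedure.
    """
    # Self-confidence: the probability the model assigns to the given label.
    self_conf = [probs[label] for label, probs in zip(labels, pred_probs)]

    # Per-class threshold: mean self-confidence over examples given that label.
    by_class = defaultdict(list)
    for label, conf in zip(labels, self_conf):
        by_class[label].append(conf)
    thresholds = {label: sum(v) / len(v) for label, v in by_class.items()}

    flagged = [
        i for i, (label, conf) in enumerate(zip(labels, self_conf))
        if conf < thresholds[label]
    ]
    # Most suspicious (lowest self-confidence) first.
    return sorted(flagged, key=lambda i: self_conf[i])


# Toy data (invented values): class 0 = cat, class 1 = dog.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
pred_probs = [
    [0.95, 0.05],  # cat, labeled cat
    [0.90, 0.10],  # cat, labeled cat
    [0.92, 0.08],  # cat, labeled cat
    [0.50, 0.50],  # bird, labeled cat: the model is unsure
    [0.10, 0.90],  # dog, labeled dog
    [0.05, 0.95],  # dog, labeled dog
    [0.08, 0.92],  # dog, labeled dog
    [0.90, 0.10],  # cat, mislabeled as dog
]

suspects = flag_low_self_confidence(labels, pred_probs)
print(suspects)  # prints [7, 3]
```

Both the mislabeled cat and the bird are flagged, even though the bird has no correct class in the label set. A bare class-mean threshold like this one would over-flag a class that contains no errors, so treat it only as an illustration of the reasoning above. In practice the predicted probabilities should come from cross-validation, so that each example is scored by a model that did not train on it.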


offchan42 commented on May 21, 2024

Thank you. This answered my question!
