image_classification's Introduction

COMP5328 - Advanced Machine Learning

Assignment 2 - Learning with Noisy Data

Authors: Yutong Cao, Chen Chen, Yixiong Fang

Lecturer: Tongliang Liu

Tutors: Nicholas James, Zhuozhuo Tu, Liu Liu

Objectives: This project implements three algorithms, each based on a support vector machine (SVM), that are designed to be robust to class-dependent classification noise.

The first algorithm extends an existing Expectation Maximisation method to class-dependent classification noise. It is based on: Expectation Maximisation by Biggio et al. [2011]

The second algorithm reproduces an existing state-of-the-art algorithm that is robust to label noise. It is based on: Importance Reweighting by Liu and Tao [2016]

The third is a 'quick and dirty' heuristic that we propose, which we call: Heuristic Approach by Relabelling
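As a rough illustration of the importance reweighting idea behind the second algorithm, the sketch below computes a per-sample weight as the ratio of the clean to the noisy conditional label probability, following the formulation in Liu and Tao [2016]. It is not the code in main.py. The function name, the {0, 1} label encoding, the argument names rho0 and rho1, and the clipping of negative weights to zero are all assumptions made for this example.

```python
from typing import List, Sequence


def importance_weights(
    p_noisy: Sequence[float],
    noisy_labels: Sequence[int],
    rho0: float,
    rho1: float,
) -> List[float]:
    """Weights beta = P(y | x) / P_rho(y | x) for binary labels in {0, 1}.

    p_noisy[i] is an estimate of the probability, under the noisy
    distribution, of the label that was actually observed for sample i.
    rho0 = P(noisy = 1 | true = 0) and rho1 = P(noisy = 0 | true = 1)
    are the class-dependent flip rates (assumed known or estimated).
    """
    if not 0.0 <= rho0 + rho1 < 1.0:
        raise ValueError("flip rates must satisfy 0 <= rho0 + rho1 < 1")

    weights = []
    for p, y in zip(p_noisy, noisy_labels):
        # Flip rate of the opposite class, i.e. the rate at which the
        # other class is mislabelled as the observed label y.
        rho_other = rho0 if y == 1 else rho1
        if p <= 0.0:
            weights.append(0.0)
            continue
        beta = (p - rho_other) / ((1.0 - rho0 - rho1) * p)
        # Assumption: clip negative weights, which can arise from
        # imperfect probability estimates, to zero.
        weights.append(max(beta, 0.0))
    return weights
```

Weights of this kind could then be passed as per-sample weights when fitting a classifier, for example through the sample_weight argument of scikit-learn's SVC.fit. With both flip rates set to zero every weight equals 1, which recovers ordinary training.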

This code compares the performance of these three algorithms on two well-known datasets:

  1. MNIST
  2. CIFAR

Both datasets are injected with label noise.
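To make "class-dependent label noise" concrete, here is a minimal sketch of how binary labels could be flipped with a different rate per class. The provided .npz files already contain noisy labels, so this is only an illustration. The function name, the {0, 1} label encoding and the seed argument are assumptions, not part of the project code.

```python
import random
from typing import List, Sequence


def inject_label_noise(
    labels: Sequence[int], rho0: float, rho1: float, seed: int = 0
) -> List[int]:
    """Flip binary labels with class-dependent rates.

    rho0 is the probability that a true 0 is recorded as 1,
    rho1 is the probability that a true 1 is recorded as 0.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    noisy = []
    for y in labels:
        flip_rate = rho0 if y == 0 else rho1
        # rng.random() is uniform on [0, 1), so the flip happens
        # with probability flip_rate.
        noisy.append(1 - y if rng.random() < flip_rate else y)
    return noisy
```

Because the two rates differ by class, the noise is asymmetric: for instance, with rho0 = 0 and rho1 = 1 every label becomes 0.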

Requirements:

  • sklearn == 0.20.0
  • multiprocessing
  • numpy
  • matplotlib
  • densratio
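Before running anything, it may be convenient to check that the modules listed above can be imported. The helper below is a small sketch that uses only the standard library. The name missing_modules is hypothetical, and the check does not verify the pinned sklearn version.

```python
import importlib.util
from typing import Iterable, List

# Import names taken from the requirements list above.
REQUIRED = ["sklearn", "multiprocessing", "numpy", "matplotlib", "densratio"]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```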

Running Environment Setup

  1. Put the dataset files mnist_dataset.npz and cifar_dataset.npz into the folder named input_data under the Code directory. The directory structure of the project is:
project
│   README.md
│   biggio11.pdf
│   ...
└───Code
│   └───algorithm
│   │    │   main.py
│   │    │   util.py
│   │    │   estimate_rho_PCA.py
│   └───input_data
│        │   mnist_dataset.npz
│        │   cifar_dataset.npz
└───assignment2
    │   ...
    │   ...
  2. Run main.py in Code/algorithm with the chosen dataset and algorithm.

     --dset DSET      Set the dataset to use, 1 = MNIST, 2 = CIFAR. Default is
                    CIFAR.
     --method METHOD  Set the algorithm to run, 1 = Expectation Maximisation, 2 =
                    Importance Reweighting, 3 = Heuristic Approach. Default is
                    'Importance Reweighting'.
    

    For example, to run Expectation Maximisation on the MNIST dataset, run:

    python main.py --dset=1 --method=1
    

    If you do not set these parameters, the default is to run Importance Reweighting on CIFAR.

  3. To run the algorithm that estimates the flip rates rho, run the following in Code/algorithm:

    python estimate_rho_PCA.py
    

All results will be auto-saved to result/{generated-time-dataname}.
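The command-line options described in the setup steps can be sketched with argparse as follows. This is an illustrative reconstruction from the help text, not the actual contents of main.py. The DATASETS and METHODS dictionaries, the describe helper and the use of choices to reject other values are assumptions.

```python
import argparse
from typing import List, Optional, Tuple

DATASETS = {1: "MNIST", 2: "CIFAR"}
METHODS = {
    1: "Expectation Maximisation",
    2: "Importance Reweighting",
    3: "Heuristic Approach",
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Learning with noisy labels")
    parser.add_argument(
        "--dset", type=int, choices=sorted(DATASETS), default=2,
        help="Set the dataset to use, 1 = MNIST, 2 = CIFAR. Default is CIFAR.",
    )
    parser.add_argument(
        "--method", type=int, choices=sorted(METHODS), default=2,
        help="Set the algorithm to run, 1 = Expectation Maximisation, "
             "2 = Importance Reweighting, 3 = Heuristic Approach.",
    )
    return parser


def describe(argv: Optional[List[str]] = None) -> Tuple[str, str]:
    """Parse argv and return the (dataset, method) names that were selected."""
    args = build_parser().parse_args(argv)
    return DATASETS[args.dset], METHODS[args.method]
```

With this sketch, describe(["--dset=1", "--method=1"]) selects Expectation Maximisation on MNIST, and describe([]) falls back to Importance Reweighting on CIFAR, matching the defaults stated above.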
