
pencil's Introduction

PENCIL.pytorch

PyTorch implementation of Probabilistic End-to-end Noise Correction for Learning with Noisy Labels, CVPR 2019.

Requirements:

  • python3.6
  • numpy
  • torch-0.4.1
  • torchvision-0.2.0

Usage

  • On CIFAR-10, we retained 10% of the training data as the validation set and modified the original correct labels to obtain different noisy-label datasets.
  • The validation set is therefore taken from data_batch_5; both the validation split and the remaining training portion of that batch contain 5000 samples.
  • Add symmetric noise on CIFAR-10: python addnoise_SN.py (a sketch of the idea is shown after this list)
  • Add asymmetric noise on CIFAR-10: python addnoise_AN.py
  • PENCIL.py is used both for training a model on a dataset with noisy labels and for validating it
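
As referenced above, here is a minimal sketch of the symmetric-noise idea behind addnoise_SN.py. This is an illustration only: the function name and the exact convention (e.g. whether the true class is excluded when resampling) are assumptions and may differ from the script.

    import numpy as np

    def add_symmetric_noise(labels, noise_rate, num_classes=10, seed=0):
        # With probability `noise_rate`, replace each label by a class drawn
        # uniformly at random from all `num_classes` classes.
        rng = np.random.RandomState(seed)
        noisy = np.asarray(labels).copy()
        flip = rng.rand(len(noisy)) < noise_rate
        noisy[flip] = rng.randint(0, num_classes, size=int(flip.sum()))
        return noisy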

Options

  • b: batch size
  • lr: initial learning rate of stage1
  • lr2: initial learning rate of stage3
  • alpha: the coefficient of Compatibility Loss
  • beta: the coefficient of Entropy Loss
  • lambda1: the value of λ, i.e. the step size used to update the label distributions in stage 2
  • stage1: number of epochs until the end of stage 1
  • stage2: number of epochs until the end of stage 2
  • epoch: number of total epochs to run
  • datanum: number of train dataset samples
  • classnum: number of train dataset classes
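
For example, a CIFAR-10 training run with the hyper-parameters reported in the paper's Section 4.2 (as also quoted in the issues below) might look like the command under this paragraph. The exact flag spellings are assumptions based on the option names above, and datanum = 45000 follows from the 45000/5000 train/validation split described in Usage.

    python PENCIL.py --b 128 --lr 0.06 --lr2 0.2 --alpha 0.1 --beta 0.4 --lambda1 600 --datanum 45000 --classnum 10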

The framework of PENCIL

The proportion of correct labels on CIFAR-10

The results on the real-world dataset Clothing1M

#   Method                Test Accuracy (%)
1   Cross Entropy Loss    68.94
2   Forward [1]           69.84
3   Tanaka et al. [2]     72.16
4   PENCIL                73.49

Citing this repository

If you find this code useful in your research, please consider citing us:

@inproceedings{PENCIL_CVPR_2019,
  author    = {Yi, Kun and Wu, Jianxin},
  title     = {{Probabilistic End-to-end Noise Correction for Learning with Noisy Labels}},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2019}
}

References

[1] Giorgio Patrini, Alessandro Rozza, Aditya Krishna Menon, Richard Nock, and Lizhen Qu. Making deep neural networks robust to label noise: A loss correction approach. In CVPR, pages 1944–1952, 2017.
[2] Daiki Tanaka, Daiki Ikami, Toshihiko Yamasaki, and Kiyoharu Aizawa. Joint optimization framework for learning with noisy labels. In CVPR, pages 5552–5560, 2018.


pencil's Issues

Question about entropy loss

In the paper, entropy loss "helps avoid the training from being stalled in our PENCIL framework, because the label distribution is not going to be a one-hot distribution and then f(x;\theta) will be different from y^d"
In the code, the entropy loss matches the definition Σ −p · log p (PENCIL.py, line 355).

However, when you minimize the entropy loss, it "force[s] the network to peak at only one category rather than being flat because the one-hot distribution has the smallest possible entropy value", as written just a few sentences above.

So I am confused about whether you want to minimize the entropy loss or not: the stated intention seems opposite to entropy minimization, yet the final loss function does reduce the entropy loss. Is there anything I am getting wrong?
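
For reference, a minimal PyTorch sketch of the three loss terms being discussed (variable names follow the snippet quoted above; it assumes last_y_var is already a normalized probability distribution, and the exact scaling constants in PENCIL.py may differ):

    import torch
    import torch.nn.functional as F

    def pencil_losses(output, last_y_var, target_var):
        # classification loss: KL(f || y^d), pulls the prediction f toward the
        # learnable label distribution y^d
        lc = torch.mean(F.softmax(output, dim=1)
                        * (F.log_softmax(output, dim=1) - torch.log(last_y_var)))
        # compatibility loss: cross entropy between the original noisy labels
        # and y^d, which keeps y^d from drifting too far from the given labels
        lo = F.nll_loss(torch.log(last_y_var), target_var)
        # entropy loss: -(1/n) sum f log f, which pushes the prediction toward
        # a one-hot (low-entropy) distribution
        le = -torch.mean(F.softmax(output, dim=1) * F.log_softmax(output, dim=1))
        return lc, lo, le

    # the total objective is then a weighted sum, e.g. loss = lc + alpha * lo + beta * le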

Changing the number of classes

I am trying to use the PENCIL framework to train a custom model on a custom dataset with 5 classes. When I run the model, I get the error:
Traceback (most recent call last):
  File "PENCIL.py", line 512, in <module>
    main()
  File "PENCIL.py", line 299, in main
    train(trainloader, model, criterion, optimizer, epoch, y)
  File "PENCIL.py", line 353, in train
    new_y[index, :] = onehot
ValueError: shape mismatch: value array of shape (16,10) could not be broadcast to indexing result of shape (16,5)

The creation of the onehot tensor is given in line 351:
onehot = torch.zeros(target.size(0), 10).scatter_(1, target.view(-1, 1), 10.0)

Should I change only the first 10 in the code and keep the "10.0", or change that to "5.0" as well? What exactly does this line do?
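
For context, here is a small stand-alone sketch of what that line does, with the class count pulled out into a variable. The names are illustrative; if the 10.0 fill value corresponds to the constant K the paper uses to scale the initial label logits (as it appears to), it is independent of the class count, and only the first hard-coded 10 needs to track your dataset.

    import torch

    classnum = 5                                    # e.g. a 5-class custom dataset (assumed)
    K = 10.0                                        # scaling constant for the label logits
    target = torch.tensor([0, 3, 1])                # example noisy labels, shape (batch,)

    onehot = torch.zeros(target.size(0), classnum)  # shape (batch, classnum), all zeros
    onehot.scatter_(1, target.view(-1, 1), K)       # write K at each sample's label index
    print(onehot)
    # tensor([[10.,  0.,  0.,  0.,  0.],
    #         [ 0.,  0.,  0., 10.,  0.],
    #         [ 0., 10.,  0.,  0.,  0.]])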

KeyError: 'preact_resnet32'

I ran the PENCIL model on the CIFAR-10 dataset, with "--arch" set to preact_resnet32 as the backbone network.
When running:

File "PENCIL.py", line 221, in main
  model = resnet.__dict__[args.arch]
KeyError: 'preact_resnet32'

So, is this code missing a classifier? How do you deal with this problem?
My versions: keras==2.6.0, keras-resnet==0.2.0, tensorflow==2.6.0
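
The traceback corresponds to looking the architecture name up in the resnet module's dictionary, so the KeyError means that module does not define a builder with that exact name. A quick, hedged way to inspect what is available (assuming the resnet.py shipped with this repository is the module being imported, not the PyPI "resnet" package):

    import resnet   # the resnet.py shipped with this repository

    # List the lower-case callables the module exposes; the value passed to
    # --arch must match one of these names exactly.
    available = sorted(name for name, obj in resnet.__dict__.items()
                       if callable(obj) and name.islower() and not name.startswith('_'))
    print(available)

    # model = resnet.__dict__[args.arch]   # raises KeyError for names not defined in resnet.py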

The KL divergence computation errors out

We ran the PENCIL model on the CIFAR-10 dataset, with "--arch" set to resnet-18 as the backbone network.

When running:

File "PENCIL-master/PENCIL.py", line 355, in train
  lc = torch.mean(softmax(output)*(logsoftmax(output)-torch.log((last_y_var))))
RuntimeError: The size of tensor a (1000) must match the size of tensor b (10) at non-singleton dimension 1

The output size is (128, 1000) and the last_y_var size is (128, 10).
So, is this code missing a classifier?
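
The (128, 1000) output suggests the backbone still carries its 1000-way ImageNet head. A hedged sketch of adapting a torchvision ResNet-18 to the dataset's class count (plain torchvision usage, not code from this repository):

    import torch.nn as nn
    import torchvision.models as models

    classnum = 10                                           # CIFAR-10
    model = models.resnet18(pretrained=True)                # 1000-way ImageNet classifier by default
    model.fc = nn.Linear(model.fc.in_features, classnum)    # output is now (batch, classnum)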

Would PENCIL decrease the label accuracy?

Hi, I'm trying to re-implement your work.

I'm using ResNet-32 as the backbone, following the paper, but without pretrained model parameters. The other hyper-parameters follow the paper's suggestions in Section 4.2 and are listed below. The network was trained on CIFAR-10 with 10% asymmetric noise. However, the experimental results were far from the paper's: the final model's best top-1 accuracy is 84.02%. The label accuracy grew very rapidly but soon began to decline and fell below the accuracy of the original labels.

Are there any tricks that didn't get written into the paper, or that I overlooked? Thanks.

Here is the plot of how the label accuracy changes (figure: An10).

The hyper-parameters were as follows:

  • batch_size: 128
  • lr: 0.06
  • lr2: 0.2
  • alpha: 0.1
  • beta: 0.4
  • lambda1: 600
  • momentum: 0.9
  • Weight-decay: 1e-4

ModuleNotFoundError: No module named 'keras.applications.resnet50'

I run PENCIL.py and get:

File "PENCIL.py", line 18, in <module>
  import resnet
File "/root/.local/lib/python3.6/site-packages/resnet/__init__.py", line 1, in <module>
  from keras.applications.resnet50 import ResNet50
ModuleNotFoundError: No module named 'keras.applications.resnet50'

So, is this code missing a classifier? How do you deal with this problem?
My versions: keras==2.6.0, keras-resnet==0.2.0, tensorflow==2.6.0

No module named 'models'

Hi,
when I run PENCIL.py, there is an import error saying no module named 'models'.
I look forward to seeing your complete code.

Tabular data / noisy instances / new datasets

Hi,
thanks for sharing your implementation. I have some questions about it:

  1. Does it also work on tabular data?
  2. Is the code tailored to the datasets used in the paper or can one apply it to any data?
  3. Is it possible to identify the noisy instances (return the noisy IDs or the clean set)?

Thanks!
