
pencil's Introduction

PENCIL.pytorch

PyTorch implementation of Probabilistic End-to-end Noise Correction for Learning with Noisy Labels, CVPR 2019.

Requirements:

  • python3.6
  • numpy
  • torch-0.4.1
  • torchvision-0.2.0

Usage

  • On CIFAR-10, we retained 10% of the training data as the validation set and modified the original correct labels to obtain different noisy-label datasets.
  • The validation set is therefore taken from data_batch_5; both the validation split and the remaining training portion of that batch contain 5000 samples.
  • Add symmetric noise on CIFAR-10: python addnoise_SN.py (a sketch of the idea is shown after this list)
  • Add asymmetric noise on CIFAR-10: python addnoise_AN.py
  • PENCIL.py is used both for training a model on a dataset with noisy labels and for validating it
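
As referenced above, here is a minimal sketch of the symmetric-noise idea behind addnoise_SN.py. This is an illustration only: the function name and the exact convention (e.g. whether the true class is excluded when resampling) are assumptions and may differ from the script.

    import numpy as np

    def add_symmetric_noise(labels, noise_rate, num_classes=10, seed=0):
        # With probability `noise_rate`, replace each label by a class drawn
        # uniformly at random from all `num_classes` classes.
        rng = np.random.RandomState(seed)
        noisy = np.asarray(labels).copy()
        flip = rng.rand(len(noisy)) < noise_rate
        noisy[flip] = rng.randint(0, num_classes, size=int(flip.sum()))
        return noisy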

Options

  • b: batch size
  • lr: initial learning rate of stage1
  • lr2: initial learning rate of stage3
  • alpha: the coefficient of Compatibility Loss
  • beta: the coefficient of Entropy Loss
  • lambda1: the value of λ, i.e. the step size used to update the label distributions in stage 2
  • stage1: number of epochs until the end of stage 1
  • stage2: number of epochs until the end of stage 2
  • epoch: number of total epochs to run
  • datanum: number of train dataset samples
  • classnum: number of train dataset classes
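
For example, a CIFAR-10 training run with the hyper-parameters reported in the paper's Section 4.2 (as also quoted in the issues below) might look like the command under this paragraph. The exact flag spellings are assumptions based on the option names above, and datanum = 45000 follows from the 45000/5000 train/validation split described in Usage.

    python PENCIL.py --b 128 --lr 0.06 --lr2 0.2 --alpha 0.1 --beta 0.4 --lambda1 600 --datanum 45000 --classnum 10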

The framework of PENCIL

The proportion of correct labels on CIFAR-10

The results on the real-world dataset Clothing1M

#   Method                Test Accuracy (%)
1   Cross Entropy Loss    68.94
2   Forward [1]           69.84
3   Tanaka et al. [2]     72.16
4   PENCIL                73.49

Citing this repository

If you find this code useful in your research, please consider citing us:

@inproceedings{PENCIL_CVPR_2019,
  author    = {Yi, Kun and Wu, Jianxin},
  title     = {{Probabilistic End-to-end Noise Correction for Learning with Noisy Labels}},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2019}
}

References

[1] Giorgio Patrini, Alessandro Rozza, Aditya Krishna Menon, Richard Nock, and Lizhen Qu. Making deep neural networks robust to label noise: A loss correction approach. In CVPR, pages 1944–1952, 2017.
[2] Daiki Tanaka, Daiki Ikami, Toshihiko Yamasaki, and Kiyoharu Aizawa. Joint optimization framework for learning with noisy labels. In CVPR, pages 5552–5560, 2018.


pencil's Issues

Question about entropy loss

In the paper, entropy loss "helps avoid the training from being stalled in our PENCIL framework, because the label distribution is not going to be a one-hot distribution and then f(x;\theta) will be different from y^d"
In the code, the entropy loss matches the definition Σ −p · log p (PENCIL.py, line 355).

However, when you minimize the entropy loss, it "force[s] the network to peak at only one category rather than being flat because the one-hot distribution has the smallest possible entropy value", as written just a few sentences above.

So I am confused about whether you want to minimize the entropy loss or not: the stated intention seems opposite to entropy minimization, yet the final loss function does reduce the entropy loss. Is there anything I am getting wrong?
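
For reference, a minimal PyTorch sketch of the three loss terms being discussed (variable names follow the snippet quoted above; it assumes last_y_var is already a normalized probability distribution, and the exact scaling constants in PENCIL.py may differ):

    import torch
    import torch.nn.functional as F

    def pencil_losses(output, last_y_var, target_var):
        # classification loss: KL(f || y^d), pulls the prediction f toward the
        # learnable label distribution y^d
        lc = torch.mean(F.softmax(output, dim=1)
                        * (F.log_softmax(output, dim=1) - torch.log(last_y_var)))
        # compatibility loss: cross entropy between the original noisy labels
        # and y^d, which keeps y^d from drifting too far from the given labels
        lo = F.nll_loss(torch.log(last_y_var), target_var)
        # entropy loss: -(1/n) sum f log f, which pushes the prediction toward
        # a one-hot (low-entropy) distribution
        le = -torch.mean(F.softmax(output, dim=1) * F.log_softmax(output, dim=1))
        return lc, lo, le

    # the total objective is then a weighted sum, e.g. loss = lc + alpha * lo + beta * le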

Changing the number of classes

I am trying to use the PENCIL framework to train a custom model on a custom dataset with 5 classes. When I run the model, I get the error:
Traceback (most recent call last):
  File "PENCIL.py", line 512, in <module>
    main()
  File "PENCIL.py", line 299, in main
    train(trainloader, model, criterion, optimizer, epoch, y)
  File "PENCIL.py", line 353, in train
    new_y[index, :] = onehot
ValueError: shape mismatch: value array of shape (16,10) could not be broadcast to indexing result of shape (16,5)

The creation of the onehot tensor is given in line 351:
onehot = torch.zeros(target.size(0), 10).scatter_(1, target.view(-1, 1), 10.0)

Should I change only the first 10 in the code and keep the "10.0", or change that to "5.0" as well? What exactly does this line do?
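
For context, here is a small stand-alone sketch of what that line does, with the class count pulled out into a variable. The names are illustrative; if the 10.0 fill value corresponds to the constant K the paper uses to scale the initial label logits (as it appears to), it is independent of the class count, and only the first hard-coded 10 needs to track your dataset.

    import torch

    classnum = 5                                    # e.g. a 5-class custom dataset (assumed)
    K = 10.0                                        # scaling constant for the label logits
    target = torch.tensor([0, 3, 1])                # example noisy labels, shape (batch,)

    onehot = torch.zeros(target.size(0), classnum)  # shape (batch, classnum), all zeros
    onehot.scatter_(1, target.view(-1, 1), K)       # write K at each sample's label index
    print(onehot)
    # tensor([[10.,  0.,  0.,  0.,  0.],
    #         [ 0.,  0.,  0., 10.,  0.],
    #         [ 0., 10.,  0.,  0.,  0.]])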

KeyError: 'preact_resnet32'

I ran the PENCIL model on the CIFAR-10 dataset, with "--arch" set to preact_resnet32 as the backbone network.
When running:

File "PENCIL.py", line 221, in main
  model = resnet.__dict__[args.arch]
KeyError: 'preact_resnet32'

So, is this code missing a classifier? How do you deal with this problem?
My versions: keras==2.6.0, keras-resnet==0.2.0, tensorflow==2.6.0
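
The traceback corresponds to looking the architecture name up in the resnet module's dictionary, so the KeyError means that module does not define a builder with that exact name. A quick, hedged way to inspect what is available (assuming the resnet.py shipped with this repository is the module being imported, not the PyPI "resnet" package):

    import resnet   # the resnet.py shipped with this repository

    # List the lower-case callables the module exposes; the value passed to
    # --arch must match one of these names exactly.
    available = sorted(name for name, obj in resnet.__dict__.items()
                       if callable(obj) and name.islower() and not name.startswith('_'))
    print(available)

    # model = resnet.__dict__[args.arch]   # raises KeyError for names not defined in resnet.py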

The KL divergence computation errors out

We ran the PENCIL model on the CIFAR-10 dataset, with "--arch" set to resnet-18 as the backbone network.

When running:

File "PENCIL-master/PENCIL.py", line 355, in train
  lc = torch.mean(softmax(output)*(logsoftmax(output)-torch.log((last_y_var))))
RuntimeError: The size of tensor a (1000) must match the size of tensor b (10) at non-singleton dimension 1

The output size is (128, 1000) and the last_y_var size is (128, 10).
So, is this code missing a classifier?
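
The (128, 1000) output suggests the backbone still carries its 1000-way ImageNet head. A hedged sketch of adapting a torchvision ResNet-18 to the dataset's class count (plain torchvision usage, not code from this repository):

    import torch.nn as nn
    import torchvision.models as models

    classnum = 10                                           # CIFAR-10
    model = models.resnet18(pretrained=True)                # 1000-way ImageNet classifier by default
    model.fc = nn.Linear(model.fc.in_features, classnum)    # output is now (batch, classnum)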

Would PENCIL decrease the label accuracy?

Hi, I'm trying to re-implement your work.

I'm using ResNet-32 as the backbone, following the paper, but without pretrained model parameters. The other hyper-parameters follow the paper's suggestions in Section 4.2 and are listed below. The network was trained on CIFAR-10 with 10% asymmetric noise. However, the experimental results were far from the paper's: the final model's best top-1 accuracy is 84.02%. The label accuracy grew very rapidly but soon began to decline and fell below the accuracy of the original labels.

Are there any tricks that didn't get written into the paper, or that I overlooked? Thanks.

Here is the plot of how the label accuracy changes (figure: An10).

The hyper-parameters were as follows:

  • batch_size: 128
  • lr: 0.06
  • lr2: 0.2
  • alpha: 0.1
  • beta: 0.4
  • lambda1: 600
  • momentum: 0.9
  • Weight-decay: 1e-4

ModuleNotFoundError: No module named 'keras.applications.resnet50'

I run PENCIL.py and get:

File "PENCIL.py", line 18, in <module>
  import resnet
File "/root/.local/lib/python3.6/site-packages/resnet/__init__.py", line 1, in <module>
  from keras.applications.resnet50 import ResNet50
ModuleNotFoundError: No module named 'keras.applications.resnet50'

So, is this code missing a classifier? How do you deal with this problem?
My versions: keras==2.6.0, keras-resnet==0.2.0, tensorflow==2.6.0

No module named 'models'

Hi,
when I run PENCIL.py, there is an import error saying no module named 'models'.
I look forward to seeing your complete code.

Tabular data / noisy instances / new datasets

Hi,
thanks for sharing your implementation. I have some questions about it:

  1. Does it also work on tabular data?
  2. Is the code tailored to the datasets used in the paper or can one apply it to any data?
  3. Is it possible to identify the noisy instances (return the noisy IDs or the clean set)?

Thanks!
