
l_dmi's People

Contributors

newbeeer


l_dmi's Issues

Confusion Matrix

Hi,

Your code looks good! But why did you set up the confusion matrix in such a simple way? I don't think your noisy data is really noisy. Have you tried more sophisticated cases?

conf_matrix = torch.eye(10)
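
For illustration, a minimal sketch (not from this repo) of a less trivial transition matrix: uniform "symmetric" noise, where each label keeps probability 1 - noise and is flipped to every other class with equal probability. The helper name and the noise level below are made up for the example.

import torch

def uniform_conf_matrix(num_classes=10, noise=0.4):
    # Hypothetical helper: off-diagonal entries split the flip probability
    # equally, the diagonal keeps the remaining (1 - noise) mass, so each
    # row sums to 1.
    mat = torch.full((num_classes, num_classes), noise / (num_classes - 1))
    mat.fill_diagonal_(1.0 - noise)
    return mat

conf_matrix = uniform_conf_matrix(10, 0.4)  # instead of torch.eye(10)

An asymmetric variant that only flips between easily confused class pairs would be an even harder stress test.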

new datasets/ noisy instances

Hi,
thanks for sharing your implementation. I have two questions about it:

  1. Is the code tailored to the datasets used in the paper or can one apply it to any data?
  2. Is it possible to identify the noisy instances (return the noisy IDs or the clean set)?

Thanks!

clothing1M

Can you provide the following files: noisy_train.txt, clean_val.txt, clean_test.txt?
    if train==True:
        flist = os.path.join(root, "annotations/noisy_train.txt")
    if valid==True:
        flist = os.path.join(root, "annotations/clean_val.txt")
    if test==True:
        flist = os.path.join(root, "annotations/clean_test.txt")

Negative Loss Values and a Strange Issue

I have written the following code for a toy dataset (the banana dataset) and am observing some very strange behaviour:

  • Occasionally the loss values are negative. They are small in magnitude, but negative nonetheless.
  • If I run the code, say, 10 times, the train accuracy converges to 70-80% on some runs and to 10-20% on others (under 0% label noise).

from __future__ import print_function, absolute_import

"""
The code uses the following versions:

Anaconda: 4.5.8 [64-bit]
python: 3.6.1
keras: 2.0.4
tensorflow-gpu: 1.2.0
numpy: 1.12.1 (/1.14.2)
matplotlib: 2.0.2
scipy: 1.1.0

"""

import numpy as np
from numpy.random import RandomState
import arff
import matplotlib
import matplotlib.pyplot as plt
import os
import itertools
import pickle
import time
import argparse
from sklearn import model_selection
from numpy.testing import assert_array_almost_equal

import keras
import keras.backend as K
from keras.models import Sequential, load_model
from keras.layers import Dense, Dropout, Activation
from keras.callbacks import ModelCheckpoint
from keras import metrics
from keras.utils import plot_model

import tensorflow as tf

def L_DMI(y_true, y_pred):
    # Batch-level joint matrix U = (1/N) * y_pred^T y_true; the loss is -log(|det(U)| + eps).
    U = (1 / tf.dtypes.cast(tf.keras.backend.shape(y_true)[0], tf.float32)) * tf.keras.backend.dot(tf.transpose(y_pred), y_true)
    return -1.0 * tf.math.log(tf.dtypes.cast(tf.math.abs(tf.linalg.det(U)), tf.float32) + 1e-3)

def build_uniform_P(size, noise):
    """ The noise matrix flips any class to any other with probability
    noise / (#class - 1).
    """

    assert (noise >= 0.) and (noise <= 1.)

    P = noise / (size - 1) * np.ones((size, size))
    np.fill_diagonal(P, (1 - noise) * np.ones(size))

    assert_array_almost_equal(P.sum(axis=1), 1, 1)
    return P

def multiclass_noisify(y, P, random_state=0):
    """ Flip classes according to transition probability matrix P.
    It expects a number between 0 and the number of classes - 1.
    """

    assert P.shape[0] == P.shape[1]
    assert np.max(y) < P.shape[0]

    # row stochastic matrix
    assert_array_almost_equal(P.sum(axis=1), np.ones(P.shape[1]))
    assert (P >= 0.0).all()

    m = y.shape[0]
    new_y = y.copy()
    flipper = np.random.RandomState(random_state)

    for idx in np.arange(m):
        i = y[idx]
        # draw a vector with exactly one 1
        flipped = flipper.multinomial(1, P[int(i), :], size=1)[0]
        new_y[idx] = np.where(flipped == 1)[0]

    return new_y

def noisify_with_P(labels, num_classes, noise, random_state=None):

    if noise > 0.0:
        P = build_uniform_P(num_classes, noise)
        # seed the random numbers with #run
        labels_noisy = multiclass_noisify(labels, P=P, random_state=random_state)
        actual_noise = (labels_noisy != labels).mean()
        assert actual_noise > 0.0
        print('Actual noise %.2f' % actual_noise)
        labels = labels_noisy
    else:
        P = np.eye(num_classes)

    return labels, P

label_card = 1 #binary classification
num_classes = 2
seed = 23423

loss_fn = 'L_DMI'
#loss_fn = 'categorical_crossentropy'
batch_size = 256
epochs = 200

noise_rate = [0, 0.2, 0.4, 0.7]
noise_type = ['sym']

"""
Data preparation
"""

data_file = arff.load(open('./banana.arff', 'r'))
data_raw = data_file['data']
data_arr = np.asarray(data_raw, 'float')

data, data_lab = data_arr[:,:-label_card], data_arr[:,-label_card:]
no_samples = data.shape[0]

''' labels: {1,2} --> {0, 1} '''
index = np.where(data_lab == 1)
for i in index[0]:
    data_lab[i] = 0

index = np.where(data_lab == 2)
for i in index[0]:
    data_lab[i] = 1

# iterate over every (noise type, noise rate) combination
for nt, nr in itertools.product(noise_type, noise_rate):

    X_temp, X_test, y_temp, y_test = model_selection.train_test_split(data,
     data_lab, test_size = 0.2, random_state = 42)

    if nr > 0:
        ''' add noise '''
        y_temp_noisy, P =  noisify_with_P(y_temp, num_classes=num_classes,
        noise=nr, random_state=seed)

        ''' random shuffle '''
        idx_perm = np.random.permutation(X_temp.shape[0])
        X_temp, y_temp_noisy = X_temp[idx_perm], y_temp_noisy[idx_perm]
    else:
        ''' random shuffle '''
        idx_perm = np.random.permutation(X_temp.shape[0])
        X_temp, y_temp_noisy = X_temp[idx_perm], y_temp[idx_perm]


    ''' train and val split '''
    X_train, X_val, y_train, y_val = model_selection.train_test_split(X_temp,
     y_temp_noisy, test_size = 0.2, random_state = 42)

    ''' normalize data '''
    # means = X_train.mean(axis=0)
    # std = np.std(X_train)
    # X_train = (X_train - means)/std
    # X_val = (X_val - means)/std
    # X_test = (X_test - means)/std

    ''' one-hot encoding '''
    y_train = keras.utils.to_categorical(y_train, num_classes=num_classes)
    y_val = keras.utils.to_categorical(y_val, num_classes=num_classes)
    y_test = keras.utils.to_categorical(y_test, num_classes=num_classes)

    """
    Model specifications
    """

    model = Sequential()
    model.add(Dense(64, kernel_initializer='glorot_uniform', activation='relu', input_shape=(X_train.shape[1],)))
    model.add(Dropout(0.2))
    model.add(Dense(16, kernel_initializer='glorot_uniform', activation='relu'))
    model.add(Dropout(0.5))
    #model.add(Dense(1, activation= 'sigmoid'))
    model.add(Dense(num_classes, kernel_initializer='glorot_uniform', activation='softmax'))

    #opt = keras.optimizers.SGD(lr=1e-5, momentum=1, decay=1e-4)
    #opt = keras.optimizers.RMSprop(lr=1e-3)
    opt = keras.optimizers.Adam(lr=3e-4, beta_1=0.9, beta_2=0.999)

    if loss_fn == 'L_DMI':
        loss = L_DMI
    else:
        loss = 'categorical_crossentropy'

    model.compile(optimizer=opt,loss=loss, metrics=['accuracy', loss])
    model.summary()

    """
    Callbacks
    """
    callbacks = []

    chkpt_filename = "model/checkpoint_banana_%s_%s_%s.hd5" % (loss_fn, nt, str(nr))
    checkpoint_callback = ModelCheckpoint(chkpt_filename, monitor='val_loss', verbose=1, save_best_only=True,
    save_weights_only=True, period=epochs)

    callbacks.append(checkpoint_callback)

    early_stop = keras.callbacks.EarlyStopping(monitor='val_loss',
    min_delta=0.001, patience=20, verbose=1, mode='auto')

    #callbacks.append(early_stop)

    reduce_lr_plat = keras.callbacks.ReduceLROnPlateau(monitor='val_loss',
    factor=0.1, patience=10, verbose=1, mode='auto', epsilon=0.0001,
    cooldown=0, min_lr=0.00001)

    callbacks.append(reduce_lr_plat)

    """
    Training
    """
    history = model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs,
    validation_data=(X_val, y_val), shuffle=False, verbose=1, callbacks=callbacks)

    mdl_filename = "model/model_banana_%s_%s_%s.hd5" % (loss_fn, nt, str(nr))
    model.save(mdl_filename)
    print('Saved trained model at %s ' % (mdl_filename))


    """
    Testing
    """
    if loss_fn == 'categorical_crossentropy':
        model_load = keras.models.load_model(mdl_filename)
    else:
        model_load = keras.models.load_model(mdl_filename,
        custom_objects={loss_fn:loss})

    pred_prob = model_load.predict(X_test)
    pred_lab = pred_prob.copy()
    pred_lab[pred_prob >= 0.5] = 1
    pred_lab[pred_prob < 0.5] = 0

    score = model_load.evaluate(X_test, y_test, batch_size=batch_size)
    print("==========================================================")
    print("==========================================================")
    print("                                                          ")
    print('Test loss:', score[0])
    print('Test accuracy:', score[1])
    print("                                                          ")
    print("==========================================================")
    print("==========================================================")

    """
    Store results
    """
    res_filename = 'results/results_banana_%s_%s_%s.pkl' % (loss_fn, nt, str(nr))

    res_file = open(res_filename, 'wb')
    pickle.dump([pred_prob, score], res_file)

    res_file.close()
    print('Saved model results at %s ' % (res_filename))


    """
    Plot a graph of the model
    """
    #plot_model(model, to_file='model.png')

    """
    Plot results
    """
    # res_filename = 'results/results_banana_%s_%s_%s.pkl' % (loss_fn, nt, str(nr))
    #
    # res_file = open(res_filename, 'rb')
    # tmp_dat = pickle.load(res_file)

    X_zero_test = []
    Y_zero_test = []

    X_one_test = []
    Y_one_test = []

    X_zero_pred = []
    Y_zero_pred = []

    X_one_pred = []
    Y_one_pred = []

    for i in range(len(y_test)):
        if y_test[i][1] != 0:
            X_one_test.append(X_test[i][0])
            Y_one_test.append(X_test[i][1])
        else:
            X_zero_test.append(X_test[i][0])
            Y_zero_test.append(X_test[i][1])

        if pred_lab[i][1] != 0:
            X_one_pred.append(X_test[i][0])
            Y_one_pred.append(X_test[i][1])
        else:
            X_zero_pred.append(X_test[i][0])
            Y_zero_pred.append(X_test[i][1])


    plt.figure(2)
    plt.scatter(X_zero_test, Y_zero_test, c= 'b', label = 'class-0')
    plt.scatter(X_one_test, Y_one_test, c= 'r', label = 'class-1')
    plt.title('clean data distribution')
    plt.show()

    plt.figure(3)
    plt.scatter(X_zero_pred, Y_zero_pred, c= 'b', label = 'class-0')
    plt.scatter(X_one_pred, Y_one_pred, c= 'r', label = 'class-1')
    plt.title('prediction data (%s percent noise)' % (str(nr)))
    plt.show()

    plt.figure(4)
    plt.plot(history.history['acc'])
    plt.plot(history.history['val_acc'])
    plt.title('Model accuracy')
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Train', 'Validation'], loc='lower right')
    plt.show()

    plt.figure(5)
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('Model loss')
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Train', 'Validation'], loc='lower right')
    plt.show()


DMI_loss

Hello, I am trying to implement DMI_loss. Where can I get the dogcat and clothing datasets?

Thank you!

Negative loss values

Hello!

Is it normal to have negative values in the L_DMI loss? If so, do you have any intuition about how this affects backpropagation when the loss values are around zero? I am running into some instabilities in such cases (with 60% uniform random noise on CIFAR-10, where 60% means that 60% of the labels are actually incorrect, not merely re-drawn at random).

Thanks in advance.
Best,
Eric
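
For intuition on the sign (a toy sketch, not the repo's code): assuming the loss has the form -log(|det(U)| + eps), as in the Keras snippet posted in the earlier issue, it is negative exactly when |det(U)| + eps exceeds 1, and adding a constant to a loss leaves its gradients unchanged, so the sign by itself should not affect backpropagation:

import torch

# Hypothetical 2x2 batch matrix U with gradients enabled.
U = torch.tensor([[0.4, 0.1], [0.1, 0.4]], requires_grad=True)

loss = -torch.log(torch.abs(torch.det(U)) + 1e-3)   # DMI-style loss
shifted = loss + 5.0                                 # same loss plus a constant

grad_loss = torch.autograd.grad(loss, U, retain_graph=True)[0]
grad_shifted = torch.autograd.grad(shifted, U)[0]
print(torch.allclose(grad_loss, grad_shifted))       # True: a constant offset does not change gradients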

Help on running L_DMI

Hi!

I would like to run L-DMI on CIFAR-10 and CIFAR-100, and I would appreciate a bit of help to run it properly.
As far as I understand from the paper/code, you pre-train with cross-entropy and then apply the DMI loss. Am I right? Also, how many epochs should I run with cross-entropy, and which learning rate should I use? In my experience, a high learning rate during cross-entropy training is desirable to prevent (to some extent) fitting the label noise. Furthermore, when applying the DMI loss, I can train with a learning rate of 0.00001 as suggested in the paper, right?
Many thanks in advance!

Best,
Diego.
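
For anyone landing here, a minimal PyTorch-style sketch of the two-stage schedule described in the question (cross-entropy pre-training, then fine-tuning with a DMI-style loss at a small learning rate). The model, data, epoch counts and learning rates below are placeholders for illustration, not the repo's actual settings:

import torch
import torch.nn.functional as F

def dmi_loss(logits, targets, num_classes):
    # Batch-level loss: -log|det(U)| with U = softmax(logits)^T one_hot(targets) / N.
    probs = F.softmax(logits, dim=1)
    onehot = F.one_hot(targets, num_classes).float()
    U = probs.t() @ onehot / logits.size(0)
    return -torch.log(torch.abs(torch.det(U)) + 1e-3)

def run_stage(model, loader, epochs, lr, loss_fn):
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

# Toy model and data so the sketch runs end to end; replace with a real
# network and a CIFAR-10/100 DataLoader in practice.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
dataset = torch.utils.data.TensorDataset(torch.randn(256, 3, 32, 32),
                                         torch.randint(0, 10, (256,)))
loader = torch.utils.data.DataLoader(dataset, batch_size=64, shuffle=True)

# Stage 1: pre-train with cross-entropy (placeholder epochs / learning rate).
run_stage(model, loader, epochs=2, lr=0.1, loss_fn=F.cross_entropy)
# Stage 2: continue from the CE-trained weights with the DMI-style loss and a small lr.
run_stage(model, loader, epochs=2, lr=1e-5,
          loss_fn=lambda out, y: dmi_loss(out, y, num_classes=10))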

Error in fashion.py

When I run main.py, it throws an error in fashion.py while defining the conf_matrix.
Could you walk me through the steps needed to run your code?

It seems that this new loss can also be used for clean data?

Wonderful work! It seems that this new loss could also be used on clean data? Unlike the cross-entropy loss, which computes a loss for each example separately, this loss takes into account the class distribution information across the examples in a batch. Maybe it can achieve better performance. Have you tried it? @CSPSY @Newbeeer
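
To make the difference concrete (a toy sketch, not the repo's code): cross-entropy produces one term per example that is then averaged, whereas the DMI-style loss is a single scalar computed from the batch-level joint matrix of predictions and labels, so it only carries information at the batch level.

import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(8, 3)                    # toy batch: 8 examples, 3 classes
labels = torch.randint(0, 3, (8,))

# Cross-entropy: one loss value per example, usually averaged afterwards.
per_example_ce = F.cross_entropy(logits, labels, reduction='none')
print(per_example_ce.shape)                   # torch.Size([8])

# DMI-style loss: a single scalar from the batch joint matrix U = p^T y / N.
probs = F.softmax(logits, dim=1)
onehot = F.one_hot(labels, 3).float()
U = probs.t() @ onehot / logits.size(0)
print((-torch.log(torch.abs(torch.det(U)) + 1e-3)).shape)   # torch.Size([]) - one value per batch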

Clothing1M dataset download

Hi, I would like to try the Clothing1M dataset. Could you tell me where I can download it?
