pumpikano / tf-dann


Domain-Adversarial Neural Network in Tensorflow

License: MIT License

Jupyter Notebook 98.71% Python 1.29%

domain-adaptation tensorflow-models adversarial-learning


tf-dann's Issues

Logits and label size error

Hi Pumpikano,

With your help I was able to run the code on the MNIST dataset. I am now trying to train the model on another dataset and have tried to modify the code accordingly. I would be very thankful if you could help me solve this issue.
The code is:

# matplotlib inline

import tensorflow as tf
import numpy as np
import cPickle as pkl
from sklearn.manifold import TSNE

from flip_gradient import flip_gradient
from utils import *
from numpy import genfromtxt
import pdb

from tensorflow.examples.tutorials.mnist import input_data

# mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

# Process MNIST
# mnist_train = (mnist.train.images > 0).reshape(55000, 28, 28, 1).astype(np.uint8) * 255
# print(mnist_train.shape)
# mnist_train = np.concatenate([mnist_train, mnist_train, mnist_train], 3)
# mnist_test = (mnist.test.images > 0).reshape(10000, 28, 28, 1).astype(np.uint8) * 255
# mnist_test = np.concatenate([mnist_test, mnist_test, mnist_test], 3)

mnist_train = genfromtxt('iv_train.csv', delimiter=',')
mnist_test = genfromtxt('iv_train.csv', delimiter=',')

mnist_train_labels = genfromtxt('iv_train_label.csv', delimiter=',')
mnist_test_labels = genfromtxt('iv_train_label.csv', delimiter=',')

mnist_train = mnist_train.reshape(5119, 20, 20, 1)
mnist_train = np.concatenate([mnist_train, mnist_train, mnist_train], 3)
mnist_test = mnist_test.reshape(5119, 20, 20, 1)
mnist_test = np.concatenate([mnist_test, mnist_test, mnist_test], 3)
print 'Source train data size',mnist_train.shape
print 'Source test data size',mnist_train.shape

# pdb.set_trace()

# Load MNIST-M
# mnistm = pkl.load(open('mnistm_data.pkl'))
# mnistm_train = mnistm['train']
# mnistm_test = mnistm['test']
# mnistm_valid = mnistm['valid']

mnistm_train = genfromtxt('iv_test_half.csv', delimiter=',')
mnistm_test = genfromtxt('iv_test_half.csv', delimiter=',')
mnistm_valid = genfromtxt('iv_test_half.csv', delimiter=',')

mnistm_train = mnistm_train.reshape(9000, 20, 20, 1)
mnistm_train = np.concatenate([mnistm_train, mnistm_train, mnistm_train], 3)
mnistm_test = mnistm_test.reshape(9000, 20, 20, 1)
mnistm_test = np.concatenate([mnistm_test, mnistm_test, mnistm_test], 3)
mnistm_valid = mnistm_valid.reshape(9000, 20, 20, 1)

# mnistm_valid = np.concatenate([mnistm_valid, mnistm_valid, mnistm_valid], 3)

print 'Source train data size',mnistm_train.shape
print 'Source test data size',mnistm_test.shape
print 'Source test data size',mnistm_valid.shape

# pdb.set_trace()

# Compute pixel mean for normalizing data
pixel_mean = np.vstack([mnist_train, mnistm_train]).mean((0, 1, 2))

# Create a mixed dataset for TSNE visualization

num_test = 500
combined_test_imgs = np.vstack([mnist_test[:num_test], mnistm_test[:num_test]])
combined_test_labels = np.vstack([mnist_test_labels[:num_test], mnist_test_labels[:num_test]])
combined_test_domain = np.vstack([np.tile([1., 0.], [num_test, 1]),
                                  np.tile([0., 1.], [num_test, 1])])

# imshow_grid(mnist_train)
# imshow_grid(mnistm_train)

batch_size = 12

class MNISTModel(object):
    """Simple MNIST domain adaptation model."""
    def __init__(self):
        self._build_model()

    def _build_model(self):

        self.X = tf.placeholder(tf.uint8, [None, 20, 20, 3])
        self.y = tf.placeholder(tf.float32, [None, 20])
        self.domain = tf.placeholder(tf.float32, [None, 2])
        self.l = tf.placeholder(tf.float32, [])
        self.train = tf.placeholder(tf.bool, [])

        X_input = tf.cast(self.X, tf.float32) - pixel_mean

        # CNN model for feature extraction
        with tf.variable_scope('feature_extractor'):

            W_conv0 = weight_variable([5, 5, 3, 32])
            b_conv0 = bias_variable([32])
            h_conv0 = tf.nn.relu(conv2d(X_input, W_conv0) + b_conv0)
            h_pool0 = max_pool_2x2(h_conv0)

            W_conv1 = weight_variable([5, 5, 32, 48])
            b_conv1 = bias_variable([48])
            h_conv1 = tf.nn.relu(conv2d(h_pool0, W_conv1) + b_conv1)
            h_pool1 = max_pool_2x2(h_conv1)

            # The domain-invariant feature
            self.feature = tf.reshape(h_pool1, [-1, 4*8*75])

        # MLP for class prediction
        with tf.variable_scope('label_predictor'):

            # Switches to route target examples (second half of batch) differently
            # depending on train or test mode.
            all_features = lambda: self.feature
            source_features = lambda: tf.slice(self.feature, [0, 0], [batch_size, -1])
            classify_feats = tf.cond(self.train, source_features, all_features)

            all_labels = lambda: self.y
            source_labels = lambda: tf.slice(self.y, [0, 0], [batch_size, -1])
            self.classify_labels = tf.cond(self.train, source_labels, all_labels)

            W_fc0 = weight_variable([4 * 8 * 75, 100])
            b_fc0 = bias_variable([100])
            h_fc0 = tf.nn.relu(tf.matmul(classify_feats, W_fc0) + b_fc0)

            W_fc1 = weight_variable([100, 100])
            b_fc1 = bias_variable([100])
            h_fc1 = tf.nn.relu(tf.matmul(h_fc0, W_fc1) + b_fc1)

            W_fc2 = weight_variable([100, 20])
            b_fc2 = bias_variable([20])
            #logits = tf.placeholder(tf.float32, [12, 20])
            logits = tf.matmul(h_fc1, W_fc2) + b_fc2
            print 'logit_size', logits.get_shape()

            self.pred = tf.nn.softmax(logits)
            self.pred_loss = tf.nn.softmax_cross_entropy_with_logits(logits, self.classify_labels)

        # Small MLP for domain prediction with adversarial loss
        with tf.variable_scope('domain_predictor'):

            # Flip the gradient when backpropagating through this operation
            feat = flip_gradient(self.feature, self.l)

            d_W_fc0 = weight_variable([4*8*75, 100])
            d_b_fc0 = bias_variable([100])
            d_h_fc0 = tf.nn.relu(tf.matmul(feat, d_W_fc0) + d_b_fc0)

            d_W_fc1 = weight_variable([100, 2])
            d_b_fc1 = bias_variable([2])
            d_logits = tf.matmul(d_h_fc0, d_W_fc1) + d_b_fc1

            self.domain_pred = tf.nn.softmax(d_logits)
            self.domain_loss = tf.nn.softmax_cross_entropy_with_logits(d_logits, self.domain)

# Build the model graph
graph = tf.get_default_graph()
with graph.as_default():
    model = MNISTModel()

    learning_rate = tf.placeholder(tf.float32, [])

    pred_loss = tf.reduce_mean(model.pred_loss)
    domain_loss = tf.reduce_mean(model.domain_loss)
    total_loss = pred_loss + domain_loss

    regular_train_op = tf.train.MomentumOptimizer(learning_rate, 0.9).minimize(pred_loss)
    dann_train_op = tf.train.MomentumOptimizer(learning_rate, 0.9).minimize(total_loss)

    # Evaluation
    correct_label_pred = tf.equal(tf.argmax(model.classify_labels, 1), tf.argmax(model.pred, 1))
    label_acc = tf.reduce_mean(tf.cast(correct_label_pred, tf.float32))
    correct_domain_pred = tf.equal(tf.argmax(model.domain, 1), tf.argmax(model.domain_pred, 1))
    domain_acc = tf.reduce_mean(tf.cast(correct_domain_pred, tf.float32))

# Params

num_steps = 8600

def train_and_evaluate(training_mode, graph, model, verbose=False):
    """Helper to run the model with different training modes."""

    with tf.Session(graph=graph) as sess:
        tf.initialize_all_variables().run()

        # Batch generators
        gen_source_batch = batch_generator(
            [mnist_train, mnist_train_labels], batch_size)
        gen_target_batch = batch_generator(
            [mnistm_train, mnist_train_labels], batch_size)
        gen_source_only_batch = batch_generator(
            [mnist_train, mnist_train_labels], batch_size)
        gen_target_only_batch = batch_generator(
            [mnistm_train, mnist_train_labels], batch_size)

        domain_labels = np.vstack([np.tile([1., 0.], [batch_size, 1]),
                                   np.tile([0., 1.], [batch_size, 1])])

        # Training loop
        for i in range(num_steps):

            # Adaptation param and learning rate schedule as described in the paper
            p = float(i) / num_steps
            l = 2. / (1. + np.exp(-10. * p)) - 1
            lr = 0.01 / (1. + 10 * p)**0.75

            # Training step
            if training_mode == 'dann':
                #pdb.set_trace()
                X0, y0 = gen_source_batch.next()
                pdb.set_trace()
                X1, y1 = gen_target_batch.next()

                X = np.vstack([X0, X1])
                y = np.vstack([y0, y1])

                _, batch_loss, dloss, ploss, d_acc, p_acc = \
                    sess.run([dann_train_op, total_loss, domain_loss, pred_loss, domain_acc, label_acc],
                             feed_dict={model.X: X, model.y: y, model.domain: domain_labels,
                                        model.train: True, model.l: l, learning_rate: lr})

                if verbose and i % 100 == 0:
                    print 'loss: %f  d_acc: %f  p_acc: %f  p: %f  l: %f  lr: %f' % \
                            (batch_loss, d_acc, p_acc, p, l, lr)

            elif training_mode == 'source':

                X, y = gen_source_only_batch.next()
                print 'label_size', y.shape
                pdb.set_trace()

                _, batch_loss = sess.run([regular_train_op, pred_loss],
                                     feed_dict={model.X: X, model.y: y, model.train: False,
                                                model.l: l, learning_rate: lr})

            elif training_mode == 'target':
                X, y = gen_target_only_batch.next()
                _, batch_loss = sess.run([regular_train_op, pred_loss],
                                     feed_dict={model.X: X, model.y: y, model.train: False,
                                                model.l: l, learning_rate: lr})

        # Compute final evaluation on test data
        source_acc = sess.run(label_acc,
                            feed_dict={model.X: mnist_test, model.y: mnist_test_labels,
                                       model.train: False})

        target_acc = sess.run(label_acc,
                            feed_dict={model.X: mnistm_test, model.y: mnist_test_labels,
                                       model.train: False})

        test_domain_acc = sess.run(domain_acc,
                            feed_dict={model.X: combined_test_imgs,
                                       model.domain: combined_test_domain, model.l: 1.0})

        test_emb = sess.run(model.feature, feed_dict={model.X: combined_test_imgs})

    return source_acc, target_acc, test_domain_acc, test_emb

print '\nSource only training'
source_acc, target_acc, _, source_only_emb = train_and_evaluate('source', graph, model)
print 'Source (MNIST) accuracy:', source_acc
print 'Target (MNIST-M) accuracy:', target_acc
pdb.set_trace()

print '\nDomain adaptation training'
source_acc, target_acc, d_acc, dann_emb = train_and_evaluate('dann', graph, model)
print 'Source (MNIST) accuracy:', source_acc
print 'Target (MNIST-M) accuracy:', target_acc
print 'Domain accuracy:', d_acc

tsne = TSNE(perplexity=30, n_components=2, init='pca', n_iter=3000)
source_only_tsne = tsne.fit_transform(source_only_emb)

tsne = TSNE(perplexity=30, n_components=2, init='pca', n_iter=3000)
dann_tsne = tsne.fit_transform(dann_emb)

plot_embedding(source_only_tsne, combined_test_labels.argmax(1), combined_test_domain.argmax(1), 'Source only')
plot_embedding(dann_tsne, combined_test_labels.argmax(1), combined_test_domain.argmax(1), 'Domain Adaptation')

The error I am getting is:

Source train data size (5119, 20, 20, 3)
Source test data size (5119, 20, 20, 3)
Source train data size (9000, 20, 20, 3)
Source test data size (9000, 20, 20, 3)
Source test data size (9000, 20, 20, 1)
logit_size (?, 20)
logit_size (?, 20)
logit_size (?, 20)

Source only training
label_size (12, 20)

/usr/local/lib/python3.4/dist-packages/tensorflow/examples/tutorials/tf-dann-master/tf-DANN-LID-code_test.py(235)train_and_evaluate()
-> _, batch_loss = sess.run([regular_train_op, pred_loss],
(Pdb) c
Traceback (most recent call last):
File "tf-DANN-LID-code_test.py", line 265, in
print 'Source (MNIST) accuracy:', source_acc
File "tf-DANN-LID-code_test.py", line 235, in train_and_evaluate
feed_dict={model.X: X, model.y: y, model.train: False,
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 710, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 908, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 958, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 978, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: logits and labels must be same size: logits_size=[6,20] labels_size=[12,20]
[[Node: label_predictor/SoftmaxCrossEntropyWithLogits = SoftmaxCrossEntropyWithLogits[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](label_predictor/add_2, label_predictor/cond_1/Merge)]]
Caused by op u'label_predictor/SoftmaxCrossEntropyWithLogits', defined at:
File "tf-DANN-LID-code_test.py", line 160, in
File "tf-DANN-LID-code_test.py", line 80, in init
self._build_model()
File "tf-DANN-LID-code_test.py", line 137, in _build_model
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 491, in softmax_cross_entropy_with_logits
precise_logits, labels, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 1427, in _softmax_cross_entropy_with_logits
features=features, labels=labels, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 703, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2317, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1239, in init
self._traceback = _extract_stack()

The label size is (12, 20) but the logit size is (?, 20). How can I fix this issue? Many thanks.
Regards
Saad
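
A possible cause, sketched under an assumption: if `conv2d` and `max_pool_2x2` in utils.py use SAME padding with 2x2 pooling (which would match the 7*7*48 feature size used for 28x28 inputs elsewhere on this page), two pooling steps turn a 20x20 input into 5x5x48 = 1200 values per example. Reshaping to `[-1, 4*8*75]` (2400 columns) would then fold two examples into each row, which reproduces the 12 versus 6 mismatch in the error message. The array below is only a stand-in for `h_pool1`, not real data.

```python
import numpy as np

batch_size = 12

# Stand-in for h_pool1: 20x20 input after two 2x2 poolings -> 5x5, 48 channels
# (assumes SAME padding; this is not confirmed by the posted code).
h_pool1 = np.zeros((batch_size, 5, 5, 48), dtype=np.float32)

# Reshape as written in the posted code: 2400 columns per row.
wrong = h_pool1.reshape(-1, 4 * 8 * 75)

# Reshape that keeps one row per example: 1200 columns per row.
right = h_pool1.reshape(-1, 5 * 5 * 48)

print(wrong.shape)  # (6, 2400)  -> batch dimension is halved
print(right.shape)  # (12, 1200) -> batch dimension is preserved
```

If the assumption holds, the first dimension of `W_fc0` and `d_W_fc0` would need to change to the same value (5*5*48) as the reshape.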

Need an explanation of the model

Hi, thanks for your nice code!
I'm trying to re-implement this code in PyTorch,
but I can't understand the models 'label_predictor' and 'domain_predictor':

            W_fc0 = weight_variable([7 * 7 * 48, 100])
            b_fc0 = bias_variable([100])
            h_fc0 = tf.nn.relu(tf.matmul(classify_feats, W_fc0) + b_fc0)

            W_fc1 = weight_variable([100, 100])
            b_fc1 = bias_variable([100])
            h_fc1 = tf.nn.relu(tf.matmul(h_fc0, W_fc1) + b_fc1)

            W_fc2 = weight_variable([100, 10])
            b_fc2 = bias_variable([10])
            logits = tf.matmul(h_fc1, W_fc2) + b_fc2

In label_predictor, it looks like there is no conv or FC layer.
Can you explain this network?
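
For readers porting the snippet above: each `tf.matmul(x, W) + b` line is a fully connected layer, so the label predictor is a three-layer MLP (2352 -> 100 -> 100 -> 10) with ReLU after the first two layers. The NumPy sketch below only traces the shapes; the `dense` helper and the random weights are illustrative assumptions, not part of the repository.

```python
import numpy as np

rng = np.random.RandomState(0)

def dense(x, in_dim, out_dim, relu=True):
    """Fully connected layer: x.dot(W) + b, optionally followed by ReLU.

    Weights are random and purely illustrative.
    """
    W = rng.normal(scale=0.1, size=(in_dim, out_dim))
    b = np.full(out_dim, 0.1)
    out = x.dot(W) + b
    return np.maximum(out, 0.0) if relu else out

# Five fake feature vectors with the flattened size used in the snippet.
features = rng.normal(size=(5, 7 * 7 * 48))

h_fc0 = dense(features, 7 * 7 * 48, 100)      # matches W_fc0 / b_fc0
h_fc1 = dense(h_fc0, 100, 100)                # matches W_fc1 / b_fc1
logits = dense(h_fc1, 100, 10, relu=False)    # matches W_fc2 / b_fc2, no ReLU

print(logits.shape)  # (5, 10)
```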

How to compute prediction and transformed features?

Hi Pumpikano,

Thanks for answering my questions. I am trying to get the prediction values instead of the predicted labels and accuracy. To do this, I modified the code by changing the line
correct_label_pred = tf.equal(tf.argmax(model.classify_labels, 1), tf.argmax(model.pred, 1))
to
label_acc_lk = model.pred

I am getting the predictions, but I am wondering whether this is the correct way to do it.

Many thanks

Regards
Saad
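
In the code posted in the first issue on this page, `model.pred` is defined as `tf.nn.softmax(logits)`, so fetching it should give one row of class probabilities per example, and `tf.argmax(model.pred, 1)` gives the predicted label. The NumPy sketch below shows the relationship on made-up logits; the `softmax` helper is an illustration, not the TensorFlow implementation.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Made-up logits for 2 examples and 3 classes (illustrative only).
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])

probs = softmax(logits)         # analogous to model.pred
labels = probs.argmax(axis=1)   # analogous to tf.argmax(model.pred, 1)

print(probs.sum(axis=1))  # each row sums to 1
print(labels)             # index of the most probable class per row
```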

having trouble with convergence

Hi, first off thank you for the wonderful code. I am trying to replicate the toy blob example in pytorch. I am finding that it unreliably converges to the same accuracies that you report. Sometimes it will not converge at all, and other times it will get to the 97% source/97% target accuracy. Also, the source-only training yields a 50% accuracy on target domain. I was wondering if there were any snags you encountered that hindered convergence?

Thanks

Austin

how to set params?

How do I set l in grad(x, l)? What does it mean when i is bigger?
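
For reference, the training loop quoted in the first issue on this page computes `l` and the learning rate from the step index `i` as shown below, so `l` grows from 0 towards 1 as `i` increases. The function names are hypothetical wrappers around those two formulas.

```python
import numpy as np

def adaptation_param(step, num_steps):
    """Gradient-reversal weight l for a given training step (formula from the training loop)."""
    p = float(step) / num_steps
    return 2.0 / (1.0 + np.exp(-10.0 * p)) - 1.0

def learning_rate(step, num_steps):
    """Learning rate for a given training step (formula from the training loop)."""
    p = float(step) / num_steps
    return 0.01 / (1.0 + 10 * p) ** 0.75

num_steps = 8600
for step in (0, 860, 4300, 8599):
    # l starts at 0 and approaches 1; the learning rate decays from 0.01.
    print(step, adaptation_param(step, num_steps), learning_rate(step, num_steps))
```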

total loss

Hi
First of all, thanks for implementing this. It's really awesome!
I'm currently modifying the code to run it on the OFFICE dataset.
I found a small difference in the definition of the total loss.

In the original paper, the total_loss is defined as:
predict_loss + lambda * domain_loss

In the code, the lambda term seems to be missing.
I think that's one of the reasons the total_loss sometimes blows up to NaN.

How many epochs did you apply?

Thanks for your nice work.

May I ask how many epochs you used for the MNIST->MNIST-M experiment?

Question about the loss function

In the paper, "While the parameters of the classifiers are optimized in order to minimize their error on the training set, the parameters of the underlying deep feature mapping are optimized in order to minimize the loss of the label classifier and to maximize the loss of the domain classifier. "
Why are the loss functions minimized? Shouldn't the loss of the domain classifier be maximized?
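
One way to see how minimizing the summed loss can still maximize the domain loss with respect to the features: the gradient flip acts as the identity in the forward pass and multiplies the gradient by `-l` in the backward pass (as in the `flip_grad_layer` snippet at the end of this page). The scalar toy below is a sketch under simplifying assumptions (squared error instead of cross-entropy, one weight per part, hypothetical names) and applies that rule by hand.

```python
def domain_loss(w, v, x, d):
    """Toy domain loss: feature f = w * x, domain prediction v * f, target d."""
    f = w * x
    return 0.5 * (v * f - d) ** 2

def grads_with_reversal(w, v, x, d, l):
    """Gradients of the toy domain loss with a gradient flip between f and v."""
    f = w * x
    err = v * f - d
    grad_v = err * f              # domain classifier weight: ordinary gradient
    grad_f = err * v              # gradient arriving at the feature
    grad_w = (-l * grad_f) * x    # feature extractor weight: sign flipped, scaled by l
    return grad_w, grad_v

w, v, x, d, l, lr = 0.5, 0.8, 1.0, 1.0, 1.0, 0.1
before = domain_loss(w, v, x, d)
grad_w, grad_v = grads_with_reversal(w, v, x, d, l)

# A plain gradient-descent step on each weight, taken separately:
after_v_step = domain_loss(w, v - lr * grad_v, x, d)   # domain loss goes down
after_w_step = domain_loss(w - lr * grad_w, v, x, d)   # domain loss goes up

print(before, after_v_step, after_w_step)
```

In this toy, the same descent step lowers the domain loss through `v` and raises it through `w`, which is the min/max behaviour the quoted sentence describes. It also suggests why the factor shows up in the gradient rather than in the forward value of the loss.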

Keras implementation?

Hey,

First of all, thanks a lot for this. I was wondering whether there is an easy way to make the gradient flipping work in Keras. Someone has done it for the Theano backend, but not for the TensorFlow backend. Would it be feasible to combine the two?

Thanks!

Is the code in Blobs-DANN.ipynb suitable for Cross-domain sentiment classification work

I am trying to implement DANN for cross-domain sentiment classification (CDSC), which is also reported in the original paper. However, there does not seem to be a specific ipynb file for the CDSC task. Is the DANN structure for CDSC the same as the one in Blobs-DANN? Many thanks.

Licence

Thank you for this nice project!

We (RETURNN) have made use of some code snippets from this project, esp the FlipGradientBuilder. (Or rather, after reading your code, I was not really sure how to write it in a different way.)

Is it possible that you add a licence to your code? Like Apache-2.0 or MIT?

rwth-i6/returnn#796

The code doesn't work at all

Hi there,
I've downloaded the code and dataset, but I couldn't run the ipynb file at all. There seems to be a mistake in every block.
My platform is Windows 10 + python 3.6 + tensorflow 1.3.0.
I also checked on Mac OS 10.13.0 + python 3.6 + tensorflow 1.3.0.
Could you please rerun and check the code yourself?

save the model

Hello,
Great implementation, thanks for sharing! I'd like to save the trained model. Do you have an idea of the simplest way to do this? I have tried with pickle but got an error:
TypeError: can't pickle _thread.lock objects

Best,
Eliott

please

Hey, I have some questions.
First, in your code: total_loss = domain_loss + predict_loss,
but in the paper: total_loss = domain_loss - lambda * predict_loss.
Second, is this because of the GRL layer?

Adapted feature extraction

Hi Pumpikano,

With your help, I am able to get the prediction values (thanks),

but the performance is very poor, so I decided to look at the adapted features on my database. The features are all zeros.
source_acc, target_acc, d_acc, dann_emb = train_and_evaluate('dann', graph, model)

In the line above, 'dann_emb' holds the adapted features, and they are all zeros. There is no error or warning in the code.

Can you please suggest where the problem might be? Thanks a lot.

Best Regards
Saad

Parameter selection

Hi Pumpikano,

Once again, thanks for replying quickly to my previous posts. I have two quick questions:

1- How do I select the parameters to optimize the performance? E.g. in your code (MNIST):
        W_conv0 = weight_variable([5, 5, 3, 32])
        b_conv0 = bias_variable([32])
        h_conv0 = tf.nn.relu(conv2d(X_input, W_conv0) + b_conv0)
        h_pool0 = max_pool_2x2(h_conv0)

        W_conv1 = weight_variable([5, 5, 32, 48])
        b_conv1 = bias_variable([48])
        h_conv1 = tf.nn.relu(conv2d(h_pool0, W_conv1) + b_conv1)
        h_pool1 = max_pool_2x2(h_conv1)

        # The domain-invariant feature
        self.feature = tf.reshape(h_pool1, [-1, 7*7*48])

Why did you select 7*7*48 for the reshape? It has to be a factorization of 2352; why?

2- TSNE works properly on my Ubuntu machine, but when I run the MNIST code it does not plot anything and does not give any error or warning. Please help if you know where to look.

Many Thanks

Best Regards
Saad
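
On the first question, a hedged sketch of where 7*7*48 can come from: assuming the convolutions keep the spatial size (stride 1, SAME padding) and each `max_pool_2x2` halves height and width, rounding up, a 28x28 input becomes 14x14 and then 7x7, with 48 channels from the last convolution. The helper name below is hypothetical, and the padding behaviour is an assumption about utils.py rather than something shown on this page.

```python
import math

def flattened_size(height, width, channels, num_pools):
    """Feature count after `num_pools` 2x2 max-pools (stride 2, SAME padding assumed)."""
    for _ in range(num_pools):
        height = int(math.ceil(height / 2.0))
        width = int(math.ceil(width / 2.0))
    return height * width * channels

# 28x28 MNIST input, two pooling layers, 48 output channels.
print(flattened_size(28, 28, 48, 2))  # 2352, i.e. 7 * 7 * 48
```

Under the same assumption, a different input size changes the spatial part (for example 20x20 gives 5x5), while the channel count stays tied to the last convolution.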

How to train if there are multiple sources?

Suppose we have 3 source domains and 1 target domain. How should I train DANN in this case? Should I combine all 3 sources into a single source, or should I label the 3 domains separately during training?

Strange results after porting from Python2 to Python3

I have tried to fix the cause of the errors discussed in this other issue and found a way to avoid using the "one_hot()" function in tensorflow.
The code runs fine but the results are strange. As I have not touched your code except for three things ("one_hot()" function, division (/) -> (//), object.next() -> next(object)), I cannot find any other reason for those results. Could you please visit my GitHub repository, check whether the results are wrong, and maybe figure out the reason? You can see the results from the three different amendments I tried.

Please help

Hi,

Thanks for sharing the code. I'm new to TensorFlow. I have built the MNIST database as you suggested in your code. Can you please help me run the adaptation code? Which file is the main code for tf-dann?

Many thanks
Regards
Saad

May I open source my code on Multi-Source Domain Adaptation?

Hi, thank you so much for your code. I am interested in domain adaptation and have run your code. I also implemented several experiments on Multi-Source Domain Adaptation. As a beginner in TensorFlow, I borrowed several lines of your code. May I open source my code on Multi-Source Domain Adaptation? Thank you very much.

How to create / download synthetic digits dataset?

Hi,
I'm trying to reproduce some DA results using the synthetic digits dataset (not the MNIST-M one).
According to this paper:
https://vision.in.tum.de/_media/spezial/bib/haeusser_iccv_17.pdf
the synthetic digits -> MNIST experiment was run on a synthetic digits dataset provided by you.
The same dataset seems to be used in your "Domain-Adversarial Training of Neural Networks" work, and it is mentioned in this GitHub repo's README:
"The MNIST-DANN.ipynb recreates the MNIST experiment from the papers on a synthetic dataset"
But I cannot find code for generating it or a place to download it (I would like to get the very same dataset for a fair comparison).
I thank you in advance!

DANN for Regression

Hi,
I have changed this project to try domain-adversarial learning on a regression problem. Specifically, I replaced the label predictor with a regressor. But it just makes the regression accuracy worse. Do you have any idea why this happens?

Improving results for MNIST-M?

I was able to reproduce the main result on the MNIST-M dataset (basically ~0.5 acc -> ~0.7). What architectural strategies are available to improve it even further?

mnistm labels

Hi,

Thanks for your code. I found that you use the MNIST labels with the MNIST-M dataset even when you are creating the target-only batch. Can you explain why that is?

Thanks

Theoretical question - Domain accuracy

Hello,

Thanks for sharing this code! I have a couple of questions, more theoretical than practical :)

When I run the MNIST example for 86000 steps (10 times more than the original), I get very poor results (around 20% accuracy on the source and target sets), so I am wondering what the problem is. I ran it a couple more times and got OK results, but I am wondering what leads to poor results in some runs.

First, I am not sure what I should expect for domain accuracy: should it be around 50%, or is it not important? I think it is better to look at the domain predictions, because I should expect probabilities around 50% (the domain discriminator is not sure which example is source and which is target), so maybe the accuracy itself is not important. I think the poor results arise when the domain discriminator predicts only one class with probability 1 (the loss is then huge, so it ruins the feature extractor and the classification accuracy).

Another thing that confuses me is whether DANN can get good results even if the domain accuracy is 1, i.e. the domain discriminator can tell source from target examples (with high probability), yet the results on target data are good.

Generally, I am wondering how I can track whether my DANN works OK, and which metrics to use.

I hope you can help me; maybe I have misunderstood the concept entirely :)

Thanks a lot!

Does ordering matter?

Does the relative ordering of training examples matter between source and target examples? I am trying to port this to another DL framework (which has its own inbuilt MNIST dataset).

batch_generator doesn't have next()

Hello,

I am running DANN but there's a problem like this:
AttributeError: 'generator' object has no attribute 'next'
(more specifically, batch_generator in util.py)

If I am doing something wrong, let me know :)
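
For context, `gen.next()` is the Python 2 generator method; Python 3 removed it in favour of the built-in `next(gen)`, which also works on Python 2.6+. The generator below is a hypothetical stand-in for the helper in the repository, written only to show the call.

```python
def batch_generator(items, batch_size):
    """Hypothetical stand-in: yields consecutive slices of `items`, wrapping around forever."""
    start = 0
    while True:
        if start + batch_size > len(items):
            start = 0
        yield items[start:start + batch_size]
        start += batch_size

gen = batch_generator(list(range(10)), 4)

# Built-in next() instead of gen.next(): works on Python 2.6+ and Python 3.
first = next(gen)
second = next(gen)

print(first)   # [0, 1, 2, 3]
print(second)  # [4, 5, 6, 7]
```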

feeding domain value to the model

I assume you want to transfer the label of the mnist image to the mnistm images given that you have only the label of mnist data. So when you create the generator you used,

    gen_target_batch = batch_generator(
        [mnistm_train, mnist.train.labels], batch_size / 2)

Here you feed label data for the target domain. Why? Isn't that what we are supposed to learn? Why are we feeding labeled data for the target domain? Isn't it supposed to be empty? Please pardon me if I made a mistake while skimming through your code.
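
A sketch of what the model code quoted in the first issue on this page appears to do with those labels: in train mode, `source_labels` slices only the first rows of `self.y`, so the target labels stacked into the second half of the batch never reach the prediction loss during training. In that code they are only compared against predictions when target accuracy is evaluated. The sizes and label values below are made up for illustration.

```python
import numpy as np

half = 3          # examples per domain in one mixed batch (illustrative)
num_classes = 4

y_source = np.eye(num_classes)[[0, 1, 2]]   # one-hot source labels
y_target = np.eye(num_classes)[[3, 3, 3]]   # target labels, only filling the batch
y = np.vstack([y_source, y_target])         # fed as model.y

# NumPy equivalent of tf.slice(self.y, [0, 0], [half, -1]) used in train mode.
classify_labels = y[:half]

print(classify_labels.shape)  # (3, 4): only the source half enters the loss
```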

simplify flip_gradient

Since tensorflow 1.7 there is a new way to redefine the gradient.

@tf.custom_gradient
def flip_grad_layer(x, l):
    def grad(dy):
        return tf.negative(dy) * l, None
    return tf.identity(x), grad
