
ttur's Introduction

Two time-scale update rule for training GANs

This repository contains code accompanying the paper GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium.

Fréchet Inception Distance (FID)

The FID is the performance measure used to evaluate the experiments in the paper. A detailed description can be found in the experiments section of the paper as well as in Appendix A1.

In short: The Fréchet distance between two multivariate Gaussians X_1 ~ N(mu_1, C_1) and X_2 ~ N(mu_2, C_2) is

                   d^2 = ||mu_1 - mu_2||^2 + Tr(C_1 + C_2 - 2*sqrt(C_1*C_2)).

The FID is calculated by assuming that X_1 and X_2 are the activations of the pool_3 coding layer of the Inception model (see below) for generated samples and real-world samples, respectively. mu_n is the mean and C_n the covariance of the activations of the coding layer over all real-world or generated samples.
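As a rough illustration, the distance between the two Gaussians can be computed with NumPy and SciPy as in the sketch below. This is a minimal sketch of the formula above, not the exact code in fid.py; the regularization of a near-singular product and the handling of a small imaginary component mirror behaviour described in the issues further down this page.

import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2, eps=1e-6):
    """Squared Frechet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # Matrix square root of the product of the two covariance matrices.
    covmean, _ = linalg.sqrtm(sigma1.dot(sigma2), disp=False)
    if not np.isfinite(covmean).all():
        # The product was (close to) singular; add a small offset to the diagonals.
        offset = np.eye(sigma1.shape[0]) * eps
        covmean = linalg.sqrtm((sigma1 + offset).dot(sigma2 + offset))
    # Numerical error can introduce a tiny imaginary component; discard it.
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return diff.dot(diff) + np.trace(sigma1) + np.trace(sigma2) - 2 * np.trace(covmean)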

IMPORTANT: The number of samples used to calculate the Gaussian statistics (mean and covariance) should be greater than the dimension of the coding layer, here 2048 for the Inception pool_3 layer. Otherwise the covariance matrix is not full rank, which results in complex numbers and NaNs when calculating the matrix square root.

We recommend using a minimum sample size of 10,000 to calculate the FID; otherwise the true FID of the generator is underestimated.

Compatibility notice

Previous versions of this repository contained two implementations of the FID calculation, an "unbatched" and a "batched" version. The "unbatched" version should no longer be used. If you downloaded this code previously, please update it to the new version immediately: the old version contained a bug!

A PyTorch implementation of the FID

If you are looking for a PyTorch implementation, we recommend https://github.com/mseitzer/pytorch-fid

Provided Code

Requirements: TF 1.1+, Python 3.x

fid.py

This file contains the implementation of all functions needed to calculate the FID. It can be used either as a Python module imported into your own code, or as a standalone script to calculate the FID between precalculated (training set) statistics and a directory of images, or between two directories of images.

To compare directories with pre-calculated statistics (e.g. the ones from http://bioinf.jku.at/research/ttur/), use:

fid.py /path/to/images /path/to/precalculated_stats.npz

To compare two directories, use:

fid.py /path/to/images /path/to/other_images

See fid.py --help for more details.

fid_example.py

Example code to show the usage of fid.py in your own Python scripts.
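To give a rough idea, using fid.py as a module might look like the sketch below. This is only an illustration under assumptions: the function names create_inception_graph, calculate_activation_statistics and calculate_frechet_distance are taken from the code quoted elsewhere on this page, the model path and image arrays are placeholders, and fid_example.py remains the authoritative reference.

import numpy as np
import tensorflow as tf
import fid

# Placeholder arrays; in practice these are your real and generated images,
# shaped (N, H, W, 3) with pixel values in [0, 255]. Use far more samples in
# practice (see the note on sample size above).
images_real = np.random.randint(0, 256, size=(64, 64, 64, 3)).astype(np.float32)
images_gen = np.random.randint(0, 256, size=(64, 64, 64, 3)).astype(np.float32)

# Load the Inception graph downloaded from the link under "Additional Links".
fid.create_inception_graph("/path/to/classify_image_graph_def.pb")  # placeholder path

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    mu_real, sigma_real = fid.calculate_activation_statistics(images_real, sess, batch_size=32)
    mu_gen, sigma_gen = fid.calculate_activation_statistics(images_gen, sess, batch_size=32)

fid_value = fid.calculate_frechet_distance(mu_gen, sigma_gen, mu_real, sigma_real)
print("FID:", fid_value)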

precalc_stats_example.py

Example code to show how to calculate and save training set statistics.
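The idea is to compute mu and sigma once over the training set and store them so they can be reused as the second argument to fid.py. A rough sketch under assumptions (the paths are placeholders, imageio stands in for whatever image reader you prefer, and the npz key names mu and sigma are assumed; precalc_stats_example.py is the authoritative version):

import glob
import numpy as np
import tensorflow as tf
from imageio import imread
import fid

# Load the training images (placeholder path; all images must have the same size).
image_files = glob.glob("/path/to/training_images/*.png")
images = np.array([imread(f).astype(np.float32) for f in image_files])

fid.create_inception_graph("/path/to/classify_image_graph_def.pb")  # placeholder path
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    mu, sigma = fid.calculate_activation_statistics(images, sess, batch_size=100)

# Store the statistics for later FID calculations.
np.savez_compressed("fid_stats.npz", mu=mu, sigma=sigma)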

WGAN_GP

Improved WGAN (WGAN-GP) implementation forked from https://github.com/igul222/improved_wgan_training, with added FID evaluation for the image model and switchable TTUR/original settings. The language model adds JSD Tensorboard logging and switchable TTUR/original settings.

Precalculated Statistics for FID calculation

Precalculated statistics for several datasets are provided at: http://bioinf.jku.at/research/ttur/

Additional Links

For FID evaluation, download the Inception model from http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz

The cropped CelebA dataset can be downloaded here: http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html

To download the LSUN bedroom dataset go to: http://www.yf.io/p/lsun

The 64x64 downsampled ImageNet training and validation datasets can be found here: http://image-net.org/small/download.php

ttur's People

Contributors

3288103265, hubira, mhamilton723, mhex, pratjosh9, pyrestone, untom, wendykan, zejianli


ttur's Issues

consulting

Hello Martin!
The work is great, but I want to ask whether the FID can be used in the medical imaging field. After all, the Inception model is trained on natural images. Thank you very much!

KeyError: "The name 'FID_Inception_Net/pool_3:0' refers to a Tensor which does not exist. The operation, 'FID_Inception_Net/pool_3', does not exist in the graph."

Hello, thanks for your wonderful work. I am running the code in Colab and recently ran into an error, but I don't know how to fix it.
Here is the whole message:

create inception graph.. ok
calculte FID stats..

KeyError Traceback (most recent call last)
in ()
6 with tf.compat.v1.Session() as Sess:
7 Sess.run(tf.compat.v1.global_variables_initializer())
----> 8 mu_gen, sigma_gen = fid.calculate_activation_statistics(images, Sess, batch_size=100)
9
10 fid.fid_value = calculate_frechet_distance(mu_gen, sigma_gen, mu_real, sigma_real)

5 frames
/content/TTUR/fid.py in calculate_activation_statistics(images, sess, batch_size, verbose)
178 the incption model.
179 """
--> 180 act = get_activations(images, sess, batch_size, verbose)
181 mu = np.mean(act, axis=0)
182 sigma = np.cov(act, rowvar=False)

/content/TTUR/fid.py in get_activations(images, sess, batch_size, verbose)
81 activations of the given tensor when feeding inception with the query tensor.
82 """
---> 83 inception_layer = _get_inception_layer(sess)
84 n_images = images.shape[0]
85 if batch_size > n_images:

/content/TTUR/fid.py in _get_inception_layer(sess)
47 """Prepares inception net for batched usage and returns pool_3 layer. """
48 layername = 'FID_Inception_Net/pool_3:0'
---> 49 pool3 = sess.graph.get_tensor_by_name(layername)
50 ops = pool3.graph.get_operations()
51 for op_idx, op in enumerate(ops):

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py in get_tensor_by_name(self, name)
3900 raise TypeError("Tensor names are strings (or similar), not %s." %
3901 type(name).name)
-> 3902 return self.as_graph_element(name, allow_tensor=True, allow_operation=False)
3903
3904 def _get_tensor_by_tf_output(self, tf_output):

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py in as_graph_element(self, obj, allow_tensor, allow_operation)
3724
3725 with self._lock:
-> 3726 return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
3727
3728 def _as_graph_element_locked(self, obj, allow_tensor, allow_operation):

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py in _as_graph_element_locked(self, obj, allow_tensor, allow_operation)
3766 raise KeyError("The name %s refers to a Tensor which does not "
3767 "exist. The operation, %s, does not exist in the "
-> 3768 "graph." % (repr(name), repr(op_name)))
3769 try:
3770 return op.outputs[out_n]

KeyError: "The name 'FID_Inception_Net/pool_3:0' refers to a Tensor which does not exist. The operation, 'FID_Inception_Net/pool_3', does not exist in the graph."

Large System memory requirements when running FID

Hi, when using your script to compute the FID between two folders of 10k images (as recommended), the system memory requirement of the script is gigantic.

I have looked into the script and seen that you load all of the images into RAM before starting to evaluate the statistics. Is there a way to handle this in batched form, or even to compute a running version of the statistics?

That way it might take a little longer to compute, but it would definitely relax the memory requirements by a large amount, which in turn would speed up the script by reducing the swap space needed.

Thanks in advance.

P.S.: If you say this can be done but don't have the time to do it, I can try implementing it myself and open a pull request, but I'd like to know from you whether that would affect the correctness of the algorithm.

FID for GAN trained on MNIST

Hi,
I'm training a DCGAN on the MNIST dataset and want to compute the FID for my model. In this case, should I use the Inception network itself, or should I use a different classifier trained on MNIST?

cannot find create_incpetion_graph()

It seems that there is no create_incpetion_graph() definition in fid.py.
Also, fid_example.py fails with the following error:

AttributeError: module 'fid' has no attribute 'create_incpetion_graph'

Have you met the same problem?

Invalid argument: activation input is not finite. : Tensor had NaN values

Original stack trace for 'FID_Inception_Net/mixed_5/tower_1/conv_1/CheckNumerics':

I am using TensorFlow 1.14.
This problem appears randomly, sometimes in the early stage of training and sometimes later, and it always appears during an FID evaluation.

I don't know the reason, but it doesn't seem to be caused by mode collapse; at least the generated samples look good to the naked eye. Can you help me?

FID get "nan" or "complex number"

Hi,

I'm trying FID.

I get "nan" or "complex number", which stems from the "sp.linalg.sqrtm".

Have you ever faced similar issues? How should I solve this problem?

Thanks a lot.

Sometimes get Imaginary component

Hi, thanks for your wonderful work. I have run into a ValueError recently. It only happens sometimes; if I rerun the code, it disappears. I wonder how I can solve it, and I am not sure why it happens.

Question regarding FID Score

Hello, I'm interested in using your code to evaluate several GAN models I have used so far. However, I have a few questions regarding your implementation of the FID score:

  1. Recently I've been digging into implementations of the FID and found that Google released an evaluation paper using your method (code). However, they used Inception V3 while you still use Inception V1 here. Which one should I use?
  2. From your code, I found that building the pretrained model on our data also needs a batch_size parameter. How does this affect the calculation, and which size is better for a given set of data?

ValueError: Cannot feed value of shape (50, 128, 128, 3) for Tensor 'FID_Inception_Net/ExpandDims:0', which has shape '(1, ?, ?, 3)

I ran into an error, 'tf.Tensor._shape cannot be assigned', so I changed
o._shape = tf.TensorShape(new_shape)
to
o.set_shape(tf.TensorShape(new_shape)), as mentioned in another project's commit.
Unfortunately, while this works fine for that project, this time I still get
ValueError: Cannot feed value of shape (50, 128, 128, 3) for Tensor 'FID_Inception_Net/ExpandDims:0', which has shape '(1, ?, ?, 3)'.
I find that if I print the tensor shape using get_shape() before and after the set_shape call in the function _get_inception_layer, the tensor shape doesn't change. It seems that set_shape doesn't work.
Does anyone have any ideas? Thanks

Is the calculation of the covariance wrong, e.g. sigma = np.cov(act, rowvar=False)?

Hi,
the output of the function "get_activations_from_files" is "A numpy array of dimension (num images, 2048) ...".
That is to say, "act" is a numpy array of dimension (num images, 2048).
In the expression "sigma = np.cov(act, rowvar=False)", rowvar is False.

But I think rowvar should be True, because the first dimension of "act" is "num images", so the dimension of "sigma" should be (num images, num images) instead of (2048, 2048).
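For reference, a quick check of what rowvar=False actually produces (recall from the README above that the pool_3 coding layer has 2048 dimensions, so the FID statistics are covariances between these 2048 features):

import numpy as np

# Dummy activations shaped (num images, 2048), as returned by get_activations_from_files.
act = np.random.randn(5000, 2048)
sigma = np.cov(act, rowvar=False)
print(sigma.shape)  # (2048, 2048): covariance between the 2048 feature dimensions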

Variable not defined when running --lowprofile flag and reshape to batch_size issue

Hello,

I was experimenting with the --lowprofile flag, but an error is thrown saying 'n_images' on line 226 of fid.py is not defined. This is likely a typo, since the author probably meant to reference n_imgs.

Additionally, line 234, pred_arr[start:end] = pred.reshape(batch_size,-1), will crash if the number of images is not divisible by batch_size, which is the case with 2048 images and the default batch_size of 50.

ValueError: setting an array element with a sequence.

Hi,

I'm trying to get the FID score for two image datasets (one real-world, the other GAN-generated).
Sizes: real: 15780, synthetic: 16000

Whenever I execute python fid.py or python precalc_stats_example.py, I get the above-mentioned error. I've attached a screenshot below:

[Screenshot from 2019-07-18 15-31-18]

Any inputs would be appreciated!

Inception score for cifar10

Hi,

I changed dim to 32 and ran gan_64x64.py on the CIFAR-10 dataset, but I got 'IS_mean: 7.0972', 'IS_std: 0.0815', and FID: 32.4961. What is the problem?

FID score in Table I

Hi,

In Table 1, the two FID scores for CelebA are completely identical to those for SVHN. From my own experiments, they are quite different. I would like to cross-check with you whether this is an editing typo. Thanks.

Where are images resized?

It appears that the code never resizes images to the 299x299 resolution expected by the Inception model. Is it the case that all of the results on 64x64 images are obtained by feeding smaller images into the convolutional network and simply assuming that the outputs are meaningful? Or is there a resize somewhere that I'm not seeing?

I also observed that resolution mattered immensely when comparing to the precomputed npz matrices in this repository. In particular, if the images were not 64x64, the FID was extremely high, so I'm assuming those npz matrices were computed by feeding 64x64 images directly into the inception graph.

About fid calculation formula

You use the following formula to calculate the FID:

d^2 = ||mu_1 - mu_2||^2 + Tr(C_1 + C_2 - 2*sqrt(C_1*C_2)).

The code uses dot to compute C_1*C_2, but the original formula seems to be based on matrix multiplication. Could you clarify this?
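For context, for 2-D NumPy arrays dot is matrix multiplication (not the element-wise product), so the code and the formula agree; a quick check with dummy matrices:

import numpy as np

A = np.random.randn(4, 4)
B = np.random.randn(4, 4)
print(np.allclose(A.dot(B), A @ B))  # True: dot is matrix multiplication for 2-D arrays
print(np.allclose(A.dot(B), A * B))  # False: '*' would be the element-wise product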

[Not an issue] Confusion in computing FID with pre-computed statistics.

I am using the CelebA dataset, where for training a GAN I preprocess the images to 64x64.
Furthermore, there are two methods mentioned to compute the FID score:

  1. Method M1 -- run "fid.py <path_to_generated_samples> <path_to_original_images>"
  2. Method M2 -- run "fid.py <path_to_generated_samples> <path_to_statistics file>"

Now, as my generated samples are 64x64 while the statistics file available for the CelebA dataset here corresponds to the original image size, I suspect that using method M2 will give me a wrong FID score.

To use method M1, I thought of sampling 10k images from CelebA and storing them after preprocessing them to 64x64. Then I would have original samples (10k in number) of the same size as the generated ones, i.e. 64x64.

My confusion is: what do people do in practice, i.e. when papers report FID scores, which method do they use, M1 or M2?

FID's applicability for smaller datasets

Hi,

I have a few questions about FID score:

  1. I have a dataset smaller than 2048 images, but I still want to compute the FID score. I understand that 2048 images are required to get a full-rank covariance matrix. Computing the FID on my dataset still gives me sensible values, and no complex numbers or NaNs (or warnings). Can I still trust the FID computed this way as a measure of visual quality?

  2. In this paper, section 5.1, it is noted:

We observe that FID has rather high bias, but small variance. From this perspective, estimating the full covariance matrix might be unnecessary and counter-productive, and a constrained version might suffice.

What is your take on this? Does this hint to FID being usable with smaller datasets?

  3. Do you think an FID computed on lower-level feature maps of Inception is still meaningful? As the features at lower levels still have spatial extent, spatial pooling would have to be applied first. My thought here is to try to make the FID work with smaller datasets.

I also ported your implementation to PyTorch, for people who do not want to have the Tensorflow dependency (see here). I hope that is okay with you.

Thanks!

Number of real samples

I've been reading a lot of papers that discuss the FID in terms of a "number of samples", without really clarifying whether that refers to the fake samples from the generator, the real samples, the sum of both, the same number for each, etc.

When you say "number of samples" or sample size, which one is it? I am currently generating fake samples from a very small dataset (300 images) and want to calculate the FID between the 300 real images and the fake images, but I am not sure how many samples to use. Logic tells me that I should use the same number of samples for both real and fake images, but that may cause calculation problems.

Thank you in advance.

Between np.trace(whole term) and sum of np.trace(each term)

Hi, I found that you implement the final FID calculation in this line:

return diff.dot(diff) + np.trace(sigma1) + np.trace(sigma2) - 2 * tr_covmean

But according to your formula, the trace is applied after both sigmas and covmean have been combined, so it should be:

return diff.dot(diff) + np.trace(sigma1 + sigma2 - 2 * covmean)

I have experimented with both, and the results are nearly identical (they agree to about 13 decimal places). Is there any particular reason for using the former?
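For context, the two expressions are mathematically identical because the trace is linear, Tr(A + B) = Tr(A) + Tr(B); the tiny discrepancy comes from floating-point rounding. A quick numeric check with dummy matrices:

import numpy as np

sigma1 = np.random.randn(2048, 2048)
sigma2 = np.random.randn(2048, 2048)
covmean = np.random.randn(2048, 2048)

a = np.trace(sigma1) + np.trace(sigma2) - 2 * np.trace(covmean)
b = np.trace(sigma1 + sigma2 - 2 * covmean)
print(abs(a - b))  # tiny; the two forms are equal in exact arithmetic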

Why do I always encounter the error: 'ValueError: Imaginary component 89928690.58258057'

I used fid.py to measure the FID score of my image datasets. I generated 10000 images in the tImages directory and ran 'python fid.py ./tImages fid_stats_celeba.npz' or 'python fid.py ./tImages ./sImages' (the sImages directory is another image dataset), but after waiting a while I always get a ValueError, e.g. 'ValueError: Imaginary component 89928690.58258057' or 'ValueError: Imaginary component 1.376687186290827e+24'. I don't know which step I got wrong. Could anyone tell me what the problem is? Thanks!

error information:
Traceback (most recent call last):
File "fid.py", line 334, in
fid_value = calculate_fid_given_paths(args.path, args.inception, low_profile=args.lowprofile)
File "fid.py", line 317, in calculate_fid_given_paths
fid_value = calculate_frechet_distance(m1, s1, m2, s2)
File "fid.py", line 155, in calculate_frechet_distance
raise ValueError("Imaginary component {}".format(m))
ValueError: Imaginary component 1.376687186290827e+24

Help! FID on CelebA

Why is the FID score calculated on real 64x64 CelebA images always around 27? It should be less than 5... The images I used were cropped with list_bbox_celeba.txt, 50,000 in total.

fid.py is always killed on Linux?

The number of real images is 6120, and the number of generated images is 6120, but fid.py is always killed on Linux. The available memory is 27 GB and batch_size = 8. Is it because the memory is too small?

warnings module needs to be imported in Python 3

TTUR/fid.py

Lines 140 to 142 in dc4e5b5

if not np.isfinite(covmean).all():
msg = "fid calculation produces singular product; adding %s to diagonal of cov estimates" % eps
warnings.warn(msg)

The warnings module needs to be imported explicitly. I got a 'name is not defined' error with Python 3.6, although it usually does not happen because this branch is rarely taken.

Should fid.py mention np.float64?

I was trying to compute FID scores for some CelebA images and it kept failing at 20K-30K samples. After some examination, I found that the error lay in fid.py typecasting to np.float32, which was resulting in NaN and inf values.

Should this be mentioned in fid.py, or should it simply typecast to float64? I am not aware of any downside to float64.
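A minimal sketch of the kind of workaround meant here, assuming the activations are already available as a NumPy array (illustrative only, not the repository's code):

import numpy as np

# Placeholder activations, shaped (num images, 2048), stored as float32.
act = np.random.randn(30000, 2048).astype(np.float32)

# Cast up to float64 before computing the Gaussian statistics to reduce accumulation error.
act64 = act.astype(np.float64)
mu = np.mean(act64, axis=0)
sigma = np.cov(act64, rowvar=False)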

Problem using image dataset already in numpy format

Hello, I have a problem using your code on my custom dataset.

My dataset is a collection of images already stored in npz format, so I can load it directly without having to call imread first. But I did an initial resize since my images are in 1-channel format:

dataset = np.load('dataset_file.npz')
x_test_target = dataset['img_set'].astype('float32')
x_test_target = np.reshape(x_test_target, (len(x_test_target), row, col, chann))
x_test_target.resize((row, col, 3))

After resizing, I tested it with the calculate_activation_statistics method to see whether Inception is able to receive it as input:

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    mu_real, sigma_real = fid.calculate_activation_statistics(x_test_target, sess, batch_size=100)

But I receive this error:

ValueError: Cannot feed value of shape (100, 128, 3) for Tensor 'FID_Inception_Net/ExpandDims:0', which has shape '(?, ?, ?, 3)'

Is there something wrong with my data setup? Or is the batch_size setting not suitable for my data?
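For reference, one common way to turn a single-channel batch into the (N, H, W, 3) layout that the Inception graph expects, keeping the batch dimension intact (an illustrative sketch, not this repository's preprocessing):

import numpy as np

x = np.random.rand(100, 128, 128, 1).astype(np.float32)  # placeholder (N, H, W, 1) grayscale batch
x_rgb = np.repeat(x, 3, axis=-1)                          # (N, H, W, 3): replicate the single channel
print(x_rgb.shape)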

Score in Table 1

I have trained the "GoodGenerator" (in the provided code) with TTUR (using the given default hyperparameters) a couple of times. My trained model achieves an FID of about 29 on the CIFAR-10 dataset. To achieve the score in Table 1, should I pick the best score over multiple training runs and multiple generations, or should I change the hyperparameters? Thank you.
