
Implicit Generation and Generalization in Energy Based Models

Code for Implicit Generation and Generalization in Energy Based Models. The blog post can be found here, and a website with pretrained models can be found here.

Requirements

To install the prerequisites for the project, run:

pip install -r requirements.txt
mkdir sandbox_cachedir

Download all pretrained models and unzip them into the folder cachedir.
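
For example, a minimal sketch assuming the pretrained models come as a single archive named ebm_models.zip (the archive name here is illustrative, not the actual download name):

# extract the pretrained model checkpoints into ./cachedir
unzip ebm_models.zip -d cachedir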

Download Datasets

For the MNIST and CIFAR-10 datasets, the code will download the data directly.

For the ImageNet 128x128 dataset, download the TFRecords of the ImageNet dataset by running the following commands:

for i in $(seq -f "%05g" 0 1023)
do
  wget https://[deprecated]/data/imagenet/train-$i-of-01024
done

for i in $(seq -f "%05g" 0 127)
do
  wget https://[deprecated]/data/imagenet/validation-$i-of-00128
done

wget https://[deprecated]/data/imagenet/index.json

For the ImageNet 32x32 dataset, download the archives by running the following commands and unzip them as shown below.

wget https://[deprecated]/data/imagenet32/Imagenet32_train.zip
wget https://[deprecated]/data/imagenet32/Imagenet32_val.zip
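
The archives can then be extracted into the directory that is later passed as --imagenet_path. A minimal sketch (the target directory is a placeholder, and the exact layout the loader expects is not specified here):

unzip Imagenet32_train.zip -d <imagenet32x32 path>
unzip Imagenet32_val.zip -d <imagenet32x32 path>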

For the dSprites dataset, download it by running:

wget -O dsprites_ndarray_co1sh3sc6or40x32y32_64x64.npz "https://github.com/deepmind/dsprites-dataset/blob/master/dsprites_ndarray_co1sh3sc6or40x32y32_64x64.npz?raw=true"

Training

To train on different datasets:

For CIFAR-10 Unconditional

python train.py --exp=cifar10_uncond --dataset=cifar10 --num_steps=60 --batch_size=128 --step_lr=10.0 --proj_norm=0.01 --zero_kl --replay_batch --large_model

For CIFAR-10 Conditional

python train.py --exp=cifar10_cond --dataset=cifar10 --num_steps=60 --batch_size=128 --step_lr=10.0 --proj_norm=0.01 --zero_kl --replay_batch --cclass

For ImageNet 32x32 Conditional

python train.py --exp=imagenet_cond --num_steps=60 --wider_model --batch_size=32 --step_lr=10.0 --proj_norm=0.01 --replay_batch --cclass --zero_kl --dataset=imagenet --imagenet_path=<imagenet32x32 path>

For ImageNet 128x128 Conditional

python train.py --exp=imagenet_cond --num_steps=50 --batch_size=16 --step_lr=100.0 --replay_batch --swish_act --cclass --zero_kl --dataset=imagenetfull --imagenet_datadir=<full imagenet path>

All code supports Horovod execution, so model training can be sped up substantially by running each command with multiple workers:

mpiexec -n <worker_num>  <command>
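
For example, a sketch of running the CIFAR-10 unconditional training command above with 4 workers (the worker count is arbitrary):

mpiexec -n 4 python train.py --exp=cifar10_uncond --dataset=cifar10 --num_steps=60 --batch_size=128 --step_lr=10.0 --proj_norm=0.01 --zero_kl --replay_batch --large_model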

Demo

The imagenet_demo.py file contains code to experiment with EBMs on conditional ImageNet 128x128. To generate a GIF of the sampling process, you can run the command:

python imagenet_demo.py --exp=imagenet128_cond --resume_iter=2238000 --swish_act

The ebm_sandbox.py file contains several different tasks that can be used to evaluate EBMs, which are selected through different settings of the task flag in the file. For example, to visualize cross-class mappings in CIFAR-10, you can run:

python ebm_sandbox.py --task=crossclass --num_steps=40 --exp=cifar10_cond --resume_iter=74700

Generalization

To test generalization to out-of-distribution classification for SVHN (with similar commands for other datasets):

python ebm_sandbox.py --task=mixenergy --num_steps=40 --exp=cifar10_large_model_uncond --resume_iter=121200 --large_model --svhnmix --cclass=False

To test classification on CIFAR-10 using a conditional model under either L2 or Li (L-infinity) perturbations:

python ebm_sandbox.py --task=label --exp=cifar10_wider_model_cond --resume_iter=21600 --lnorm=-1 --pgd=<number of pgd steps> --num_steps=10 --lival=<li bound value> --wider_model

Concept Combination

To train EBMs on the conditional dSprites dataset, you can train each model separately on each conditioned latent in cond_pos, cond_rot, cond_shape, and cond_scale, with an example command given below.

python train.py --dataset=dsprites --exp=dsprites_cond_pos --zero_kl --num_steps=20 --step_lr=500.0 --swish_act --cond_pos --replay_batch --cclass

Once the models are trained, they can be sampled from jointly by running:

python ebm_combine.py --task=conceptcombine --exp_size=<exp_size> --exp_shape=<exp_shape> --exp_pos=<exp_pos> --exp_rot=<exp_rot> --resume_size=<resume_size> --resume_shape=<resume_shape> --resume_rot=<resume_rot> --resume_pos=<resume_pos>

ebm_code_release's People

Contributors

paulmcmillan, yilundu

ebm_code_release's Issues

Langevin gradient step size in code vs in paper

In the paper, the Langevin dynamics scales the gradient by 0.5 * λ and adds noise ω sampled from a normal distribution with mean 0 and standard deviation λ. However, in the code (e.g. imagenet_demo.py), the gradient is multiplied by step_lr, which is 180 by default, and the noise is generated with standard deviation 0.005. Am I missing something, or does the paper show the equation that is justified by theory, while in practice it is better to scale the energy gradient differently from the noise?
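
For reference, the two updates can be written as follows (a LaTeX sketch reconstructed from this issue's wording, not verified against the paper or the code). The paper's Langevin update:

\tilde{x}^k = \tilde{x}^{k-1} - \frac{\lambda}{2}\,\nabla_x E(\tilde{x}^{k-1}) + \omega^k, \qquad \omega^k \sim \mathcal{N}(0, \sigma^2),\ \sigma = \lambda

The update effectively used in the code, with an independently chosen step size and noise scale:

\tilde{x}^k = \tilde{x}^{k-1} - \text{step\_lr}\cdot\nabla_x E(\tilde{x}^{k-1}) + \omega^k, \qquad \omega^k \sim \mathcal{N}(0, \sigma^2),\ \sigma = 0.005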

Draw samples from CIFAR10

Could you kindly guide me on how to generate samples from the CIFAR10 pre-trained model, similar to the process in imagenet_demo.py? @yilundu

AUROC computation

In Section 4.4, "Out-of-Distribution Generalization", how do you compute the AUROC scores from the EBM model? The score function is unnormalized, as opposed to a likelihood, which lies between 0 and 1. So how is the AUROC score computed?
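
One observation that may be relevant (my assumption, not confirmed by the authors): AUROC depends only on the ranking induced by a scalar score, not on its normalization, so an unnormalized energy can serve as the score directly. In LaTeX,

\mathrm{AUROC} = \Pr\big( s(X_{\text{in-distribution}}) > s(X_{\text{out-of-distribution}}) \big)

where s(x) can be any scalar score, for example the negative energy -E(x).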

Instructions for likelihood evaluation using AIS?

Hi,

Thanks for making your code available!

I'm having problems evaluating the log-likelihood on the MNIST dataset. When I run

python ais.py --exp exp-name --dataset mnist  --resume_iter 15000 --pdist 5000

the output looks like:

Positive sample probability  -0.00042194081 0.0062082466
Total values of lower value based off forward sampling -0.62608874 0.0075839674
Total values of upper value based off backward sampling -0.6260886 0.007481599

It seems that the code should output the test log-likelihood, but the magnitude of the output doesn't match (the numbers in Fig. 22 of your paper are around 10^2 to 10^3). This is strange, since I've checked visually that the model is generating plausible samples. So is there anything wrong with my experiment configuration, or should I apply some scaling to obtain the log-likelihood figures in the table?

The model is obtained with

 --dataset=mnist --num_steps=60 --batch_size=128 --step_lr=10.0 --proj_norm=0.01 --zero_kl --replay_batch --epoch_num 50

Does self-attention improve EBMs?

I've found that the repo already implements self-attention. Have the authors tried using self-attention when training EBMs, and does it improve them? Looking forward to your reply. Thanks in advance.

Baselines package out of sync with rest of repository

I have attempted to install the packages from requirements.txt, but they are seemingly incompatible. Moreover, the baselines requirement is out of sync with the others. After downgrading to CUDA 9.0 and installing cuDNN 7.1.4 with TensorFlow 1.12.0 and Torch 0.3.4, it seems that this code still will not run. Is there an updated version that can use the latest packages? Is there a workaround for needing to install the "baselines" directory? Does anybody have a Docker container for these packages, or a recently used Python virtual environment?

EBM for concept learning

Dear OpenAI researchers,
Thanks for your code release of EBMs. I've been really interested in energy-based models since I read Igor Mordatch's paper Concept Learning with Energy-Based Models last year. However, it is difficult to reproduce the experiments since there is no code or dataset release for that work. Is OpenAI still working on that topic, and could you release a single-case implementation of that work? I've sent some e-mails but received no reply. Looking forward to your help. Thank you.
