Deep-Learning-Course-project

Deep Learning Course project based on VON (Project Page | Paper)

Visual Object Networks: Image Generation with Disentangled 3D Representation.
Jun-Yan Zhu, Zhoutong Zhang, Chengkai Zhang, Jiajun Wu, Antonio Torralba, Joshua B. Tenenbaum, William T. Freeman.
MIT CSAIL and Google Research.
In NeurIPS 2018.

Prerequisites

  • Linux (tested on Ubuntu 16.04/18.04)
  • Python 3.6
  • Anaconda3
  • NVCC & GCC (tested with gcc 5.4.0)
  • PyTorch 0.4.1 (0.4.0 is not supported)
  • CUDA 9.0
  • GPU: currently tested with the Nvidia RTX series
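
The toolchain requirements above can be checked before installing anything. The following is a minimal sketch, assuming the tools are on your PATH under the names `python`, `gcc`, and `nvcc`; the helper `check_tool` is a hypothetical name introduced here and is not part of this repository.

```shell
# check_tool is a hypothetical helper: it runs the given command if the
# executable exists and otherwise reports that it is missing.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    "$@" || echo "$1: command failed"
  else
    echo "$1: not found"
  fi
}

check_tool python --version   # expected: Python 3.6.x
check_tool gcc --version      # tested with gcc 5.4.0
check_tool nvcc --version     # expected: CUDA 9.0
```

Compare the printed versions against the list above by eye; the sketch only reports what is installed and does not enforce any version.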

Installation

  • Install PyTorch 0.4.1+ and torchvision from http://pytorch.org, along with the other dependencies (e.g., visdom and dominate). You can install all the dependencies by running the following:
conda create --name von --file pkg_specs.txt
source activate von
  • Compile the rendering kernel by running the following:
bash install.sh
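
After installation, it may help to confirm that the Python dependencies named above can be imported. This is a small sketch, assuming it is run inside the activated `von` environment; `check_modules` is a hypothetical helper name, not a script shipped with this repository.

```shell
# check_modules is a hypothetical helper: it tries to import each module
# name passed to it and prints one status line per module.
check_modules() {
  for module in "$@"; do
    if python -c "import ${module}" >/dev/null 2>&1; then
      echo "${module}: ok"
    else
      echo "${module}: missing"
    fi
  done
}

check_modules torch torchvision visdom dominate
```

A `missing` line usually means the environment is not activated or the package was not installed from pkg_specs.txt.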

Model Training

  • Download the training dataset (distance functions and images)
wget http://von.csail.mit.edu/data/data.tar
  • To train a 3D generator:
python train.py --gpu_ids ${GPU_IDS} \
                  --display_id 1000 \
                  --dataset_mode df \
                  --model 'shape_gan' \
                  --class_3d ${CLASS} \
                  --checkpoints_dir ${CHECKPOINTS_DIR} \
                  --niter 250 --niter_decay 250 \
                  --batch_size 8 \
                  --save_epoch_freq 10 \
                  --suffix {class_3d}_{model}_{dataset_mode}

Set GPU_IDS, CLASS (car or chair), and CHECKPOINTS_DIR before running the command.
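
As an illustration, the variables used by the training command could be set as follows. The values are assumptions chosen for the example (a single GPU with index 0 and a local ./checkpoints directory), not settings prescribed by this repository.

```shell
# Example values only; adjust them to your machine.
GPU_IDS=0                      # assumption: train on the first GPU
CLASS=car                      # car or chair
CHECKPOINTS_DIR=./checkpoints  # assumption: keep checkpoints next to the code

# Warn early if CLASS is not one of the two documented classes.
case "${CLASS}" in
  car|chair) echo "training class: ${CLASS}" ;;
  *) echo "CLASS must be car or chair, got: ${CLASS}" >&2 ;;
esac
```

With these set in the current shell, the `python train.py` command above can be pasted as written.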

  • To train a 2D texture network using ShapeNet real shapes:
python train.py --gpu_ids ${GPU_IDS} \
  --display_id 1000 \
  --dataset_mode 'image_and_'${DATASET} \
  --model 'texture_real' \
  --checkpoints_dir ${CHECKPOINTS_DIR} \
  --class_3d ${CLASS} \
  --random_shift --color_jitter

Set GPU_IDS, CLASS (car or chair), DATASET, and CHECKPOINTS_DIR before running the command.
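
The `--dataset_mode` value in the command above is built by plain shell string concatenation. The sketch below shows what it expands to, assuming `DATASET=df`; that value is inferred from the `image_and_df` mode used in the test command and is an assumption, and `DATASET_MODE` is a name introduced only for this example.

```shell
DATASET=df   # assumption: matches the image_and_df mode used for testing

# A quoted literal followed by a variable is joined into a single word.
DATASET_MODE='image_and_'"${DATASET}"
echo "${DATASET_MODE}"   # prints: image_and_df
```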

  • To test the model by generating shapes and images:
python test.py --gpu_ids ${GPU_IDS} \
  --results_dir ${RESULTS_DIR} \
  --model2D_dir ${MODEL2D_DIR} \
  --model3D_dir ${MODEL3D_DIR} \
  --class_3d ${CLASS} \
  --phase 'val' \
  --dataset_mode 'image_and_df' \
  --model 'test'  \
  --n_shapes ${NUM_SHAPES} \
  --n_views ${NUM_SAMPLES} \
  --reset_texture \
  --reset_shape \
  --suffix ${CLASS}_${DATASET} \
  --render_25d --render_3d

Set GPU_IDS, RESULTS_DIR, MODEL2D_DIR, MODEL3D_DIR, CLASS, DATASET, NUM_SHAPES, and NUM_SAMPLES. If you are using our pretrained models, set MODEL2D_DIR to './checkpoints/0411models/models_2D/car_df/latest' and MODEL3D_DIR to './checkpoints/0411models/models_3D/car_df'. (Currently only the car class is supported; some results are available under the /results folder.)
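
For the pretrained car models, the variables could be set as in the sketch below. The two model paths are the ones quoted above; `GPU_IDS`, `DATASET`, `RESULTS_DIR`, `NUM_SHAPES`, and `NUM_SAMPLES` are example values assumed for illustration.

```shell
# Paths quoted in the instructions above.
MODEL2D_DIR='./checkpoints/0411models/models_2D/car_df/latest'
MODEL3D_DIR='./checkpoints/0411models/models_3D/car_df'
CLASS=car                # only the car class is currently supported

# Assumed example values; change them as needed.
GPU_IDS=0
DATASET=df               # only used to build the --suffix string
RESULTS_DIR=./results
NUM_SHAPES=10            # number of shapes to generate
NUM_SAMPLES=5            # number of views per shape

echo "writing ${NUM_SHAPES} shapes x ${NUM_SAMPLES} views to ${RESULTS_DIR}"
```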

  • To calculate the FID score for generated 2D images:
cd fid
python fid_score.py [PATH_FOR_REAL_IMAGES] [PATH_FOR_GENERATED_IMAGES]
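
For reference, the Fréchet Inception Distance is commonly defined as below, where (μ_r, Σ_r) and (μ_g, Σ_g) are the mean and covariance of Inception features computed over the real and generated images. This assumes fid_score.py follows the standard definition, which this README does not state explicitly. Lower scores indicate that the generated images are statistically closer to the real ones.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```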

Model Testing: shape and texture interpolation

  • To interpolate the 3D shapes:
python test_shape.py --gpu_ids ${GPU_IDS} \
    --checkpoints_dir ${CHECKPOINTS_DIR} \
    --model 'shape_gan' \
    --batch_size 16 \
    --n_shapes 32 \
    --interp_shape

Set GPU_IDS and CHECKPOINTS_DIR. If you are using our pretrained models, set CHECKPOINTS_DIR to './checkpoints/0411models/models_3D/car_df/'.

  • To interpolate the textures:
python test.py --gpu_ids ${GPU_IDS} \
  --results_dir ${RESULTS_DIR} \
  --model2D_dir ${MODEL2D_DIR} \
  --model3D_dir ${MODEL3D_DIR} \
  --class_3d ${CLASS} \
  --phase 'val' \
  --dataset_mode 'image_and_'${DATASET} \
  --model 'test'  \
  --seed 10 \
  --n_shapes ${NUM_SHAPES} \
  --n_views ${NUM_SAMPLES} \
  --reset_shape \
  --suffix ${CLASS}_${DATASET}_t{real_texture} ${4} \
  --interp_texture

Some interpolation results are available under the ./interpolation_results folder.

Citation

If you find this useful for your research, please cite the following paper.

@inproceedings{VON,
  title={Visual Object Networks: Image Generation with Disentangled 3{D} Representations},
  author={Jun-Yan Zhu and Zhoutong Zhang and Chengkai Zhang and Jiajun Wu and Antonio Torralba and Joshua B. Tenenbaum and William T. Freeman},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2018}
}

Acknowledgements

This repository is for educational use only. The code borrows from VON.
