This project is forked from junyanz/von

[NeurIPS 2018] Visual Object Networks: Image Generation with Disentangled 3D Representation.

Home Page: http://von.csail.mit.edu

License: Other

Visual Object Networks

Project Page | Paper

We present Visual Object Networks (VON), an end-to-end adversarial learning framework that jointly models 3D shapes and 2D images. Our model can synthesize a 3D shape, its intermediate 2.5D depth representation, and a 2D image all at once. The VON not only generates realistic images but also enables several 3D operations.

Visual Object Networks: Image Generation with Disentangled 3D Representation.
Jun-Yan Zhu, Zhoutong Zhang, Chengkai Zhang, Jiajun Wu, Antonio Torralba, Joshua B. Tenenbaum, William T. Freeman.
MIT CSAIL and Google Research.
In NeurIPS 2018.

Example results

(a) Typical examples produced by a recent GAN model [Gulrajani et al., 2017].
(b) Our model produces three outputs: a 3D shape, its 2.5D projection given a viewpoint, and a final image with realistic texture.
(c) Given this disentangled 3D representation, our method allows several 3D applications including editing viewpoint, shape, or texture independently.
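
To make the "2.5D projection" in (b) concrete, here is a toy sketch of how a depth map can be read off a voxel grid. It is not the project's rendering kernel: it assumes an orthographic camera looking along the z axis and a binary occupancy grid, and the function name `depth_from_voxels` is made up for this illustration.

```python
def depth_from_voxels(voxels):
    """Orthographic depth map of a voxel grid viewed along the z axis.

    voxels[z][y][x] is 1 for occupied cells and 0 for empty ones, with
    z = 0 nearest to the camera. Each output pixel holds the index of the
    first occupied cell along its ray, or None if the ray hits nothing.
    """
    depth_size = len(voxels)
    height = len(voxels[0])
    width = len(voxels[0][0])
    depth = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for z in range(depth_size):
                if voxels[z][y][x]:
                    depth[y][x] = z  # first hit along the ray
                    break
    return depth


# A 2x2x2 grid: one occupied cell in the front slice, three in the back.
grid = [
    [[1, 0], [0, 0]],  # z = 0 (front)
    [[1, 1], [1, 0]],  # z = 1 (back)
]
print(depth_from_voxels(grid))  # [[0, 1], [1, None]]
```

Changing the viewpoint corresponds to projecting the same shape along a different direction, which is why the shape and its 2.5D projection can be edited separately.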

More Samples

Below we show more samples from DCGAN [Radford et al., 2016], LSGAN [Mao et al., 2017], WGAN-GP [Gulrajani et al., 2017], and our VON. For our method, we show both 3D shapes and 2D images. The learned 3D prior helps produce better samples.

3D Object Manipulations

Our Visual Object Networks (VON) allow several 3D applications such as (left) changing the viewpoint, texture, or shape independently, and (right) interpolating between two objects in shape space, texture space, or both.
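
The interpolation on the right can be pictured with a small sketch. It assumes each object is described by a pair of latent vectors (a shape code and a texture code) and that interpolation is linear; both the pair layout and the helper names `lerp` and `interpolate_objects` are assumptions for illustration, not the repository's API.

```python
def lerp(a, b, t):
    """Linear interpolation between two equal-length vectors."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def interpolate_objects(obj_a, obj_b, steps, shape=True, texture=True):
    """Interpolate between two (shape_code, texture_code) pairs.

    With shape=False or texture=False the corresponding code is held
    fixed at obj_a's value, so only the other factor changes.
    """
    frames = []
    for i in range(steps):
        t = i / (steps - 1) if steps > 1 else 0.0
        shape_code = lerp(obj_a[0], obj_b[0], t) if shape else list(obj_a[0])
        texture_code = lerp(obj_a[1], obj_b[1], t) if texture else list(obj_a[1])
        frames.append((shape_code, texture_code))
    return frames


car_a = ([0.0, 0.0], [1.0])  # hypothetical (shape_code, texture_code)
car_b = ([2.0, 4.0], [3.0])

# Interpolate in shape space only: the texture code stays at car_a's value.
for frame in interpolate_objects(car_a, car_b, steps=3, texture=False):
    print(frame)
```

Each frame would then be passed to the generators to produce one image of the sequence.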

Texture Transfer across Objects and Viewpoints

VON can transfer the texture of a real image to different shapes and viewpoints.

Prerequisites

  • Linux (only tested on Ubuntu 16.04)
  • Python 3 (only tested with Python 3.6)
  • Anaconda3
  • nvcc & gcc (only tested with gcc 6.3.0)
  • PyTorch 0.4.1 (does not support 0.4.0)
  • Currently not tested with Nvidia RTX GPU series
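
Before installing, it can help to check which of these prerequisites are already present. The snippet below is a minimal sketch that uses only the Python standard library. It checks that the interpreter is Python 3, that `nvcc` and `gcc` are on the PATH, and that a `torch` package can be found. It does not verify the exact versions listed above, and the function name `check_environment` is made up for this example.

```python
import importlib.util
import shutil
import sys


def check_environment(tools=("nvcc", "gcc")):
    """Report which prerequisites appear to be available.

    Returns a dict mapping a check name to True/False. This only inspects
    the local machine; it does not install or modify anything.
    """
    report = {
        # The code was only tested with Python 3.6; here we just require Python 3.
        "python3": sys.version_info.major == 3,
        # find_spec locates the package without importing it.
        "torch": importlib.util.find_spec("torch") is not None,
    }
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name:8s} {'found' if ok else 'MISSING'}")
```

Passing `tools=("nvcc", "gcc", "blender")` extends the same check to the optional Blender install.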

Getting Started

Installation

  • Clone this repo:
git clone -b master --single-branch https://github.com/junyanz/VON.git
cd VON
  • Install PyTorch 0.4.1+ and torchvision from http://pytorch.org, along with the other dependencies (e.g., visdom and dominate). You can install all of the dependencies with the following:
conda create --name von --file pkg_specs.txt
source activate von
  • Compile the rendering kernel:
./install.sh
  • (Optional) Install Blender for visualizing generated 3D shapes. After installation, add blender to your PATH environment variable.

Generate 3D shapes, 2.5D sketches, and images

  • Download our pretrained models:
bash ./scripts/download_model.sh
  • Generate results with the model:
bash ./scripts/figures.sh 0 car df

The test results will be saved to an HTML file here: ./results/*/*/index.html.
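
Since the output path contains wildcards, a short script can list the generated pages. This is a sketch that assumes the two-level layout `./results/*/*/index.html` given above; the helper name `find_result_pages` is made up for this example.

```python
import glob
import os


def find_result_pages(root="./results"):
    """Return all index.html files exactly two directories below root."""
    pattern = os.path.join(root, "*", "*", "index.html")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    pages = find_result_pages()
    if not pages:
        print("No result pages found; has figures.sh finished?")
    for page in pages:
        print(os.path.abspath(page))
```

The printed absolute paths can be opened directly in a web browser.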

Model Training

  • To train a model, first download the training dataset (distance functions and images). The examples below train a car model with the distance function representation on GPU 0.
bash ./scripts/download_dataset.sh
  • To train a 3D generator:
bash ./scripts/train_shapes.sh 0 car df
  • To train a 2D texture network using ShapeNet real shapes:
bash ./scripts/train_stage2_real.sh 0 car df
  • To train a 2D texture network using a pre-trained 3D generator:
bash ./scripts/train_stage2.sh 0 car df
  • To jointly fine-tune the 3D and 2D generative models:
bash ./scripts/train_full.sh 0 car df
  • To view training results and loss plots, go to http://localhost:8097 in a web browser. To see more intermediate results, check out ./checkpoints/*/web/index.html.
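
The training scripts above can be driven from one small Python wrapper. This is a sketch under several assumptions: that the three positional arguments are the GPU id, the object class, and the shape representation (as the "car model with distance function representation on GPU 0" example suggests), and that the scripts are run in the order listed. The list does not say whether both texture-network variants are needed before joint fine-tuning, so edit `STAGE_SCRIPTS` as appropriate. The function names are made up for this example, and nothing is executed unless `dry_run=False`.

```python
import subprocess

# Script paths as listed above; the order is an assumption.
STAGE_SCRIPTS = [
    "./scripts/train_shapes.sh",       # 3D generator
    "./scripts/train_stage2_real.sh",  # texture network, ShapeNet real shapes
    "./scripts/train_stage2.sh",       # texture network, pre-trained 3D generator
    "./scripts/train_full.sh",         # joint fine-tuning
]


def training_commands(gpu_id=0, object_class="car", representation="df"):
    """Build one argument list per training stage."""
    args = [str(gpu_id), object_class, representation]
    return [["bash", script] + args for script in STAGE_SCRIPTS]


def run_training(commands, dry_run=True):
    """Print each command, and run it unless dry_run is set."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline if a stage fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_training(training_commands(gpu_id=0), dry_run=True)
```

Run from the repository root after `download_dataset.sh` has finished, and switch to `dry_run=False` once the printed commands look right.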

Citation

If you find this useful for your research, please cite the following paper.

@inproceedings{VON,
  title={Visual Object Networks: Image Generation with Disentangled 3{D} Representations},
  author={Jun-Yan Zhu and Zhoutong Zhang and Chengkai Zhang and Jiajun Wu and Antonio Torralba and Joshua B. Tenenbaum and William T. Freeman},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2018}
}

Acknowledgements

This work is supported by NSF #1231216, NSF #1524817, ONR MURI N00014-16-1-2007, Toyota Research Institute, Shell, and Facebook. We thank Xiuming Zhang, Richard Zhang, David Bau, and Zhuang Liu for valuable discussions. This code borrows from the CycleGAN & pix2pix repo.
