multi_view_ram's Introduction

3D Attention-Driven Depth Acquisition for Object Identification

By Kai Xu, Yifei Shi, Lintao Zheng, Junyu Zhang, Min Liu, Hui Huang, Hao Su, Daniel Cohen-Or, Baoquan Chen

Introduction

This code is a Torch implementation of an end-to-end 3D attention model that selects the best views for efficient object recognition. Details of the work can be found in the paper cited below.

Citation

If you find our work useful in your research, please consider citing:

@article{xu_siga16,
    title = {3D Attention-Driven Depth Acquisition for Object Identification},
    author = {Kai Xu and Yifei Shi and Lintao Zheng and Junyu Zhang and Min Liu and Hui Huang and Hao Su and Daniel Cohen-Or and Baoquan Chen},
    journal = {ACM Transactions on Graphics (Proc. of SIGGRAPH Asia 2016)},
    volume = {35},
    number = {6},
    pages = {to appear},
    year = {2016}
}

Requirements

  1. This code is written in Lua and requires Torch. Set up a Torch environment before running it.

  2. To train on GPU/CUDA, install the cutorch and cunn packages:

    $ luarocks install cutorch
    $ luarocks install cunn

  3. Install matio: $ sudo apt-get install libmatio2

  4. Install the other Torch packages (nn, dpnn, rnn, image, etc.): $ ./scripts/dependencies_install.sh

Usage

Data

We provide a small dataset for the demo. It contains five classes (chair, display, flowerpot, guitar, table), each consisting of 300 models. Each 3D model is rendered into a basic set of 2.5D depth images from 21 sampled views, which serve as multi-view training data. We split each class into training and test sets at a ratio of 5:1. The hierarchy tree has already been built and placed in the folder data_hierarchy_tree. Each node folder contains a folder named mvcnn, which holds an MVCNN net, and a folder named cur_model, which holds the MV-RNN model for the current node. A MATLAB-format .mat file holds the training data for the current node, and each subclass folder is a sub-node.
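The figures above imply concrete set sizes. The following Python sketch works them out; it is only arithmetic on the numbers quoted in this section and is not part of the repository. It assumes the 5:1 ratio is applied to whole models within each class and that every model contributes all 21 depth views to its split. The function name `split_counts` is hypothetical.

```python
# Dataset figures quoted in the Data section of this README.
NUM_CLASSES = 5
MODELS_PER_CLASS = 300
VIEWS_PER_MODEL = 21
TRAIN_PARTS, TEST_PARTS = 5, 1  # 5:1 train/test ratio


def split_counts(n_models, train_parts=TRAIN_PARTS, test_parts=TEST_PARTS):
    """Return (train, test) model counts for a train_parts:test_parts split.

    Assumption: the split is made over whole models, rounding the test
    share down when the count is not evenly divisible.
    """
    total_parts = train_parts + test_parts
    test = n_models * test_parts // total_parts
    return n_models - test, test


train_per_class, test_per_class = split_counts(MODELS_PER_CLASS)

# Totals over all classes, in models and in rendered depth images.
summary = {
    "train_models": train_per_class * NUM_CLASSES,
    "test_models": test_per_class * NUM_CLASSES,
    "train_depth_images": train_per_class * NUM_CLASSES * VIEWS_PER_MODEL,
    "test_depth_images": test_per_class * NUM_CLASSES * VIEWS_PER_MODEL,
}

print(train_per_class, test_per_class)
print(summary)
```

Under these assumptions each class has 250 training models and 50 test models.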

Train

To train an MV-RNN model that classifies objects at the root node:

$ th train.lua 

Run th train.lua -h to see additional command line options that may be specified.

To train hierarchical MV-RNN models for every node of all classes, run: $ ./scripts/train_hierarchy_mvrnn.sh
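Before training every node, it can help to check which node folders exist and whether each one has the pieces described in the Data section. The Python sketch below is a hypothetical helper, not part of the repository: it assumes only the layout stated there (a folder mvcnn, a folder cur_model, a .mat file, and every other folder being a sub-node). The order in which train_hierarchy_mvrnn.sh actually visits nodes is not documented here, so the pre-order traversal is an assumption.

```python
import os

# Folder names taken from the Data section; these are not sub-nodes.
RESERVED = {"mvcnn", "cur_model"}


def list_hierarchy_nodes(root):
    """Return one (path, has_mvcnn, has_cur_model, mat_files) tuple per node.

    Nodes are visited in pre-order, with sub-nodes sorted by name.
    Assumption: any folder other than mvcnn and cur_model is a sub-node.
    """
    nodes = []
    stack = [root]
    while stack:
        node = stack.pop()
        entries = sorted(os.listdir(node))
        subdirs = [e for e in entries if os.path.isdir(os.path.join(node, e))]
        mat_files = [
            e for e in entries
            if e.endswith(".mat") and os.path.isfile(os.path.join(node, e))
        ]
        nodes.append((node, "mvcnn" in subdirs, "cur_model" in subdirs, mat_files))
        # Push sub-nodes in reverse so they are popped in sorted order.
        for d in reversed(subdirs):
            if d not in RESERVED:
                stack.append(os.path.join(node, d))
    return nodes


if __name__ == "__main__":
    root = "data_hierarchy_tree"  # folder name from the Data section
    if os.path.isdir(root):
        for path, has_mvcnn, has_model, mats in list_hierarchy_nodes(root):
            print(path, has_mvcnn, has_model, mats)
```

A node that reports a missing mvcnn folder or no .mat file is worth checking before starting a long training run.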

Evaluation

We have trained MV-RNN models for every node of the class chair (subclass1). You can view the evaluation results by following the steps below. To evaluate the MV-RNN model for the root node:

$ th eval_demo.lua 

You can see retrieval examples by running:

$ th retrive_demo.lua

The results are saved in the folder retrive_res. (Note: if you encounter an error caused by ViewSelect.lua, you can fix it by uncommenting line 35 of ViewSelect.lua.)

Example output by retrive_demo.lua


1. Example of a ten-view comparison between input and retrieved data

The first row shows the input data; the second row shows the retrieved data.

(1)

(2)

2. Example of view sequence
(1)

(2)

Acknowledgement

The Torch implementation in this repository is based on Nicholas Leonard's code for the recurrent model of visual attention, a clean and well-organized GitHub repository built on Torch.
