
In pixels we trust: From Pixel Labeling to Object Localization and Scene Categorization

This repository contains the original implementation of the solutions described in our IROS 2018 paper.

License

This repository is released under the GNU General Public License (refer to the LICENSE file for details).

Citing

If you make use of this data and software, please cite the following reference in any publications:

@Inproceedings{herranz2018,
  Author    = {Herranz-Perdiguero, C. and Redondo-Cabrera, C. and L\'opez-Sastre, R.~J.},
  Title     = {In pixels we trust: From Pixel Labeling to Object Localization and Scene Categorization},
  Booktitle = {IROS},
  Year      = {2018}
}

Datasets and initial setup

  1. To use this software, clone this repository. We will call that directory PROJECT_ROOT. It contains three folders:
  • PROJECT_ROOT/deeplab_NYU contains the code needed to perform semantic segmentation.
  • PROJECT_ROOT/NYU_depth_v2 contains the NYU Depth v2 dataset.
  • PROJECT_ROOT/Classification contains the code needed to perform scene classification.
  2. Download the NYU Depth v2 dataset. To do so, go to PROJECT_ROOT/NYU_depth_v2 and run the following script:
  cd $PROJECT_ROOT/NYU_depth_v2
  ./Download_NYU_depth.sh

The dataset is saved as a .mat file located in $PROJECT_ROOT/NYU_depth_v2/NYUdepth.

  3. Now, to download a pretrained model to initialize the network and the best of our models, run:
  cd $PROJECT_ROOT/NYU_depth_v2
  ./Download_Models.sh

You can find the file init.caffemodel in $PROJECT_ROOT/deeplab_NYU/NYUdepth/model/resnet. We will use that model to initialize the weights of the segmentation network. In $PROJECT_ROOT/deeplab_NYU/NYUdepth/model/resnet/trained_model you can find the final model used in our IROS 2018 paper.

  4. Finally, we need to extract the images from the downloaded .mat file. Under PROJECT_ROOT/NYU_depth_v2, just run the Matlab script ObtainImages.m. Also, to overcome memory limitations, we split the images into two halves with overlap; we can do that by running SplitWholeDataset.m.
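The idea behind SplitWholeDataset.m can be sketched in Python. Note this is an illustration only: the 16-pixel overlap and the vertical cut direction are assumptions, not the values used by the original Matlab script.

```python
import numpy as np

def split_with_overlap(img, overlap=16):
    """Return left/right halves of `img` that share `overlap` columns."""
    h, w = img.shape[:2]
    mid = w // 2
    left = img[:, : mid + overlap // 2]    # columns 0 .. mid + overlap/2 - 1
    right = img[:, mid - overlap // 2 :]   # columns mid - overlap/2 .. w - 1
    return left, right

# NYU Depth v2 frames are 480x640 RGB.
img = np.zeros((480, 640, 3), dtype=np.uint8)
left, right = split_with_overlap(img)
print(left.shape, right.shape)  # (480, 328, 3) (480, 328, 3)
```

After segmentation, the overlapping strip lets the two half-predictions be stitched back together without a seam at the cut.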

Image Segmentation

  1. First, we need to compile the DeepLab-based model. Run:
  cd $PROJECT_ROOT/deeplab_NYU
  make
  make pycaffe
  make matcaffe
  2. Once the compilation has been successfully completed, run:
  cd $PROJECT_ROOT/deeplab_NYU
  python run_NYUdepth.py

to train or test a model. Please note that the script has a flag to choose between training and testing. By default, we search $PROJECT_ROOT/deeplab_NYU/NYUdepth/model/resnet for a .caffemodel, either to initialize the weights of the network for training, or to find the model to use during testing.

  3. To visualize the results after the test has been completed, run the Matlab script GetSegResults_Split.m in $PROJECT_ROOT/deeplab_NYU/matlab/my_script_NYUdepth, with do_save_results set to 1. We will need these results to perform the scene classification task.

  4. To evaluate the results using the different metrics reported in our paper, run Get_mIOU.m in $PROJECT_ROOT/deeplab_NYU/matlab/my_script_NYUdepth.
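The .caffemodel lookup that run_NYUdepth.py performs (step 2 above) can be sketched as follows. Picking the most recently modified model file is an illustrative assumption; the actual selection rule in the script may differ.

```python
import glob
import os

def find_caffemodel(model_dir):
    """Return the newest .caffemodel under `model_dir`, or None if absent."""
    candidates = glob.glob(os.path.join(model_dir, "*.caffemodel"))
    if not candidates:
        return None
    # Assumption: prefer the most recently written snapshot.
    return max(candidates, key=os.path.getmtime)
```

Dropping init.caffemodel into the directory thus initializes training, while a trained snapshot in the same place is what testing picks up.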
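Among the metrics Get_mIOU.m reports is the mean intersection-over-union. A minimal Python sketch of that metric (not the original Matlab code) is:

```python
import numpy as np

def mean_iou(gt, pred, num_classes):
    """Mean IoU of two flat label arrays, averaged over classes in the GT."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for g, p in zip(gt.ravel(), pred.ravel()):
        conf[g, p] += 1                      # confusion matrix: rows = GT
    inter = np.diag(conf)                    # correctly labeled pixels
    union = conf.sum(0) + conf.sum(1) - inter
    valid = conf.sum(1) > 0                  # classes present in the GT
    return (inter[valid] / union[valid]).mean()

gt = np.array([0, 0, 1, 1])
pred = np.array([0, 1, 1, 1])
print(mean_iou(gt, pred, 2))  # class IoUs 1/2 and 2/3 -> mean 7/12
```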

Scene classification

Once we have the results from our segmentation model, we can use them to categorize different scenes following the approach described in our paper. Simply:

  1. In PROJECT_ROOT/Classification/histograms, run Generate_histograms.m to generate the histogram-like features from the segmented images. The code lets you choose the number of spatial pyramid levels to use.

  2. Under PROJECT_ROOT/Classification/SVM, you can find the models used to perform scene classification with both a linear SVM and an additive kernel SVM. You just need to run run_SVM.m or run_addtive_SVM.m.
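The spatial pyramid features from step 1 can be sketched in Python. This is an illustration, not the original Generate_histograms.m: the pyramid depth and the per-cell L1 normalization are assumptions.

```python
import numpy as np

def pyramid_histogram(seg, num_classes, levels=2):
    """Concatenate L1-normalized label histograms over a spatial pyramid.

    Level l divides the segmentation map into 2^l x 2^l cells.
    """
    feats = []
    for l in range(levels):
        cells = 2 ** l
        for rows in np.array_split(seg, cells, axis=0):
            for cell in np.array_split(rows, cells, axis=1):
                h = np.bincount(cell.ravel(), minlength=num_classes)
                feats.append(h / max(h.sum(), 1))  # L1-normalize each cell
    return np.concatenate(feats)

seg = np.random.randint(0, 4, size=(32, 32))   # toy segmentation map
f = pyramid_histogram(seg, num_classes=4, levels=2)
print(f.shape)  # (20,) -> (1 + 4) cells x 4 classes
```

The per-cell histograms preserve coarse layout information that a single global histogram would discard.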
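As an illustration of the two classifier types in step 2, here is a scikit-learn stand-in (not the original MATLAB models): a linear SVM on the raw histogram features, and an additive kernel SVM approximated via the additive chi-squared feature map. The data, labels, and C value are synthetic.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.kernel_approximation import AdditiveChi2Sampler

rng = np.random.default_rng(0)
X = rng.random((40, 20))           # histogram-like, non-negative features
y = (X[:, 0] > 0.5).astype(int)    # toy scene labels

# Linear SVM directly on the features.
linear = LinearSVC(C=1.0).fit(X, y)

# Additive chi-squared map + linear SVM approximates an additive kernel SVM.
chi2 = AdditiveChi2Sampler(sample_steps=2)
additive = LinearSVC(C=1.0).fit(chi2.fit_transform(X), y)

print(linear.score(X, y), additive.score(chi2.transform(X), y))
```

The chi-squared map is a good fit here because the inputs are L1-normalized histograms, which is exactly the setting additive kernels were designed for.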

